WorldWideScience

Sample records for cluster analyses based

  1. Visualizing Confidence in Cluster-Based Ensemble Weather Forecast Analyses.

    Science.gov (United States)

    Kumpf, Alexander; Tost, Bianca; Baumgart, Marlene; Riemer, Michael; Westermann, Rudiger; Rautenhaus, Marc

    2018-01-01

    In meteorology, cluster analysis is frequently used to determine representative trends in ensemble weather predictions in a selected spatio-temporal region, e.g., to reduce a set of ensemble members to simplify and improve their analysis. Identified clusters (i.e., groups of similar members), however, can be very sensitive to small changes of the selected region, so that clustering results can be misleading and bias subsequent analyses. In this article, we (a team of visualization scientists and meteorologists) deliver visual analytics solutions to analyze the sensitivity of clustering results with respect to changes of a selected region. We propose an interactive visual interface that enables simultaneous visualization of a) the variation in composition of identified clusters (i.e., their robustness), b) the variability in cluster membership for individual ensemble members, and c) the uncertainty in the spatial locations of identified trends. We demonstrate that our solution shows meteorologists how representative a clustering result is, and with respect to which changes in the selected region it becomes unstable. Furthermore, our solution helps to identify those ensemble members which stably belong to a given cluster and can thus be considered similar. In a real-world application case we show how our approach is used to analyze the clustering behavior of different regions in a forecast of "Tropical Cyclone Karl", guiding the user towards the cluster robustness information required for subsequent ensemble analysis.

  2. Analyses of Crime Patterns in NIBRS Data Based on a Novel Graph Theory Clustering Method: Virginia as a Case Study

    Directory of Open Access Journals (Sweden)

    Peixin Zhao

    2014-01-01

    This paper suggests a novel clustering method for analyzing the National Incident-Based Reporting System (NIBRS) data, which includes determining the correlation between different crime types, developing a likelihood index for crimes to occur in a jurisdiction, and clustering jurisdictions based on crime type. The method was tested by using the 2005 assault data from 121 jurisdictions in Virginia as a test case. The analyses of these data show that some different crime types are correlated and some different crime parameters are correlated with different crime types. The analyses also show that certain jurisdictions within Virginia share certain crime patterns. This information assists with constructing a pattern for a specific crime type and can be used to determine whether a jurisdiction may be more likely to see this type of crime occur in its area.

  3. Group analyses of connectivity-based cortical parcellation using repeated k-means clustering.

    Science.gov (United States)

    Nanetti, Luca; Cerliani, Leonardo; Gazzola, Valeria; Renken, Remco; Keysers, Christian

    2009-10-01

    K-means clustering has become a popular tool for connectivity-based cortical segmentation using Diffusion Weighted Imaging (DWI) data. A sometimes ignored issue is, however, that the output of the algorithm depends on the initial placement of starting points, and that different sets of starting points therefore could lead to different solutions. In this study we explore this issue. We apply k-means clustering a thousand times to the same DWI dataset collected in 10 individuals to segment two brain regions: the SMA-preSMA on the medial wall, and the insula. At the level of single subjects, we found that in both brain regions, repeatedly applying k-means indeed often leads to a variety of rather different cortical based parcellations. By assessing the similarity and frequency of these different solutions, we show that approximately 256 k-means repetitions are needed to accurately estimate the distribution of possible solutions. Using nonparametric group statistics, we then propose a method to employ the variability of clustering solutions to assess the reliability with which certain voxels can be attributed to a particular cluster. In addition, we show that the proportion of voxels that can be attributed significantly to either cluster in the SMA and preSMA is relatively higher than in the insula and discuss how this difference may relate to differences in the anatomy of these regions.
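
    The dependence on starting points described in the abstract above can be illustrated with a minimal sketch. It is not the authors' pipeline: the helper names (`kmeans`, `canonical`, `solution_frequencies`), the plain Lloyd iteration, and the use of a seed per repetition are all assumptions made for illustration. The idea is simply to repeat k-means with different initialisations and count how often each distinct partition appears.

```python
import random
from collections import Counter


def nearest(point, centers):
    """Index of the center closest to point (squared Euclidean distance)."""
    return min(range(len(centers)),
               key=lambda c: sum((a - b) ** 2 for a, b in zip(point, centers[c])))


def kmeans(points, k, seed, iters=50):
    """Plain Lloyd's k-means; starting centers are k points drawn with the given seed."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = None
    for _ in range(iters):
        new_labels = [nearest(p, centers) for p in points]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def canonical(labels):
    """Relabel clusters by order of first appearance so permuted labelings compare equal."""
    mapping = {}
    return tuple(mapping.setdefault(l, len(mapping)) for l in labels)


def solution_frequencies(points, k, repetitions):
    """Count how often each distinct partition occurs over repeated runs."""
    return Counter(canonical(kmeans(points, k, seed)) for seed in range(repetitions))


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    for partition, count in solution_frequencies(data, k=2, repetitions=20).most_common():
        print(count, partition)
```

    On real data the resulting counter would typically hold several distinct partitions; how many repetitions are enough to estimate their distribution is the question the abstract addresses.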

  4. Group analyses of connectivity-based cortical parcellation using repeated k-means clustering

    NARCIS (Netherlands)

    Nanetti, Luca; Cerliani, Leonardo; Gazzola, Valeria; Renken, Remco; Keysers, Christian

    2009-01-01

    K-means clustering has become a popular tool for connectivity-based cortical segmentation using Diffusion Weighted Imaging (DWI) data. A sometimes ignored issue is, however, that the output of the algorithm depends on the initial placement of starting points, and that different sets of starting

  5. A Web service substitution method based on service cluster nets

    Science.gov (United States)

    Du, YuYue; Gai, JunJing; Zhou, MengChu

    2017-11-01

    Service substitution is an important research topic in the fields of Web services and service-oriented computing. This work presents a novel method to analyse and substitute Web services. A new concept, called a Service Cluster Net Unit, is proposed based on Web service clusters. A service cluster is converted into a Service Cluster Net Unit. Then it is used to analyse whether the services in the cluster can satisfy some service requests. Meanwhile, the substitution methods of an atomic service and a composite service are proposed. The correctness of the proposed method is proved, and the effectiveness is shown and compared with the state-of-the-art method via an experiment. It can be readily applied to e-commerce service substitution to meet the business automation needs.

  6. Canonical PSO Based K-Means Clustering Approach for Real Datasets.

    Science.gov (United States)

    Dey, Lopamudra; Chakraborty, Sanjay

    2014-01-01

    Clustering is significant and is applied across various fields. Because clustering is an unsupervised process in data mining, properly evaluating the results and measuring the compactness and separability of the clusters are important issues. The procedure for evaluating the results of a clustering algorithm is known as cluster validity measurement. Different types of indices are used to solve different types of problems, and index selection depends on the kind of available data. This paper first proposes a Canonical PSO based K-means clustering algorithm, analyses some important clustering indices (intercluster, intracluster), and then evaluates the effects of those indices on a real-time air pollution database and on wholesale customer, wine, and vehicle datasets using typical K-means, Canonical PSO based K-means, simple PSO based K-means, DBSCAN, and hierarchical clustering algorithms. The paper also describes the nature of the clusters and compares the performance of these clustering algorithms according to the validity assessment. It also identifies which of these algorithms is most desirable for producing compact clusters on these particular real-life datasets. It deals with the behaviour of these clustering algorithms with respect to validation indices and presents the evaluation results in mathematical and graphical forms.

  7. A Data-origin Authentication Protocol Based on ONOS Cluster

    Directory of Open Access Journals (Sweden)

    Qin Hua

    2016-01-01

    This paper aims to propose a data-origin authentication protocol based on an ONOS cluster. ONOS is an SDN controller that can work in a distributed environment. However, the security of an ONOS cluster is seldom considered, and communication in an ONOS cluster may suffer from many security threats. In this paper, we use a two-tier self-renewable hash chain for identity authentication and data-origin authentication. We analyse the security and overhead of our proposal and compare it with the current security measure. The results show that, with the help of our proposal, communication in an ONOS cluster can be protected from identity forging, replay attacks, data tampering, MITM attacks and repudiation, and that the computational overhead decreases noticeably.
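
    The abstract above does not specify the two-tier self-renewable construction, so the following is only a sketch of the basic one-tier hash chain idea that such schemes build on. The names `build_chain` and `Verifier`, and the choice of SHA-256, are assumptions for illustration and are not taken from the paper.

```python
import hashlib


def build_chain(seed, length):
    """Return [seed, H(seed), H(H(seed)), ...]; the last element is the public anchor."""
    chain = [seed]
    for _ in range(length):
        chain.append(hashlib.sha256(chain[-1]).digest())
    return chain


class Verifier:
    """Holds the most recently accepted chain value, starting from the anchor."""

    def __init__(self, anchor):
        self.last = anchor

    def accept(self, token):
        # A token is valid only if hashing it yields the previously accepted value.
        if hashlib.sha256(token).digest() == self.last:
            self.last = token
            return True
        return False


if __name__ == "__main__":
    chain = build_chain(b"hypothetical-secret-seed", 3)
    verifier = Verifier(chain[-1])
    print(verifier.accept(chain[2]))  # sender reveals values in reverse order
    print(verifier.accept(chain[2]))  # replaying the same value is rejected
```

    Because each value can be used only once and in reverse order, a plain chain of this kind is eventually exhausted; the abstract's self-renewable design presumably addresses that limitation, but its details are not given in this record.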

  8. A Model-Based Cluster Analysis of Maternal Emotion Regulation and Relations to Parenting Behavior.

    Science.gov (United States)

    Shaffer, Anne; Whitehead, Monica; Davis, Molly; Morelen, Diana; Suveg, Cynthia

    2017-10-15

    In a diverse community sample of mothers (N = 108) and their preschool-aged children (M age = 3.50 years), this study conducted person-oriented analyses of maternal emotion regulation (ER) based on a multimethod assessment incorporating physiological, observational, and self-report indicators. A model-based cluster analysis was applied to five indicators of maternal ER: maternal self-report, observed negative affect in a parent-child interaction, baseline respiratory sinus arrhythmia (RSA), and RSA suppression across two laboratory tasks. Model-based cluster analyses revealed four maternal ER profiles, including a group of mothers with average ER functioning, characterized by socioeconomic advantage and more positive parenting behavior. A dysregulated cluster demonstrated the greatest challenges with parenting and dyadic interactions. Two clusters of intermediate dysregulation were also identified. Implications for assessment and applications to parenting interventions are discussed. © 2017 Family Process Institute.

  9. Voting-based consensus clustering for combining multiple clusterings of chemical structures

    Directory of Open Access Journals (Sweden)

    Saeed Faisal

    2012-12-01

    Background: Although many consensus clustering methods have been successfully used for combining multiple classifiers in many areas such as machine learning, applied statistics, pattern recognition and bioinformatics, few consensus clustering methods have been applied for combining multiple clusterings of chemical structures. It is known that any individual clustering method will not always give the best results for all types of applications. So, in this paper, three voting and graph-based consensus clusterings were used for combining multiple clusterings of chemical structures to enhance the ability of separating biologically active molecules from inactive ones in each cluster. Results: The cumulative voting-based aggregation algorithm (CVAA), cluster-based similarity partitioning algorithm (CSPA) and hyper-graph partitioning algorithm (HGPA) were examined. The F-measure and Quality Partition Index method (QPI) were used to evaluate the clusterings, and the results were compared to Ward's clustering method. The MDL Drug Data Report (MDDR) dataset was used for experiments and was represented by two 2D fingerprints, ALOGP and ECFP_4. The voting-based consensus clustering method outperformed Ward's method using the F-measure and QPI method for both ALOGP and ECFP_4 fingerprints, while the graph-based consensus clustering methods outperformed Ward's method only for ALOGP using QPI. The Jaccard and Euclidean distance measures were the methods of choice to generate the ensembles, which give the highest values for both criteria. Conclusions: The results of the experiments show that consensus clustering methods can improve the effectiveness of chemical structure clusterings. The cumulative voting-based aggregation algorithm (CVAA) was the method of choice among consensus clustering methods.
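
    As a rough illustration of combining multiple clusterings, the sketch below builds a co-association matrix (the fraction of clusterings that place two objects together) and groups objects whose co-association exceeds a threshold. This is a simplified stand-in, not an implementation of CVAA, CSPA or HGPA as evaluated in the paper; the function names and the 0.5 threshold are assumptions.

```python
def co_association(partitions):
    """matrix[i][j] = fraction of partitions in which objects i and j share a cluster."""
    n = len(partitions[0])
    m = len(partitions)
    return [[sum(p[i] == p[j] for p in partitions) / m for j in range(n)]
            for i in range(n)]


def consensus_labels(partitions, threshold=0.5):
    """Connected components of the graph linking pairs with co-association > threshold."""
    matrix = co_association(partitions)
    n = len(matrix)
    labels = [-1] * n
    next_label = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = next_label
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and matrix[i][j] > threshold:
                    labels[j] = next_label
                    stack.append(j)
        next_label += 1
    return labels


if __name__ == "__main__":
    # Three clusterings of four objects; cluster ids need not agree across clusterings.
    ensemble = [[0, 0, 1, 1], [1, 1, 0, 0], [0, 0, 0, 1]]
    print(consensus_labels(ensemble))
```

    Working on pairwise agreement rather than raw cluster ids sidesteps the label-correspondence problem, which is one reason co-association is a common starting point for consensus methods.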

  10. Text Clustering Algorithm Based on Random Cluster Core

    Directory of Open Access Journals (Sweden)

    Huang Long-Jun

    2016-01-01

    Nowadays clustering has become a popular text mining technique, but the huge volume of data places higher demands on the accuracy and performance of text mining. In view of the performance bottleneck of traditional text clustering algorithms, this paper proposes a text clustering algorithm with random features. It is a clustering algorithm based on text density that also uses neighboring heuristic rules; the concept of a random cluster is introduced, which effectively reduces the complexity of the distance calculation.

  11. Normalization based K means Clustering Algorithm

    OpenAIRE

    Virmani, Deepali; Taneja, Shweta; Malhotra, Geetika

    2015-01-01

    K-means is an effective clustering technique used to separate similar data into groups based on initial centroids of clusters. In this paper, a Normalization based K-means clustering algorithm (N-K means) is proposed. The proposed N-K means algorithm applies normalization to the available data prior to clustering and calculates initial centroids based on weights. Experimental results show the improvement of the proposed N-K means clustering algorithm over existing...
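
    The abstract above does not say which normalization is used, so the following sketch assumes min-max scaling of each feature to [0, 1] purely for illustration; the weight-based choice of initial centroids is not shown because the record gives no details. The function name `min_max_normalize` is hypothetical.

```python
def min_max_normalize(rows):
    """Scale every column of rows to [0, 1]; constant columns map to 0.0."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        tuple((v - lo) / (hi - lo) if hi > lo else 0.0
              for v, lo, hi in zip(row, lows, highs))
        for row in rows
    ]


if __name__ == "__main__":
    # Without scaling, the second feature would dominate Euclidean distances.
    raw = [(1, 100), (2, 300), (3, 500)]
    print(min_max_normalize(raw))
```

    Scaling before clustering matters because k-means relies on Euclidean distance, so a feature measured in large units would otherwise outweigh the others.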

  12. Progressive Exponential Clustering-Based Steganography

    Directory of Open Access Journals (Sweden)

    Li Yue

    2010-01-01

    Cluster indexing-based steganography is an important branch of data-hiding techniques. Such schemes normally achieve a good balance between high embedding capacity and low embedding distortion. However, most cluster indexing-based steganographic schemes utilise less efficient clustering algorithms for embedding data, which causes redundancy and leaves room for increasing the embedding capacity further. In this paper, a new clustering algorithm, called progressive exponential clustering (PEC), is applied to increase the embedding capacity by avoiding redundancy. Meanwhile, a cluster expansion algorithm is also developed in order to further increase the capacity without sacrificing imperceptibility.

  13. Scalable Density-Based Subspace Clustering

    DEFF Research Database (Denmark)

    Müller, Emmanuel; Assent, Ira; Günnemann, Stephan

    2011-01-01

    For knowledge discovery in high dimensional databases, subspace clustering detects clusters in arbitrary subspace projections. Scalability is a crucial issue, as the number of possible projections is exponential in the number of dimensions. We propose a scalable density-based subspace clustering method that steers mining to few selected subspace clusters. Our novel steering technique reduces subspace processing by identifying and clustering promising subspaces and their combinations directly. Thereby, it narrows down the search space while maintaining accuracy. Thorough experiments on real and synthetic databases show that steering is efficient and scalable, with high quality results. For future work, our steering paradigm for density-based subspace clustering opens research potential for speeding up other subspace clustering approaches as well.

  14. Substructures in DAFT/FADA survey clusters based on XMM and optical data

    Science.gov (United States)

    Durret, F.; DAFT/FADA Team

    2014-07-01

    The DAFT/FADA survey was initiated to perform weak lensing tomography on a sample of 90 massive clusters in the redshift range [0.4,0.9] with HST imaging available. The complementary deep multiband imaging constitutes a high quality imaging data base for these clusters. In X-rays, we have analysed the XMM-Newton and/or Chandra data available for 32 clusters, and for 23 clusters we fit the X-ray emissivity with a beta-model and subtract it to search for substructures in the X-ray gas. This study was coupled with a dynamical analysis for the 18 clusters with at least 15 spectroscopic galaxy redshifts in the cluster range, based on a Serna & Gerbal (SG) analysis. We detected ten substructures in eight clusters by both methods (X-rays and SG). The percentage of mass included in substructures is found to be roughly constant with redshift, with values of 5-15%. Most of the substructures detected both in X-rays and with the SG method are found to be relatively recent infalls, probably at their first cluster pericenter approach.

  15. IDENTIFICAÇÃO DE CLUSTERS INTERNACIONAIS COM BASE NAS DIMENSÕES CULTURAIS DE HOFSTEDE. / Identification of international clusters based on the hofstede’s cultural dimensions

    Directory of Open Access Journals (Sweden)

    Valderí de Castro Alcântara

    2012-08-01

    Given that a country's culture influences the organizational culture of the companies operating in it, and is also a determining factor in the internationalization process, it is important to understand and measure the cultural characteristics of each country. The studies of Hofstede (1984) present a useful methodology for comparing cultures. This methodology takes into account the characteristics of a culture that make it possible to differentiate one country from another. It is thus possible to observe that certain countries share certain cultural traits and can therefore be grouped according to pre-established criteria. The present work uses the multivariate statistical procedures Cluster Analysis, K-Means Cluster Analysis and Discriminant Analysis to determine and validate groupings of countries based on Hofstede's cultural dimensions (Power Distance Index, Individualism, Masculinity and Uncertainty Avoidance Index). The results yielded four clusters: Cluster 1, countries with a masculine and individualist culture; Cluster 2, a collectivist and uncertainty-averse culture; Cluster 3, a feminine culture with low hierarchical distance; and Cluster 4, a culture with high hierarchical distance and a propensity toward uncertainty.

  16. Radiobiological analyse based on cell cluster models

    International Nuclear Information System (INIS)

    Lin Hui; Jing Jia; Meng Damin; Xu Yuanying; Xu Liangfeng

    2010-01-01

    The influence of cell cluster dimension on EUD and TCP for targeted radionuclide therapy was studied using the radiobiological method. The radiobiological features of tumors lacking activity in the core were evaluated and analyzed by associating EUD, TCP and SF. The results show that EUD increases with tumor dimension under a homogeneous activity distribution. If the extra-cellular activity is taken into consideration, the EUD increases by 47%. With activity lacking in the tumor center and a requirement of TCP = 0.90, the α cross-fire influence of ²¹¹At can make up for an activity-lacking region of at most (48 μm)³ for the Nucleus source, but (72 μm)³ for the Cytoplasm, Cell Surface, Cell and Voxel sources. In clinic, the physician could prefer the suggested dose of Cell Surface source in case of the future of local tumor control for under-dose. Generally, TCP exhibits the difference in effect between under-dose and due-dose well, but not that between due-dose and over-dose, which makes TCP more suitable for choosing a therapy plan. EUD exhibits the difference between different models and activity distributions well, which makes it more suitable for research work. When using EUD to study the influence of an inhomogeneous activity distribution, one should keep the configuration and volume of the former and latter models consistent. (authors)

  17. Spectroscopic Analyses of Neutron Capture Elements in Open Clusters

    Science.gov (United States)

    O'Connell, Julia E.

    The evolution of elements as a function of age throughout the Milky Way disk provides strong constraints for galaxy evolution models, and on star formation epochs. In an effort to provide such constraints, we conducted an investigation into r- and s-process elemental abundances for a large sample of open clusters as part of an optical follow-up to the SDSS-III/APOGEE-1 near infrared survey. To obtain data for neutron capture abundance analysis, we conducted a long-term observing campaign spanning three years (2013-2016) using the McDonald Observatory Otto Struve 2.1-meter telescope and Sandiford Cass Echelle Spectrograph (SES, R = λ/Δλ ≈ 60,000). The SES provides a wavelength range of ~1400 Å, making it uniquely suited to investigate a number of other important chemical abundances as well as the neutron capture elements. For this study, we derive abundances for 18 elements covering four nucleosynthetic families (light, iron-peak, neutron capture and alpha-elements) for ~30 open clusters within 6 kpc of the Sun with ages ranging from ~80 Myr to ~10 Gyr. Both equivalent width (EW) measurements and spectral synthesis methods were employed to derive abundances for all elements. Initial estimates for model stellar atmospheres (effective temperature and surface gravity) were provided by the APOGEE data set, and then re-derived for our optical spectra by removing abundance trends as a function of excitation potential and reduced width log(EW/λ). With the exception of Ba II and Zr I, abundance analyses for all neutron capture elements were performed by generating synthetic spectra from the new stellar parameters. In order to remove molecular contamination, or blending from nearby atomic features, the synthetic spectra were modeled by a best-fit Gaussian to the observed data. Nd II shows a slight enhancement in all cluster stars, while other neutron capture elements follow solar abundance trends. Ba II shows a large cluster-to-cluster abundance spread

  18. Membership determination of open clusters based on a spectral clustering method

    Science.gov (United States)

    Gao, Xin-Hua

    2018-06-01

    We present a spectral clustering (SC) method aimed at segregating reliable members of open clusters in multi-dimensional space. The SC method is a non-parametric clustering technique that performs cluster division using eigenvectors of the similarity matrix; no prior knowledge of the clusters is required. This method is more flexible in dealing with multi-dimensional data compared to other methods of membership determination. We use this method to segregate the cluster members of five open clusters (Hyades, Coma Ber, Pleiades, Praesepe, and NGC 188) in five-dimensional space; fairly clean cluster members are obtained. We find that the SC method can capture a small number of cluster members (weak signal) from a large number of field stars (heavy noise). Based on these cluster members, we compute the mean proper motions and distances for the Hyades, Coma Ber, Pleiades, and Praesepe clusters, and our results are in general quite consistent with the results derived by other authors. The test results indicate that the SC method is highly suitable for segregating cluster members of open clusters based on high-precision multi-dimensional astrometric data such as Gaia data.

  19. Efficient clustering aggregation based on data fragments.

    Science.gov (United States)

    Wu, Ou; Hu, Weiming; Maybank, Stephen J; Zhu, Mingliang; Li, Bing

    2012-06-01

    Clustering aggregation, known as clustering ensembles, has emerged as a powerful technique for combining different clustering results to obtain a single better clustering. Existing clustering aggregation algorithms are applied directly to data points, in what is referred to as the point-based approach. The algorithms are inefficient if the number of data points is large. We define an efficient approach for clustering aggregation based on data fragments. In this fragment-based approach, a data fragment is any subset of the data that is not split by any of the clustering results. To establish the theoretical bases of the proposed approach, we prove that clustering aggregation can be performed directly on data fragments under two widely used goodness measures for clustering aggregation taken from the literature. Three new clustering aggregation algorithms are described. The experimental results obtained using several public data sets show that the new algorithms have lower computational complexity than three well-known existing point-based clustering aggregation algorithms (Agglomerative, Furthest, and LocalSearch); nevertheless, the new algorithms do not sacrifice the accuracy.
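
    The fragment definition in the abstract above (a subset of the data not split by any of the clustering results) can be made concrete with a short sketch. It computes the largest such subsets by grouping points that receive identical labels in every input clustering; the function name `data_fragments` is an assumption, and the aggregation algorithms themselves are not reproduced here.

```python
from collections import defaultdict


def data_fragments(clusterings):
    """Group point indices that share the same label in every clustering."""
    groups = defaultdict(list)
    # zip(*clusterings) yields, per point, the tuple of its labels across clusterings.
    for index, signature in enumerate(zip(*clusterings)):
        groups[signature].append(index)
    return list(groups.values())


if __name__ == "__main__":
    # Two clusterings of five points; only point 2 is assigned differently.
    clusterings = [
        [0, 0, 0, 1, 1],
        [0, 0, 1, 1, 1],
    ]
    print(data_fragments(clusterings))
```

    Aggregating over fragments instead of individual points shrinks the problem whenever the input clusterings largely agree, which is the efficiency argument made in the abstract.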

  20. Fast gene ontology based clustering for microarray experiments.

    Science.gov (United States)

    Ovaska, Kristian; Laakso, Marko; Hautaniemi, Sampsa

    2008-11-21

    Analysis of a microarray experiment often results in a list of hundreds of disease-associated genes. In order to suggest common biological processes and functions for these genes, Gene Ontology annotations with statistical testing are widely used. However, these analyses can produce a very large number of significantly altered biological processes. Thus, it is often challenging to interpret GO results and identify novel testable biological hypotheses. We present fast software for advanced gene annotation using semantic similarity for Gene Ontology terms combined with clustering and heat map visualisation. The methodology allows rapid identification of genes sharing the same Gene Ontology cluster. Our R based semantic similarity open-source package has a speed advantage of over 2000-fold compared to existing implementations. From the resulting hierarchical clustering dendrogram genes sharing a GO term can be identified, and their differences in the gene expression patterns can be seen from the heat map. These methods facilitate advanced annotation of genes resulting from data analysis.

  1. Robust MST-Based Clustering Algorithm.

    Science.gov (United States)

    Liu, Qidong; Zhang, Ruisheng; Zhao, Zhili; Wang, Zhenghai; Jiao, Mengyao; Wang, Guangjing

    2018-06-01

    Minimax similarity stresses the connectedness of points via mediating elements rather than favoring high mutual similarity. The grouping principle yields superior clustering results when mining arbitrarily-shaped clusters in data. However, it is not robust against noise and outliers in the data. There are two main problems with the grouping principle: first, a single object that is far away from all other objects defines a separate cluster, and second, two connected clusters would be regarded as two parts of one cluster. In order to solve such problems, we propose a robust minimum spanning tree (MST)-based clustering algorithm in this letter. First, we separate the connected objects by applying a density-based coarsening phase, resulting in a low-rank matrix in which each element denotes a supernode formed by combining a set of nodes. Then a greedy method is presented to partition those supernodes by working on the low-rank matrix. Instead of removing the longest edges from the MST, our algorithm groups the data set based on the minimax similarity. Finally, the assignment of all data points can be achieved through their corresponding supernodes. Experimental results on many synthetic and real-world data sets show that our algorithm consistently outperforms the compared clustering algorithms.
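
    For context, the classical baseline that the abstract above contrasts itself with (removing the longest edges from the MST) can be sketched as follows. This is not the robust algorithm proposed in the letter; the function names, the use of Prim's algorithm and the Euclidean metric are assumptions for illustration.

```python
import math


def minimum_spanning_tree(points):
    """Prim's algorithm on the complete Euclidean graph; returns (weight, i, j) edges."""
    n = len(points)
    in_tree = [False] * n
    best = [(math.inf, -1)] * n  # (distance to the tree, nearest tree vertex)
    best[0] = (0.0, -1)
    edges = []
    for _ in range(n):
        i = min((v for v in range(n) if not in_tree[v]), key=lambda v: best[v][0])
        in_tree[i] = True
        if best[i][1] != -1:
            edges.append((best[i][0], best[i][1], i))
        for j in range(n):
            if not in_tree[j]:
                d = math.dist(points[i], points[j])
                if d < best[j][0]:
                    best[j] = (d, i)
    return edges


def mst_clusters(points, k):
    """Drop the k-1 longest MST edges and label the remaining connected components."""
    keep = max(len(points) - k, 0)  # an MST has n-1 edges, so n-k remain
    edges = sorted(minimum_spanning_tree(points))[:keep]
    parent = list(range(len(points)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for _, i, j in edges:
        parent[find(i)] = find(j)
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(len(points))]


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11)]
    print(mst_clusters(data, k=2))
    # Adding one distant point shows the outlier sensitivity the abstract mentions.
    print(mst_clusters(data + [(100, 100)], k=2))
```

    In the second call the single distant point takes a cluster of its own and the two genuine groups are merged, which is the first of the two problems named in the abstract.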

  2. STATUS OF THE LINUX PC CLUSTER FOR BETWEEN-PULSE DATA ANALYSES AT DIII-D

    International Nuclear Information System (INIS)

    PENG, Q; GROEBNER, R.J; LAO, L.L; SCHACHTER, J.; SCHISSEL, D.P; WADE, M.R.

    2001-08-01

    OAK-B135 Some analyses that survey experimental data are carried out at a sparse sample rate between pulses during tokamak operation and/or completed as a batch job overnight because the complete analysis on a single fast workstation cannot fit in the narrow time window between two pulses. Scientists therefore miss the opportunity to use these results to guide experiments quickly. With a dedicated Beowulf type cluster at a cost less than that of a workstation, these analyses can be accomplished between pulses and the analyzed data made available for the research team during the tokamak operation. A Linux PC cluster comprising 12 processors was installed at the DIII-D National Fusion Facility in CY00 and expanded to 24 processors in CY01 to automatically perform between-pulse magnetic equilibrium reconstructions using the EFIT code written in Fortran, CER analyses using the CERQUICK code written in IDL, and full profile fitting analyses (n_e, T_e, T_i, V_r, Z_eff) using the IDL code ZIPFIT. This paper reports the current status of the system and discusses some problems and concerns raised during the implementation and expansion of the system.

  3. CC_TRS: Continuous Clustering of Trajectory Stream Data Based on Micro Cluster Life

    Directory of Open Access Journals (Sweden)

    Musaab Riyadh

    2017-01-01

    The rapid spread of positioning devices leads to the generation of massive spatiotemporal trajectory data. In some scenarios, spatiotemporal data are received as a stream. Clustering of stream data is beneficial for different applications such as traffic management and weather forecasting. In this article, an algorithm for Continuous Clustering of Trajectory Stream Data Based on Micro Cluster Life is proposed. The algorithm consists of two phases. In the online phase, temporal micro clusters are used to store summarized spatiotemporal information for each group of similar segments. The clustering task in the online phase is based on temporal micro cluster lifetime instead of the time window technique, which divides stream data into time bins and clusters each bin separately. In the offline phase, a density-based clustering approach is used to generate macro clusters from the temporal micro clusters. The evaluation of the proposed algorithm on real data sets shows its efficiency and effectiveness and indicates that it is an efficient alternative to the time window technique.

  4. Permutation Tests of Hierarchical Cluster Analyses of Carrion Communities and Their Potential Use in Forensic Entomology.

    Science.gov (United States)

    van der Ham, Joris L

    2016-05-19

    Forensic entomologists can use carrion communities' ecological succession data to estimate the postmortem interval (PMI). Permutation tests of hierarchical cluster analyses of these data provide a conceptual method to estimate part of the PMI, the post-colonization interval (post-CI). This multivariate approach produces a baseline of statistically distinct clusters that reflect changes in the carrion community composition during the decomposition process. Carrion community samples of unknown post-CIs are compared with these baseline clusters to estimate the post-CI. In this short communication, I use data from previously published studies to demonstrate the conceptual feasibility of this multivariate approach. Analyses of these data produce series of significantly distinct clusters, which represent carrion communities during 1- to 20-day periods of the decomposition process. For 33 carrion community samples, collected over an 11-day period, this approach correctly estimated the post-CI within an average range of 3.1 days. © The Authors 2016. Published by Oxford University Press on behalf of Entomological Society of America.

  5. Fast Gene Ontology based clustering for microarray experiments

    Directory of Open Access Journals (Sweden)

    Ovaska Kristian

    2008-11-01

    Background: Analysis of a microarray experiment often results in a list of hundreds of disease-associated genes. In order to suggest common biological processes and functions for these genes, Gene Ontology annotations with statistical testing are widely used. However, these analyses can produce a very large number of significantly altered biological processes. Thus, it is often challenging to interpret GO results and identify novel testable biological hypotheses. Results: We present fast software for advanced gene annotation using semantic similarity for Gene Ontology terms combined with clustering and heat map visualisation. The methodology allows rapid identification of genes sharing the same Gene Ontology cluster. Conclusion: Our R-based semantic similarity open-source package has a speed advantage of over 2000-fold compared to existing implementations. From the resulting hierarchical clustering dendrogram, genes sharing a GO term can be identified, and their differences in the gene expression patterns can be seen from the heat map. These methods facilitate advanced annotation of genes resulting from data analysis.

  6. BioCluster: Tool for Identification and Clustering of Enterobacteriaceae Based on Biochemical Data

    Directory of Open Access Journals (Sweden)

    Ahmed Abdullah

    2015-06-01

    Presumptive identification of different Enterobacteriaceae species is routinely achieved based on biochemical properties. Traditional practice includes manual comparison of each biochemical property of the unknown sample with known reference samples and inference of its identity based on the maximum similarity pattern with the known samples. This process is labor-intensive, time-consuming, error-prone, and subjective. Therefore, automation of sorting and similarity calculation would be advantageous. Here we present a MATLAB-based graphical user interface (GUI) tool named BioCluster. This tool was designed for automated clustering and identification of Enterobacteriaceae based on biochemical test results. In this tool, we used two types of algorithms, i.e., traditional hierarchical clustering (HC) and the Improved Hierarchical Clustering (IHC), a modified algorithm that was developed specifically for the clustering and identification of Enterobacteriaceae species. IHC takes into account the variability in result of 1–47 biochemical tests within this Enterobacteriaceae family. This tool also provides different options to optimize the clustering in a user-friendly way. Using computer-generated synthetic data and some real data, we have demonstrated that BioCluster has high accuracy in clustering and identifying enterobacterial species based on biochemical test data. This tool can be freely downloaded at http://microbialgen.du.ac.bd/biocluster/.

  7. Projection-based curve clustering

    International Nuclear Information System (INIS)

    Auder, Benjamin; Fischer, Aurelie

    2012-01-01

    This paper focuses on unsupervised curve classification in the context of nuclear industry. At the Commissariat a l'Energie Atomique (CEA), Cadarache (France), the thermal-hydraulic computer code CATHARE is used to study the reliability of reactor vessels. The code inputs are physical parameters and the outputs are time evolution curves of a few other physical quantities. As the CATHARE code is quite complex and CPU time-consuming, it has to be approximated by a regression model. This regression process involves a clustering step. In the present paper, the CATHARE output curves are clustered using a k-means scheme, with a projection onto a lower dimensional space. We study the properties of the empirically optimal cluster centres found by the clustering method based on projections, compared with the 'true' ones. The choice of the projection basis is discussed, and an algorithm is implemented to select the best projection basis among a library of orthonormal bases. The approach is illustrated on a simulated example and then applied to the industrial problem. (authors)

  8. WebGimm: An integrated web-based platform for cluster analysis, functional analysis, and interactive visualization of results.

    Science.gov (United States)

    Joshi, Vineet K; Freudenberg, Johannes M; Hu, Zhen; Medvedovic, Mario

    2011-01-17

    Cluster analysis methods have been extensively researched, but the adoption of new methods is often hindered by technical barriers in their implementation and use. WebGimm is a free cluster analysis web-service, and an open source general purpose clustering web-server infrastructure designed to facilitate easy deployment of integrated cluster analysis servers based on clustering and functional annotation algorithms implemented in R. Integrated functional analyses and interactive browsing of both, clustering structure and functional annotations provides a complete analytical environment for cluster analysis and interpretation of results. The Java Web Start client-based interface is modeled after the familiar cluster/treeview packages making its use intuitive to a wide array of biomedical researchers. For biomedical researchers, WebGimm provides an avenue to access state of the art clustering procedures. For Bioinformatics methods developers, WebGimm offers a convenient avenue to deploy their newly developed clustering methods. WebGimm server, software and manuals can be freely accessed at http://ClusterAnalysis.org/.

  9. REGIONAL DEVELOPMENT BASED ON CLUSTER IN LIVESTOCK DEVELOPMENT. CLUSTER IN LIVESTOCK SECTOR IN THE KYRGYZ REPUBLIC

    Directory of Open Access Journals (Sweden)

    Meerim SYDYKOVA

    2014-11-01

    In most developing countries, where agriculture is the main economic source, clusters have been found to boost economic development. Asian countries are now starting to bring agro-food clusters into the mainstream of changes in agriculture, farming and the food industry. The long-term growth of meat production in the Kyrgyz Republic during the last decade, as well as the fact that agriculture has become one of the prioritized sectors of the economy, demonstrates the importance of the livestock sector in the economy of the Kyrgyz Republic. The research question is “Does the Kyrgyz Republic have strong economic opportunities and prerequisites in agriculture in order to implement an effective agro cluster in the livestock sector?” The paper focuses on describing the prerequisites of the Kyrgyz Republic in agriculture for implementing a livestock cluster. The main objective of the paper is to analyse the livestock sector of the Kyrgyz Republic and assess the capacity of this sector to implement an agro-cluster. The study focuses on investigating the livestock sector and on a complex S.W.O.T. analysis, which was carried out based on local and regional databases and official studies. The results demonstrate the importance of a livestock cluster for the national economy. It can be concluded that cluster implementation could benefit all its members if they build strong collaborative relationships in order to facilitate access to the labour market and, implicitly, to the exchange of good practices. The ability of potential cluster members to act as a convergence pole is critical for acquiring the practical skills necessary for the future development of the livestock sector.

  10. Classical Music Clustering Based on Acoustic Features

    OpenAIRE

    Wang, Xindi; Haque, Syed Arefinul

    2017-01-01

    In this paper we cluster 330 classical music pieces collected from the MusicNet database based on their musical note sequences. We use shingling and chord trajectory matrices to create a signature for each music piece and perform spectral clustering to find the clusters. Depending on the resolution, the output clusters distinctly indicate compositions from different classical music eras and the different composing styles of the musicians.

  11. Cluster Based Text Classification Model

    DEFF Research Database (Denmark)

    Nizamani, Sarwat; Memon, Nasrullah; Wiil, Uffe Kock

    2011-01-01

    We propose a cluster based classification model for suspicious email detection and other text classification tasks. The text classification tasks comprise many training examples that require a complex classification model. Using clusters for classification makes the model simpler and increases...... the accuracy at the same time. The test example is classified using simpler and smaller model. The training examples in a particular cluster share the common vocabulary. At the time of clustering, we do not take into account the labels of the training examples. After the clusters have been created......, the classifier is trained on each cluster having reduced dimensionality and less number of examples. The experimental results show that the proposed model outperforms the existing classification models for the task of suspicious email detection and topic categorization on the Reuters-21578 and 20 Newsgroups...

  12. Weighted voting-based consensus clustering for chemical structure databases

    Science.gov (United States)

    Saeed, Faisal; Ahmed, Ali; Shamsir, Mohd Shahir; Salim, Naomie

    2014-06-01

    The cluster-based compound selection is used in the lead identification process of drug discovery and design. Many clustering methods have been used for chemical databases, but there is no clustering method that can obtain the best results under all circumstances. However, little attention has been focused on the use of combination methods for chemical structure clustering, which is known as consensus clustering. Recently, consensus clustering has been used in many areas including bioinformatics, machine learning and information theory. This process can improve the robustness, stability, consistency and novelty of clustering. For chemical databases, different consensus clustering methods have been used including the co-association matrix-based, graph-based, hypergraph-based and voting-based methods. In this paper, a weighted cumulative voting-based aggregation algorithm (W-CVAA) was developed. The MDL Drug Data Report (MDDR) benchmark chemical dataset was used in the experiments and represented by the AlogP and ECPF_4 descriptors. The results from the clustering methods were evaluated by the ability of the clustering to separate biologically active molecules in each cluster from inactive ones using different criteria, and the effectiveness of the consensus clustering was compared to that of Ward's method, which is the current standard clustering method in chemoinformatics. This study indicated that weighted voting-based consensus clustering can overcome the limitations of the existing voting-based methods and improve the effectiveness of combining multiple clusterings of chemical structures.
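    The co-association matrix mentioned above as one family of consensus clustering methods can be illustrated with a minimal sketch. This is not the W-CVAA algorithm of the paper; it only shows the generic idea of counting how often two items share a cluster across several clusterings. The function name and the three example partitions are hypothetical.

```python
def co_association(partitions):
    """Return an n x n matrix whose (i, j) entry is the fraction of
    partitions that place items i and j in the same cluster."""
    n = len(partitions[0])
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        if len(labels) != n:
            raise ValueError("all partitions must label the same items")
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    m = len(partitions)
    return [[c / m for c in row] for row in counts]


# Three hypothetical clusterings of five compounds; label values are
# arbitrary and need not match between clusterings.
partitions = [
    [0, 0, 1, 1, 2],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1],
]
C = co_association(partitions)
```

    Entries close to 1 mark pairs that nearly every base clustering groups together. A consensus partition is typically obtained by clustering this matrix again, for example by treating 1 - C[i][j] as a distance.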

  13. Comparing clustering models in bank customers: Based on Fuzzy relational clustering approach

    Directory of Open Access Journals (Sweden)

    Ayad Hendalianpour

    2016-11-01

    Clustering is a useful way to explore data structures and has been employed in many fields. It organizes a set of objects into groups called clusters, such that the objects within one cluster are highly similar to each other and dissimilar to the objects in other clusters. The K-mean, C-mean, Fuzzy C-mean and Kernel K-mean algorithms are the most popular clustering algorithms because they are easy to implement and fast, but in some cases these algorithms cannot be used. In this paper, a hybrid model for customer clustering is therefore presented and applied to five banks of Fars Province, Shiraz, Iran. The fuzzy relation among customers is defined using their features, described by linguistic and quantitative variables. The customers of the banks are then grouped according to the K-mean, C-mean, Fuzzy C-mean and Kernel K-mean algorithms and the proposed Fuzzy Relation Clustering (FRC) algorithm. The aim of this paper is to show how to choose the best clustering algorithms based on density-based clustering and to present a new clustering algorithm for both crisp and fuzzy variables. Finally, we apply the proposed approach to five datasets of customer segmentation in banks. The results show the accuracy and high performance of FRC compared with other clustering methods.

  14. Profiling physical activity motivation based on self-determination theory: a cluster analysis approach.

    Science.gov (United States)

    Friederichs, Stijn Ah; Bolman, Catherine; Oenema, Anke; Lechner, Lilian

    2015-01-01

    In order to promote physical activity uptake and maintenance in individuals who do not comply with physical activity guidelines, it is important to increase our understanding of physical activity motivation among this group. The present study aimed to examine motivational profiles in a large sample of adults who do not comply with physical activity guidelines. The sample for this study consisted of 2473 individuals (31.4% male; age 44.6 ± 12.9). In order to generate motivational profiles based on motivational regulation, a cluster analysis was conducted. One-way analyses of variance were then used to compare the clusters in terms of demographics, physical activity level, motivation to be active and subjective experience while being active. Three motivational clusters were derived based on motivational regulation scores: a low motivation cluster, a controlled motivation cluster and an autonomous motivation cluster. These clusters differed significantly from each other with respect to physical activity behavior, motivation to be active and subjective experience while being active. Overall, the autonomous motivation cluster displayed more favorable characteristics compared to the other two clusters. The results of this study provide additional support for the importance of autonomous motivation in the context of physical activity behavior. The three derived clusters may be relevant in the context of physical activity interventions as individuals within the different clusters might benefit most from different intervention approaches. In addition, this study shows that cluster analysis is a useful method for differentiating between motivational profiles in large groups of individuals who do not comply with physical activity guidelines.

  15. INTERSECTION DETECTION BASED ON QUALITATIVE SPATIAL REASONING ON STOPPING POINT CLUSTERS

    Directory of Open Access Journals (Sweden)

    S. Zourlidou

    2016-06-01

    The purpose of this research is to propose and test a method for detecting intersections by analysing collectively acquired trajectories of moving vehicles. Instead of solely relying on the geometric features of the trajectories, such as heading changes, which may indicate turning points and consequently intersections, we extract semantic features of the trajectories in the form of sequences of stops and moves. Under this spatiotemporal prism, the extracted semantic information which indicates where vehicles stop can reveal important locations, such as junctions. The advantage of the proposed approach in comparison with existing turning-points oriented approaches is that it can detect intersections even when not all the crossing road segments are sampled and therefore no turning points are observed in the trajectories. The challenge with this approach is that, first of all, not all vehicles stop at the same location, so the stop-location is blurred along the direction of the road; this, secondly, leads to the effect that nearby junctions can induce similar stop-locations. As a first step, a density-based clustering is applied on the layer of stop observations and clusters of stop events are found. Representative points of the clusters are determined (one per cluster), and in a last step the existence of an intersection is clarified based on spatial relational cluster reasoning, with which less informative geospatial clusters, in terms of whether a junction exists and where its centre lies, are transformed into more informative ones. Relational reasoning criteria, based on the relative orientation of the clusters with their adjacent ones, are discussed for making sense of the relation that connects them, and finally for forming groups of stop events that belong to the same junction.

  16. Using Cluster Analysis to Compartmentalize a Large Managed Wetland Based on Physical, Biological, and Climatic Geospatial Attributes.

    Science.gov (United States)

    Hahus, Ian; Migliaccio, Kati; Douglas-Mankin, Kyle; Klarenberg, Geraldine; Muñoz-Carpena, Rafael

    2018-04-27

    Hierarchical and partitional cluster analyses were used to compartmentalize Water Conservation Area 1, a managed wetland within the Arthur R. Marshall Loxahatchee National Wildlife Refuge in southeast Florida, USA, based on physical, biological, and climatic geospatial attributes. Single, complete, average, and Ward's linkages were tested during the hierarchical cluster analyses, with average linkage providing the best results. In general, the partitional method, partitioning around medoids, found clusters that were more evenly sized and more spatially aggregated than those resulting from the hierarchical analyses. However, hierarchical analysis appeared to be better suited to identify outlier regions that were significantly different from other areas. The clusters identified by geospatial attributes were similar to clusters developed for the interior marsh in a separate study using water quality attributes, suggesting that similar factors have influenced variations in both the set of physical, biological, and climatic attributes selected in this study and water quality parameters. However, geospatial data allowed further subdivision of several interior marsh clusters identified from the water quality data, potentially indicating zones with important differences in function. Identification of these zones can be useful to managers and modelers by informing the distribution of monitoring equipment and personnel as well as delineating regions that may respond similarly to future changes in management or climate.

  17. A Novel Cluster Head Selection Algorithm Based on Fuzzy Clustering and Particle Swarm Optimization.

    Science.gov (United States)

    Ni, Qingjian; Pan, Qianqian; Du, Huimin; Cao, Cen; Zhai, Yuqing

    2017-01-01

    An important objective of a wireless sensor network is to prolong the network life cycle, and topology control is of great significance for extending the network life cycle. Based on previous work, for cluster head selection in hierarchical topology control, we propose a solution based on fuzzy clustering preprocessing and particle swarm optimization. More specifically, first, a fuzzy clustering algorithm is used to perform an initial clustering of the sensor nodes according to their geographical locations, where a sensor node belongs to a cluster with a determined probability, and the number of initial clusters is analyzed and discussed. Furthermore, the fitness function is designed considering both the energy consumption and distance factors of the wireless sensor network. Finally, the cluster head nodes in the hierarchical topology are determined based on the improved particle swarm optimization. Experimental results show that, compared with traditional methods, the proposed method reduces the mortality rate of nodes and extends the network life cycle.

  18. DEMARCATE: Density-based magnetic resonance image clustering for assessing tumor heterogeneity in cancer.

    Science.gov (United States)

    Saha, Abhijoy; Banerjee, Sayantan; Kurtek, Sebastian; Narang, Shivali; Lee, Joonsang; Rao, Ganesh; Martinez, Juan; Bharath, Karthik; Rao, Arvind U K; Baladandayuthapani, Veerabhadran

    2016-01-01

    Tumor heterogeneity is a crucial area of cancer research wherein inter- and intra-tumor differences are investigated to assess and monitor disease development and progression, especially in cancer. The proliferation of imaging and linked genomic data has enabled us to evaluate tumor heterogeneity on multiple levels. In this work, we examine magnetic resonance imaging (MRI) in patients with brain cancer to assess image-based tumor heterogeneity. Standard approaches to this problem use scalar summary measures (e.g., intensity-based histogram statistics) that do not adequately capture the complete and finer scale information in the voxel-level data. In this paper, we introduce a novel technique, DEMARCATE (DEnsity-based MAgnetic Resonance image Clustering for Assessing Tumor hEterogeneity) to explore the entire tumor heterogeneity density profiles (THDPs) obtained from the full tumor voxel space. THDPs are smoothed representations of the probability density function of the tumor images. We develop tools for analyzing such objects under the Fisher-Rao Riemannian framework that allows us to construct metrics for THDP comparisons across patients, which can be used in conjunction with standard clustering approaches. Our analyses of The Cancer Genome Atlas (TCGA) based Glioblastoma dataset reveal two significant clusters of patients with marked differences in tumor morphology, genomic characteristics and prognostic clinical outcomes. In addition, we see enrichment of image-based clusters with known molecular subtypes of glioblastoma multiforme, which further validates our representation of tumor heterogeneity and subsequent clustering techniques.

  19. Cluster Physics with Merging Galaxy Clusters

    Directory of Open Access Journals (Sweden)

    Sandor M. Molnar

    2016-02-01

    Collisions between galaxy clusters provide a unique opportunity to study matter in a parameter space which cannot be explored in our laboratories on Earth. In the standard $\Lambda$CDM model, where the total density is dominated by the cosmological constant ($\Lambda$) and the matter density by cold dark matter (CDM), structure formation is hierarchical, and clusters grow mostly by merging. Mergers of two massive clusters are the most energetic events in the universe after the Big Bang, hence they provide a unique laboratory to study cluster physics. The two main mass components in clusters behave differently during collisions: the dark matter is nearly collisionless, responding only to gravity, while the gas is subject to pressure forces and dissipation, and shocks and turbulence are developed during collisions. In the present contribution we review the different methods used to derive the physical properties of merging clusters. Different physical processes leave their signatures on different wavelengths, thus our review is based on a multifrequency analysis. In principle, the best way to analyze multifrequency observations of merging clusters is to model them using N-body/HYDRO numerical simulations. We discuss the results of such detailed analyses. New high spatial and spectral resolution ground and space based telescopes will come online in the near future. Motivated by these new opportunities, we briefly discuss methods which will be feasible in the near future in studying merging clusters.

  20. Structure based alignment and clustering of proteins (STRALCP)

    Science.gov (United States)

    Zemla, Adam T.; Zhou, Carol E.; Smith, Jason R.; Lam, Marisa W.

    2013-06-18

    Disclosed are computational methods of clustering a set of protein structures based on local and pair-wise global similarity values. Pair-wise local and global similarity values are generated based on pair-wise structural alignments for each protein in the set of protein structures. Initially, the protein structures are clustered based on pair-wise local similarity values. The protein structures are then clustered based on pair-wise global similarity values. For each given cluster both a representative structure and spans of conserved residues are identified. The representative protein structure is used to assign newly-solved protein structures to a group. The spans are used to characterize conservation and assign a "structural footprint" to the cluster.

  1. Improving local clustering based top-L link prediction methods via asymmetric link clustering information

    Science.gov (United States)

    Wu, Zhihao; Lin, Youfang; Zhao, Yiji; Yan, Hongyan

    2018-02-01

    Networks can represent a wide range of complex systems, such as social, biological and technological systems. Link prediction is one of the most important problems in network analysis, and has attracted much research interest recently. Many link prediction methods have been proposed to solve this problem with various techniques. Clustering information plays an important role in solving the link prediction problem. In the previous literature, the node clustering coefficient appears frequently in many link prediction methods. However, the node clustering coefficient is limited in describing the role of a common neighbor in different local networks, because it cannot distinguish the different clustering abilities of a node with respect to different node pairs. In this paper, we shift our focus from nodes to links, and propose the concept of the asymmetric link clustering (ALC) coefficient. Further, we improve three node clustering based link prediction methods via the concept of ALC. The experimental results demonstrate that ALC-based methods outperform node clustering based methods, achieving especially remarkable improvements on food web, hamster friendship and Internet networks. Besides, compared with other methods, the performance of ALC-based methods is very stable in both globalized and personalized top-L link prediction tasks.
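    For reference, the standard node clustering coefficient that the abstract takes as its starting point can be computed as below. This sketch covers only that textbook quantity for an undirected graph; it does not implement the ALC coefficient, whose definition is not given in the abstract. The adjacency dictionary and node names are hypothetical.

```python
def clustering_coefficient(adj, node):
    """Local clustering coefficient of `node` in an undirected graph.

    `adj` maps each node to the set of its neighbors. The result is the
    number of links among the neighbors divided by the number of possible
    links among them, k * (k - 1) / 2 for k neighbors.
    """
    neighbors = adj[node]
    k = len(neighbors)
    if k < 2:
        return 0.0  # fewer than two neighbors: no neighbor pair can be linked
    # Count each linked neighbor pair once by requiring u < v.
    links = sum(1 for u in neighbors for v in neighbors
                if u < v and v in adj[u])
    return 2.0 * links / (k * (k - 1))


# Hypothetical toy graph: a triangle a-b-c with a pendant node d attached to a.
adj = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
coefficients = {n: clustering_coefficient(adj, n) for n in adj}
```

    In this toy graph, node a has a single coefficient regardless of which pair of its neighbors is being scored, which is the limitation the abstract describes.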

  2. A Linux cluster for between-pulse magnetic equilibrium reconstructions and other processor bound analyses

    International Nuclear Information System (INIS)

    Peng, Q.; Groebner, R. J.; Lao, L. L.; Schachter, J.; Schissel, D. P.; Wade, M. R.

    2001-01-01

    A 12-processor Linux PC cluster has been installed to perform between-pulse magnetic equilibrium reconstructions during tokamak operations using the EFIT code written in FORTRAN. The MPICH package implementing message passing interface is employed by EFIT for data distribution and communication. The new system calculates equilibria eight times faster than the previous system yielding a complete equilibrium time history on a 25 ms time scale 4 min after the pulse ends. A graphical interface is provided for users to control the time resolution and the type of EFITs. The next analysis to benefit from the cluster is CERQUICK written in IDL for ion temperature profile analysis. The plan is to expand the cluster so that a full profile analysis (Te, Ti, ne, Vr, Zeff) can be made available between pulses, which lays the ground work for Kinetic EFIT and/or ONETWO power balance analyses

  3. A Versatile Software Package for Inter-subject Correlation Based Analyses of fMRI

    Directory of Open Access Journals (Sweden)

    Jukka-Pekka eKauppi

    2014-01-01

    In the inter-subject correlation (ISC) based analysis of the functional magnetic resonance imaging (fMRI) data, the extent of shared processing across subjects during the experiment is determined by calculating correlation coefficients between the fMRI time series of the subjects in the corresponding brain locations. This implies that ISC can be used to analyze fMRI data without explicitly modelling the stimulus and thus ISC is a potential method to analyze fMRI data acquired under complex naturalistic stimuli. Despite the suitability of the ISC based approach for analyzing complex fMRI data, no generic software tools have been made available for this purpose, limiting the widespread use of ISC based analysis techniques in the neuroimaging community. In this paper, we present a graphical user interface (GUI) based software package, ISC Toolbox, implemented in Matlab for computing various ISC based analyses. Many advanced computations such as comparison of ISCs between different stimuli, time window ISC, and inter-subject phase synchronization are supported by the toolbox. The analyses are coupled with re-sampling based statistical inference. The ISC based analyses are data and computation intensive and the ISC toolbox is equipped with mechanisms to execute the parallel computations in a cluster environment automatically and with an automatic detection of the cluster environment in use. Currently, SGE-based (Oracle Grid Engine, Son of a Grid Engine or Open Grid Scheduler) and Slurm environments are supported. In this paper, we present a detailed account of the methods behind the ISC Toolbox and the implementation of the toolbox, and demonstrate the possible use of the toolbox by summarizing selected example applications. We also report computation time experiments using both a single desktop computer and two grid environments, demonstrating that parallelization effectively reduces the computing time. The ISC Toolbox is available at https://code.google.com/p/isc-toolbox/.

  4. A versatile software package for inter-subject correlation based analyses of fMRI.

    Science.gov (United States)

    Kauppi, Jukka-Pekka; Pajula, Juha; Tohka, Jussi

    2014-01-01

    In the inter-subject correlation (ISC) based analysis of the functional magnetic resonance imaging (fMRI) data, the extent of shared processing across subjects during the experiment is determined by calculating correlation coefficients between the fMRI time series of the subjects in the corresponding brain locations. This implies that ISC can be used to analyze fMRI data without explicitly modeling the stimulus and thus ISC is a potential method to analyze fMRI data acquired under complex naturalistic stimuli. Despite the suitability of the ISC based approach for analyzing complex fMRI data, no generic software tools have been made available for this purpose, limiting the widespread use of ISC based analysis techniques in the neuroimaging community. In this paper, we present a graphical user interface (GUI) based software package, ISC Toolbox, implemented in Matlab for computing various ISC based analyses. Many advanced computations such as comparison of ISCs between different stimuli, time window ISC, and inter-subject phase synchronization are supported by the toolbox. The analyses are coupled with re-sampling based statistical inference. The ISC based analyses are data and computation intensive and the ISC toolbox is equipped with mechanisms to execute the parallel computations in a cluster environment automatically and with an automatic detection of the cluster environment in use. Currently, SGE-based (Oracle Grid Engine, Son of a Grid Engine, or Open Grid Scheduler) and Slurm environments are supported. In this paper, we present a detailed account of the methods behind the ISC Toolbox and the implementation of the toolbox, and demonstrate the possible use of the toolbox by summarizing selected example applications. We also report computation time experiments using both a single desktop computer and two grid environments, demonstrating that parallelization effectively reduces the computing time. The ISC Toolbox is available at https://code.google.com/p/isc-toolbox/
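    The core quantity described above, correlation between subjects' time series at one brain location, can be sketched in a few lines. This is an illustration only and not the ISC Toolbox's Matlab implementation: it assumes that averaging the Pearson correlation over all subject pairs is an acceptable summary, and the function names and toy time series are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def mean_pairwise_isc(series_by_subject):
    """Average the correlation over every pair of subjects for one location."""
    pairs = list(combinations(series_by_subject, 2))
    return sum(pearson(x, y) for x, y in pairs) / len(pairs)


# Hypothetical time series of three subjects at a single brain location.
subjects = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],   # perfectly correlated with the first subject
    [4.0, 3.0, 2.0, 1.0],   # perfectly anti-correlated with the first subject
]
isc = mean_pairwise_isc(subjects)
```

    With N subjects there are N(N - 1)/2 pairs per location, and the computation is repeated at every brain location, which is consistent with the abstract's remark that ISC analyses are computation intensive and benefit from parallel execution.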

  5. Cluster-based global firms' use of local capabilities

    DEFF Research Database (Denmark)

    Andersen, Poul Houman; Bøllingtoft, Anne

    2011-01-01

    Purpose – Despite growing interest in clusters role for the global competitiveness of firms, there has been little research into how globalization affects cluster-based firms’ (CBFs) use of local knowledge resources and the combination of local and global knowledge used. Using the cluster......’s knowledge base as a mediating variable, the purpose of this paper is to examine how globalization affected the studied firms’ use of local cluster-based knowledge, integration of local and global knowledge, and networking capabilities. Design/methodology/approach – Qualitative case studies of nine firms...... in three clusters strongly affected by increasing global division of labour. Findings – The paper suggests that globalization has affected how firms use local resources and combine local and global knowledge. Unexpectedly, clustered firms with explicit procedures and established global fora for exchanging...

  6. blockcluster: An R Package for Model-Based Co-Clustering

    Directory of Open Access Journals (Sweden)

    Parmeet Singh Bhatia

    2017-02-01

    Full Text Available Simultaneous clustering of rows and columns, usually designated by bi-clustering, co-clustering or block clustering, is an important technique in two-way data analysis. A new standard and efficient approach has recently been proposed based on the latent block model (Govaert and Nadif 2003), which takes into account the block clustering problem on both the individual and variable sets. This article presents our R package blockcluster for co-clustering of binary, contingency and continuous data based on these very models. In this document, we will give a brief review of the model-based block clustering methods, and we will show how the R package blockcluster can be used for co-clustering.

  7. Innovative Development of Building Materials Industry of the Region Based on the Cluster Approach

    Directory of Open Access Journals (Sweden)

    Mottaeva Asiiat

    2016-01-01

    Full Text Available The article discusses the innovative development of a region's building materials industry based on the cluster approach. It establishes the significance of regional cluster development in the building materials industry as an effective way to achieve an innovative breakthrough in the region, since supporting the formation and development of cluster structures may be an important part of strategies for strengthening innovation activity. It analyses the current state of innovation in the region's building materials industry based on the cluster approach. The study revealed a direct correlation between involvement in innovative activities on a cluster basis and the level of development of the building materials industry. The research identified the factors that determine the innovation process; their systematization and classification determine the sustainable functioning of the building materials industry in a period of active innovation. A grouping of innovations for the construction industry is proposed that takes into account industry-specific characteristics reflecting modern trends of scientific and technological progress in construction. The significance of the study lies in the fact that its proposals and practical recommendations can be used in forming a mechanism for the innovative development of the building materials industry and of the overall construction complex of Russian regions through the creation of construction clusters.

  8. Mechanistic study on lowering the sensitivity of positive atmospheric pressure photoionization mass spectrometric analyses: size-dependent reactivity of solvent clusters.

    Science.gov (United States)

    Ahmed, Arif; Choi, Cheol Ho; Kim, Sunghwan

    2015-11-15

    Understanding the mechanism of atmospheric pressure photoionization (APPI) is important for studies employing APPI liquid chromatography/mass spectrometry (LC/MS). In this study, the APPI mechanism for polyaromatic hydrocarbon (PAH) compounds dissolved in toluene and methanol or water mixture was investigated by use of MS analysis and quantum mechanical simulation. In particular, four different mechanisms that could contribute to the signal reduction were considered based on a combination of MS data and quantum mechanical calculations. The APPI mechanism is clarified by combining MS data and density functional theory (DFT) calculations. To obtain MS data, a positive-mode (+) APPI Q Exactive Orbitrap mass spectrometer was used to analyze each solution. DFT calculations were performed using the general atomic and molecular electronic structure system (GAMESS). The experimental results indicated that methanol significantly reduced the signal in (+) APPI, but no significant signal reduction was observed when water was used as a co-solvent with toluene. The signal reduction is more pronounced for molecular ions than for protonated ions. Therefore, important information about the mechanism of methanol-induced signal reduction in (+) APPI-MS can be gained due to its negative impact on APPI efficiency. The size-dependent reactivity of methanol clusters ((CH3OH)n, n = 1-8) is an important factor in determining the sensitivity of (+) APPI-MS analyses. Clusters can compete with toluene radical ions for electrons. The reactivity increases as the sizes of the methanol clusters increase, and this effect can be caused by the size-dependent ionization energy of the solvent clusters. The resulting increase in cluster reactivity explains the flow rate and temperature-dependent signal reduction observed in the analytes. Based on the results presented here, minimizing the sizes of methanol clusters can improve the sensitivity of LC/(+)-APPI-MS. Copyright © 2015 John

  9. Co-clustering models, algorithms and applications

    CERN Document Server

    Govaert, Gérard

    2013-01-01

    Cluster or co-cluster analyses are important tools in a variety of scientific areas. The introduction of this book presents a state of the art of already well-established, as well as more recent methods of co-clustering. The authors mainly deal with the two-mode partitioning under different approaches, but pay particular attention to a probabilistic approach. Chapter 1 concerns clustering in general and the model-based clustering in particular. The authors briefly review the classical clustering methods and focus on the mixture model. They present and discuss the use of different mixture

  10. Cluster-based DBMS Management Tool with High-Availability

    Directory of Open Access Journals (Sweden)

    Jae-Woo Chang

    2005-02-01

    Full Text Available Management tools for monitoring and managing cluster-based DBMSs have been little studied. We therefore design and implement a cluster-based DBMS management tool with high availability that monitors the status of nodes in a cluster system as well as the status of DBMS instances in a node. The tool enables users to recognize a single virtual system image and provides them with the status of all the nodes and resources in the system by using a graphic user interface (GUI). By using a load balancer, our management tool can increase the performance of a cluster-based DBMS and overcome the limitations of existing parallel DBMSs.

  11. A Flocking Based algorithm for Document Clustering Analysis

    Energy Technology Data Exchange (ETDEWEB)

    Cui, Xiaohui [ORNL; Gao, Jinzhu [ORNL; Potok, Thomas E [ORNL

    2006-01-01

    Social animals or insects in nature often exhibit a form of emergent collective behavior known as flocking. In this paper, we present a novel Flocking based approach for document clustering analysis. Our Flocking clustering algorithm uses stochastic and heuristic principles discovered from observing bird flocks or fish schools. Unlike other partition clustering algorithms such as K-means, the Flocking based algorithm does not require initial partitional seeds. The algorithm generates a clustering of a given set of data through the embedding of the high-dimensional data items on a two-dimensional grid for easy clustering result retrieval and visualization. Inspired by the self-organized behavior of bird flocks, we represent each document object with a flock boid. The simple local rules followed by each flock boid result in the entire document flock generating complex global behaviors, which eventually result in a clustering of the documents. We evaluate the efficiency of our algorithm with both a synthetic dataset and a real document collection that includes 100 news articles collected from the Internet. Our results show that the Flocking clustering algorithm achieves better performance compared to the K-means and the Ant clustering algorithm for real document clustering.

  12. Are clusters of dietary patterns and cluster membership stable over time? Results of a longitudinal cluster analysis study.

    Science.gov (United States)

    Walthouwer, Michel Jean Louis; Oenema, Anke; Soetens, Katja; Lechner, Lilian; de Vries, Hein

    2014-11-01

    Developing nutrition education interventions based on clusters of dietary patterns can only be done adequately when it is clear if distinctive clusters of dietary patterns can be derived and reproduced over time, if cluster membership is stable, and if it is predictable which type of people belong to a certain cluster. Hence, this study aimed to: (1) identify clusters of dietary patterns among Dutch adults, (2) test the reproducibility of these clusters and stability of cluster membership over time, and (3) identify sociodemographic predictors of cluster membership and cluster transition. This study had a longitudinal design with online measurements at baseline (N=483) and 6 months follow-up (N=379). Dietary intake was assessed with a validated food frequency questionnaire. A hierarchical cluster analysis was performed, followed by a K-means cluster analysis. Multinomial logistic regression analyses were conducted to identify the sociodemographic predictors of cluster membership and cluster transition. At baseline and follow-up, a comparable three-cluster solution was derived, distinguishing a healthy, moderately healthy, and unhealthy dietary pattern. Male and lower educated participants were significantly more likely to have a less healthy dietary pattern. Further, 251 (66.2%) participants remained in the same cluster, 45 (11.9%) participants changed to an unhealthier cluster, and 83 (21.9%) participants shifted to a healthier cluster. Men and people living alone were significantly more likely to shift toward a less healthy dietary pattern. Distinctive clusters of dietary patterns can be derived. Yet, cluster membership is unstable and only few sociodemographic factors were associated with cluster membership and cluster transition. These findings imply that clusters based on dietary intake may not be suitable as a basis for nutrition education interventions. Copyright © 2014 Elsevier Ltd. All rights reserved.

  13. A Network-Based Algorithm for Clustering Multivariate Repeated Measures Data

    Science.gov (United States)

    Koslovsky, Matthew; Arellano, John; Schaefer, Caroline; Feiveson, Alan; Young, Millennia; Lee, Stuart

    2017-01-01

    The National Aeronautics and Space Administration (NASA) Astronaut Corps is a unique occupational cohort for which vast amounts of measures data have been collected repeatedly in research or operational studies pre-, in-, and post-flight, as well as during multiple clinical care visits. In exploratory analyses aimed at generating hypotheses regarding physiological changes associated with spaceflight exposure, such as impaired vision, it is of interest to identify anomalies and trends across these expansive datasets. Multivariate clustering algorithms for repeated measures data may help parse the data to identify homogeneous groups of astronauts that have higher risks for a particular physiological change. However, available clustering methods may not be able to accommodate the complex data structures found in NASA data, since the methods often rely on strict model assumptions, require equally-spaced and balanced assessment times, cannot accommodate missing data or differing time scales across variables, and cannot process continuous and discrete data simultaneously. To fill this gap, we propose a network-based, multivariate clustering algorithm for repeated measures data that can be tailored to fit various research settings. Using simulated data, we demonstrate how our method can be used to identify patterns in complex data structures found in practice.

  14. A Clustering-Oriented Closeness Measure Based on Neighborhood Chain and Its Application in the Clustering Ensemble Framework Based on the Fusion of Different Closeness Measures

    Directory of Open Access Journals (Sweden)

    Shaoyi Liang

    2017-09-01

    Full Text Available Closeness measures are crucial to clustering methods. In most traditional clustering methods, the closeness between data points or clusters is measured by the geometric distance alone. These metrics quantify the closeness only based on the concerned data points’ positions in the feature space, and they might cause problems when dealing with clustering tasks having arbitrary cluster shapes and different cluster densities. In this paper, we first propose a novel Closeness Measure between data points based on the Neighborhood Chain (CMNC). Instead of using geometric distances alone, CMNC measures the closeness between data points by quantifying the difficulty for one data point to reach another through a chain of neighbors. Furthermore, based on CMNC, we also propose a clustering ensemble framework that combines CMNC and geometric-distance-based closeness measures in order to utilize the advantages of both. In this framework, the “bad data points” that are hard to cluster correctly are identified; then different closeness measures are applied to different types of data points to get the unified clustering results. With the fusion of different closeness measures, the framework can get not only better clustering results in complicated clustering tasks, but also higher efficiency.

  15. Novel density-based and hierarchical density-based clustering algorithms for uncertain data.

    Science.gov (United States)

    Zhang, Xianchao; Liu, Han; Zhang, Xiaotong

    2017-09-01

    Uncertain data has posed a great challenge to traditional clustering algorithms. Recently, several algorithms have been proposed for clustering uncertain data, and among them density-based techniques seem promising for handling data uncertainty. However, some issues like losing uncertain information, high time complexity and nonadaptive threshold have not been addressed well in the previous density-based algorithm FDBSCAN and hierarchical density-based algorithm FOPTICS. In this paper, we first propose a novel density-based algorithm PDBSCAN, which improves the previous FDBSCAN in the following aspects: (1) it employs a more accurate method to compute the probability that the distance between two uncertain objects is less than or equal to a boundary value, instead of the sampling-based method in FDBSCAN; (2) it introduces new definitions of probability neighborhood, support degree, core object probability, direct reachability probability, thus reducing the complexity and solving the issue of nonadaptive threshold (for core object judgement) in FDBSCAN. Then, we modify the algorithm PDBSCAN to an improved version (PDBSCANi), by using a better cluster assignment strategy to ensure that every object will be assigned to the most appropriate cluster, thus solving the issue of nonadaptive threshold (for direct density reachability judgement) in FDBSCAN. Furthermore, as PDBSCAN and PDBSCANi have difficulty clustering uncertain data with non-uniform cluster density, we propose a novel hierarchical density-based algorithm POPTICS by extending the definitions of PDBSCAN, adding new definitions of fuzzy core distance and fuzzy reachability distance, and employing a new clustering framework. POPTICS can reveal the cluster structures of the datasets with different local densities in different regions better than PDBSCAN and PDBSCANi, and it addresses the issues in FOPTICS. Experimental results demonstrate the superiority of our proposed algorithms over the existing
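
    The quantity at the centre of this comparison, the probability that the distance between two uncertain objects is at most a boundary value, can be sketched in its simplest sampling-style form. The abstract does not describe PDBSCAN's more accurate method, so the code below is not that method; it assumes each uncertain object is represented by a finite set of equally likely sample points, and all names are hypothetical.

```python
from math import dist  # Euclidean distance, Python 3.8+


def prob_within(samples_a, samples_b, eps):
    """Estimate P(distance(A, B) <= eps) for two uncertain objects.

    Assumption of this sketch: each object is a list of equally
    likely sample points, and the two objects are independent.
    """
    pairs = [(p, q) for p in samples_a for q in samples_b]
    hits = sum(1 for p, q in pairs if dist(p, q) <= eps)
    return hits / len(pairs)


# Hypothetical uncertain objects, two sample points each.
object_a = [(0.0, 0.0), (1.0, 0.0)]
object_b = [(0.0, 1.0), (3.0, 0.0)]
p = prob_within(object_a, object_b, eps=1.5)
```

    Two of the four sample pairs lie within the boundary value, so the estimate is 0.5. A density-based algorithm would compare such a probability against a threshold when deciding neighborhood membership.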

  16. Beyond Apprenticeship: Knowledge Brokers and Sustainability of Apprentice-Based Clusters

    Directory of Open Access Journals (Sweden)

    Huasheng Zhu

    2016-12-01

    Full Text Available Knowledge learning and diffusion have long been discussed in the literature on the dynamics of industrial clusters, but recent literature provides little evidence for how different actors serve as knowledge brokers in the upgrading process of apprentice-based clusters, and does not dynamically consider how to preserve the sustainability of these clusters. This paper uses empirical evidence from an antique furniture manufacturing cluster in Xianyou, Fujian Province, in southeastern China, to examine the growth trajectory of the knowledge learning system of an antique furniture manufacturing cluster. It appears that the apprentice-based learning system is crucial during early stages of the cluster evolution, but later becomes complemented and relatively substituted by the role of both local governments and focal outsiders. This finding addresses the context of economic transformation and provides empirical insights into knowledge acquisition in apprentice-based clusters to question the rationality based on European and North American cases, and to provide a broader perspective for policy makers to trigger and sustain the development of apprentice-based clusters.

  17. Prioritizing the risk of plant pests by clustering methods; self-organising maps, k-means and hierarchical clustering

    Directory of Open Access Journals (Sweden)

    Susan Worner

    2013-09-01

    Full Text Available For greater preparedness, pest risk assessors are required to prioritise long lists of pest species with potential to establish and cause significant impact in an endangered area. Such prioritization is often qualitative, subjective, and sometimes biased, relying mostly on expert and stakeholder consultation. In recent years, cluster based analyses have been used to investigate regional pest species assemblages or pest profiles to indicate the risk of new organism establishment. Such an approach is based on the premise that the co-occurrence of well-known global invasive pest species in a region is not random, and that the pest species profile or assemblage integrates complex functional relationships that are difficult to tease apart. In other words, the assemblage can help identify and prioritise species that pose a threat in a target region. A computational intelligence method called a Kohonen self-organizing map (SOM), a type of artificial neural network, was the first clustering method applied to analyse assemblages of invasive pests. The SOM is a well-known dimension reduction and visualization method especially useful for high dimensional data that more conventional clustering methods may not analyse suitably. Like all clustering algorithms, the SOM can give details of clusters that identify regions with similar pest assemblages, possible donor and recipient regions. More importantly, however, SOM connection weights that result from the analysis can be used to rank the strength of association of each species within each regional assemblage. Species with high weights that are not already established in the target region are identified as high risk. However, the SOM analysis is only the first step in a process to assess risk to be used alongside or incorporated within other measures. Here we illustrate the application of SOM analyses in a range of contexts in invasive species risk assessment, and discuss other clustering methods such as k
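
    The ranking step described above (high SOM weight, not yet established in the target region) can be sketched as follows. The SOM training itself is not shown; the weights, species names, and function name are hypothetical placeholders, assumed to come from an already trained map.

```python
def rank_unestablished(weights, established):
    """Rank species absent from the target region by SOM weight.

    weights: species -> connection weight for the target region's
             cluster (hypothetical values from a trained SOM).
    established: species already present in the target region.
    """
    candidates = {s: w for s, w in weights.items() if s not in established}
    return sorted(candidates, key=candidates.get, reverse=True)


# Hypothetical weights for one target region.
weights = {
    "species_a": 0.91,
    "species_b": 0.15,
    "species_c": 0.78,
    "species_d": 0.40,
}
established = {"species_a"}
ranking = rank_unestablished(weights, established)
```

    Here species_a has the highest weight but is already established, so species_c heads the list of candidate threats. As the abstract notes, such a ranking is only a first step alongside other risk measures.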

  18. Cluster Synchronization of Diffusively Coupled Nonlinear Systems: A Contraction-Based Approach

    Science.gov (United States)

    Aminzare, Zahra; Dey, Biswadip; Davison, Elizabeth N.; Leonard, Naomi Ehrich

    2018-04-01

    Finding the conditions that foster synchronization in networked nonlinear systems is critical to understanding a wide range of biological and mechanical systems. However, the conditions proved in the literature for synchronization in nonlinear systems with linear coupling, such as has been used to model neuronal networks, are in general not strict enough to accurately determine the system behavior. We leverage contraction theory to derive new sufficient conditions for cluster synchronization in terms of the network structure, for a network where the intrinsic nonlinear dynamics of each node may differ. Our result requires that network connections satisfy a cluster-input-equivalence condition, and we explore the influence of this requirement on network dynamics. For application to networks of nodes with FitzHugh-Nagumo dynamics, we show that our new sufficient condition is tighter than those found in previous analyses that used smooth or nonsmooth Lyapunov functions. Improving the analytical conditions for when cluster synchronization will occur based on network configuration is a significant step toward facilitating understanding and control of complex networked systems.

  19. Cluster-based analysis of multi-model climate ensembles

    Science.gov (United States)

    Hyde, Richard; Hossaini, Ryan; Leeson, Amber A.

    2018-06-01

    Clustering - the automated grouping of similar data - can provide powerful and unique insight into large and complex data sets, in a fast and computationally efficient manner. While clustering has been used in a variety of fields (from medical image processing to economics), its application within atmospheric science has been fairly limited to date, and the potential benefits of the application of advanced clustering techniques to climate data (both model output and observations) have yet to be fully realised. In this paper, we explore the specific application of clustering to a multi-model climate ensemble. We hypothesise that clustering techniques can provide (a) a flexible, data-driven method of testing model-observation agreement and (b) a mechanism with which to identify model development priorities. We focus our analysis on chemistry-climate model (CCM) output of tropospheric ozone - an important greenhouse gas - from the recent Atmospheric Chemistry and Climate Model Intercomparison Project (ACCMIP). Tropospheric column ozone from the ACCMIP ensemble was clustered using the Data Density based Clustering (DDC) algorithm. We find that a multi-model mean (MMM) calculated using members of the most-populous cluster identified at each location offers a reduction of up to ~20 % in the global absolute mean bias between the MMM and an observed satellite-based tropospheric ozone climatology, with respect to a simple, all-model MMM. On a spatial basis, the bias is reduced at ~62 % of all locations, with the largest bias reductions occurring in the Northern Hemisphere - where ozone concentrations are relatively large. However, the bias is unchanged at 9 % of all locations and increases at 29 %, particularly in the Southern Hemisphere. The latter demonstrates that although cluster-based subsampling acts to remove outlier model data, such data may in fact be closer to observed values in some locations. We further demonstrate that clustering can provide a viable and
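
    The cluster-based subsampling idea, a multi-model mean taken over the members of the most populous cluster at a location, can be sketched as below. The DDC algorithm is not implemented here; the cluster labels are assumed to be given, and the values, labels, and function name are hypothetical.

```python
from collections import Counter
from statistics import mean


def cluster_mmm(values, labels):
    """Mean over the models in the most populous cluster at one location.

    values: one value per model (e.g. tropospheric column ozone).
    labels: cluster label per model, assumed precomputed by some
            clustering algorithm (DDC in the abstract; not shown).
    """
    top_label, _ = Counter(labels).most_common(1)[0]
    members = [v for v, lab in zip(values, labels) if lab == top_label]
    return mean(members)


# Hypothetical values from five models at a single location.
values = [30.0, 31.0, 29.0, 45.0, 44.0]
labels = [0, 0, 0, 1, 1]

all_model_mmm = mean(values)
subsampled_mmm = cluster_mmm(values, labels)
```

    In this toy case the two outlying models pull the all-model mean upward, while the subsampled mean stays at the majority value. Whether that is closer to observations depends on the location, as the reported bias increases at 29 % of locations indicate.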

  20. Managing distance and covariate information with point-based clustering

    Directory of Open Access Journals (Sweden)

    Peter A. Whigham

    2016-09-01

    Full Text Available Abstract Background: Geographic perspectives of disease and the human condition often involve point-based observations and questions of clustering or dispersion within a spatial context. These problems involve a finite set of point observations and are constrained by a larger, but finite, set of locations where the observations could occur. Developing a rigorous method for pattern analysis in this context requires handling spatial covariates, a method for constrained finite spatial clustering, and addressing bias in geographic distance measures. An approach, based on Ripley’s K and applied to the problem of clustering with deliberate self-harm (DSH), is presented. Methods: Point-based Monte-Carlo simulation of Ripley’s K, accounting for socio-economic deprivation and sources of distance measurement bias, was developed to estimate clustering of DSH at a range of spatial scales. A rotated Minkowski L1 distance metric allowed variation in physical distance and clustering to be assessed. Self-harm data was derived from an audit of 2 years’ emergency hospital presentations (n = 136) in a New Zealand town (population ~50,000). The study area was defined by residential (housing) land parcels representing a finite set of possible point addresses. Results: Area-based deprivation was spatially correlated. Accounting for deprivation and distance bias showed evidence for clustering of DSH for spatial scales up to 500 m with a one-sided 95 % CI, suggesting that social contagion may be present for this urban cohort. Conclusions: Many problems involve finite locations in geographic space that require estimates of distance-based clustering at many scales. A Monte-Carlo approach to Ripley’s K, incorporating covariates and models for distance bias, is crucial when assessing health-related clustering. The case study showed that social network structure defined at the neighbourhood level may account for aspects of neighbourhood clustering of DSH. Accounting for
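
    A bare-bones version of Ripley's K with a rotated L1 distance is sketched below to make the two ingredients concrete. It assumes the common uncorrected estimator K(r) = A / (n (n - 1)) times the number of ordered point pairs within distance r, and omits the Monte-Carlo simulation, deprivation covariates, and finite-location constraint used in the study. The rotation convention and all names are assumptions.

```python
from math import cos, sin, radians


def rotated_l1(p, q, angle_deg=0.0):
    """Manhattan (L1) distance measured in axes rotated by angle_deg."""
    t = radians(angle_deg)
    dx, dy = q[0] - p[0], q[1] - p[1]
    # Rotate the displacement vector, then sum absolute components.
    rx = dx * cos(t) + dy * sin(t)
    ry = -dx * sin(t) + dy * cos(t)
    return abs(rx) + abs(ry)


def ripley_k(points, r, area, distance=rotated_l1):
    """Uncorrected Ripley's K estimate at scale r (no edge correction)."""
    n = len(points)
    close = sum(
        1
        for i, p in enumerate(points)
        for j, q in enumerate(points)
        if i != j and distance(p, q) <= r
    )
    return area * close / (n * (n - 1))


# Hypothetical point pattern in a 10 x 10 study area.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0)]
k_value = ripley_k(points, r=1.5, area=100.0)
```

    In a Monte-Carlo setting, the same statistic would be recomputed for many simulated patterns drawn from the allowed locations, and the observed value compared against the resulting envelope.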

  1. Large-scale Reconstructions and Independent, Unbiased Clustering Based on Morphological Metrics to Classify Neurons in Selective Populations.

    Science.gov (United States)

    Bragg, Elise M; Briggs, Farran

    2017-02-15

    This protocol outlines large-scale reconstructions of neurons combined with the use of independent and unbiased clustering analyses to create a comprehensive survey of the morphological characteristics observed among a selective neuronal population. Combination of these techniques constitutes a novel approach for the collection and analysis of neuroanatomical data. Together, these techniques enable large-scale, and therefore more comprehensive, sampling of selective neuronal populations and establish unbiased quantitative methods for describing morphologically unique neuronal classes within a population. The protocol outlines the use of modified rabies virus to selectively label neurons. G-deleted rabies virus acts like a retrograde tracer following stereotaxic injection into a target brain structure of interest and serves as a vehicle for the delivery and expression of EGFP in neurons. Large numbers of neurons are infected using this technique and express GFP throughout their dendrites, producing "Golgi-like" complete fills of individual neurons. Accordingly, the virus-mediated retrograde tracing method improves upon traditional dye-based retrograde tracing techniques by producing complete intracellular fills. Individual well-isolated neurons spanning all regions of the brain area under study are selected for reconstruction in order to obtain a representative sample of neurons. The protocol outlines procedures to reconstruct cell bodies and complete dendritic arborization patterns of labeled neurons spanning multiple tissue sections. Morphological data, including positions of each neuron within the brain structure, are extracted for further analysis. Standard programming functions were utilized to perform independent cluster analyses and cluster evaluations based on morphological metrics. To verify the utility of these analyses, statistical evaluation of a cluster analysis performed on 160 neurons reconstructed in the thalamic reticular nucleus of the thalamus

  2. Cluster-based Data Gathering in Long-Strip Wireless Sensor Networks

    Directory of Open Access Journals (Sweden)

    FANG, W.

    2012-02-01

    Full Text Available This paper investigates a special class of wireless sensor networks that differ from traditional ones in that the sensor nodes are deployed along narrowly elongated geographical areas and form a long-strip topology. According to hardware capabilities of current sensor nodes, a cluster-based protocol for reliable and efficient data gathering in long-strip wireless sensor networks (LSWSN) is proposed. A well-distributed cluster-based architecture is first formed in the whole network through contention-based cluster head election. Cluster heads are responsible for coordination among the nodes within their clusters and aggregation of their sensory data, as well as transmission of the data to the sink node on behalf of their own clusters. The intra-cluster coordination is based on the traditional TDMA schedule, in which the inter-cluster interference caused by the border nodes is solved by the multi-channel communication technique. The cluster reporting is based on the CSMA contention, in which a connected overlay network is formed by relay nodes to forward the data from the cluster heads through multi-hops to the sink node. The relay nodes are non-uniformly deployed to resolve the energy-hole problem, which is extremely serious in the LSWSN. Extensive simulation results demonstrate the strong performance of the proposed protocol.

  3. X-ray aspects of the DAFT/FADA clusters

    Science.gov (United States)

    Guennou, L.; Durret, F.; Lima Neto, G. B.; Adami, C.

    2012-12-01

    We have undertaken the DAFT/FADA survey with the aim of applying constraints on dark energy based on weak lensing tomography as well as obtaining homogeneous and high quality data for a sample of 91 massive clusters in the redshift range [0.4,0.9] for which there are HST archive data. We have analysed the XMM-Newton data available for 42 of these clusters to derive their X-ray temperatures and luminosities and search for substructures. This study was coupled with a dynamical analysis for the 26 clusters having at least 30 spectroscopic galaxy redshifts in the cluster range. We present preliminary results on the coupled X-ray and dynamical analyses of these clusters.

  4. Family-based clusters of cognitive test performance in familial schizophrenia

    Directory of Open Access Journals (Sweden)

    Partonen Timo

    2004-07-01

    Full Text Available Abstract Background: Cognitive traits derived from neuropsychological test data are considered to be potential endophenotypes of schizophrenia. Previously, these traits have been found to form a valid basis for clustering samples of schizophrenia patients into homogeneous subgroups. We set out to identify such clusters, but unlike previous studies, we included both schizophrenia patients and family members in the cluster analysis. The aim of the study was to detect family clusters with similar cognitive test performance. Methods: Test scores from 54 randomly selected families comprising at least two siblings with schizophrenia spectrum disorders and at least two unaffected family members were included in a complete-linkage cluster analysis with interactive data visualization. Results: A well-performing, an impaired, and an intermediate family cluster emerged from the analysis. While the neuropsychological test scores differed significantly between the clusters, only minor differences were observed in the clinical variables. Conclusions: The visually aided clustering algorithm was successful in identifying family clusters comprising both schizophrenia patients and their relatives. The present classification method may serve as a basis for selecting phenotypically more homogeneous groups of families in subsequent genetic analyses.
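
    Complete-linkage clustering, the method named in the abstract, merges at each step the two clusters whose farthest members are closest. The sketch below shows that rule on made-up two-dimensional scores; it is not the study's software, it omits the interactive visualization, and the data and names are hypothetical.

```python
from math import dist  # Euclidean distance, Python 3.8+


def complete_linkage(points, k):
    """Agglomerative clustering with complete linkage, stopping at k clusters.

    Returns a list of clusters, each a list of indices into points.
    """
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Complete linkage: distance between the farthest pair of members.
        return max(dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > k:
        # Find the pair of clusters with the smallest linkage distance.
        _, x, y = min(
            (linkage(clusters[x], clusters[y]), x, y)
            for x in range(len(clusters))
            for y in range(x + 1, len(clusters))
        )
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


# Hypothetical mean test scores (two tests) for six families.
family_scores = [
    (1.0, 1.1), (1.2, 0.9),
    (5.0, 5.2), (5.3, 4.9),
    (9.0, 9.2), (9.1, 8.8),
]
groups = complete_linkage(family_scores, k=3)
```

    With k set to 3, the toy data separate into low, intermediate, and high scoring groups, loosely mirroring the three family clusters reported above.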

  5. CORECLUSTER: A Degeneracy Based Graph Clustering Framework

    OpenAIRE

    Giatsidis, Christos; Malliaros, Fragkiskos; Thilikos, Dimitrios M.; Vazirgiannis, Michalis

    2014-01-01

    Graph clustering or community detection constitutes an important task for investigating the internal structure of graphs, with a plethora of applications in several domains. Traditional tools for graph clustering, such as spectral methods, typically suffer from high time and space complexity. In this article, we present CoreCluster, an efficient graph clustering framework based on the concept of graph degeneracy, that can be used along with any known graph clusteri...
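
    Graph degeneracy is usually computed through the k-core decomposition, which repeatedly removes a vertex of minimum remaining degree. The sketch below shows only that standard peeling procedure, not the CoreCluster framework itself, whose details are not given in the abstract; the graph and names are hypothetical.

```python
def core_numbers(adj):
    """Core number of every vertex via minimum-degree peeling.

    adj: dict mapping each vertex to a list of its neighbours
         (undirected graph, every edge listed in both directions).
    """
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        # Remove a vertex of minimum remaining degree.
        v = min(remaining, key=degree.get)
        k = max(k, degree[v])
        core[v] = k
        remaining.remove(v)
        for u in adj[v]:
            if u in remaining:
                degree[u] -= 1
    return core


# Hypothetical graph: a triangle a-b-c with a pendant vertex d on a.
graph = {
    "a": ["b", "c", "d"],
    "b": ["a", "c"],
    "c": ["a", "b"],
    "d": ["a"],
}
cores = core_numbers(graph)
degeneracy = max(cores.values())
```

    The pendant vertex has core number 1 and the triangle vertices have core number 2, so the degeneracy of this graph is 2. This version takes quadratic time because of the repeated minimum search; bucket-based implementations run in time linear in the number of edges.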

  6. Flowbca : A flow-based cluster algorithm in Stata

    NARCIS (Netherlands)

    Meekes, J.; Hassink, W.H.J.

    In this article, we introduce the Stata implementation of a flow-based cluster algorithm written in Mata. The main purpose of the flowbca command is to identify clusters based on relational data of flows. We illustrate the command by providing multiple applications, from the research fields of

  7. CBHRP: A Cluster Based Routing Protocol for Wireless Sensor Network

    OpenAIRE

    Rashed, M. G.; Kabir, M. Hasnat; Rahim, M. Sajjadur; Ullah, Sk. Enayet

    2012-01-01

    A new two-layer hierarchical routing protocol called Cluster Based Hierarchical Routing Protocol (CBHRP) is proposed in this paper. It is an extension of the LEACH routing protocol. We introduce a cluster head-set idea for cluster-based routing, where several clusters are formed with the deployed sensors to collect information from the target field. On a rotation basis, a head-set member receives data from the neighbor nodes and transmits the aggregated results to the distant base station. This protocol ...

  8. Formation of stable products from cluster-cluster collisions

    International Nuclear Information System (INIS)

    Alamanova, Denitsa; Grigoryan, Valeri G; Springborg, Michael

    2007-01-01

    The formation of stable products from copper cluster-cluster collisions is investigated by using classical molecular-dynamics simulations in combination with an embedded-atom potential. The dependence of the product clusters on impact energy, relative orientation of the clusters, and size of the clusters is studied. The structures and total energies of the product clusters are analysed and compared with those of the colliding clusters before impact. These results, together with the internal temperature, are used in obtaining an increased understanding of cluster fusion processes

  9. The Business Cluster's Distribution e-Channels

    OpenAIRE

    Milan Davidovic

    2011-01-01

    The business cluster's cooperative potential and business capability improvement depend on e-business implementation and on the dynamics of business model change in the cluster and its members, based on new and existing distribution channels, customer relationship management and supply chain integration. This work analyses the cluster's e-business models, e-commerce forms and distribution e-channels for three business cases: when cluster members are oriented on their own business, on the cooperative's project or c...

  10. Cluster development in the SA tooling industry

    Directory of Open Access Journals (Sweden)

    Von Leipzig, Konrad

    2015-11-01

    This paper explores the concept of clustering in general, analysing research and experiences in different countries and regions, and summarising factors leading to success or contributing to failure of specific cluster initiatives. Based on this, requirements for the establishment of clusters are summarised. Next, initiatives especially in the South African tool and die making (TDM) industry are considered. Through a benchmarking approach, the strengths and weaknesses of individual local tool rooms are analysed, and conclusions are drawn particularly about South African characteristics of the industry. From these results, and from structured interviews with individual tool room owners, difficulties in the establishment of a South African tooling cluster are explored, and specific areas of concern are pointed out.

  11. Reliability analysis of cluster-based ad-hoc networks

    International Nuclear Information System (INIS)

    Cook, Jason L.; Ramirez-Marquez, Jose Emmanuel

    2008-01-01

    The mobile ad-hoc wireless network (MAWN) is a new and emerging network scheme that is being employed in a variety of applications. The MAWN varies from traditional networks because it is a self-forming and dynamic network. The MAWN is free of infrastructure and, as such, only the mobile nodes comprise the network. Pairs of nodes communicate either directly or through other nodes. To do so, each node acts, in turn, as a source, destination, and relay of messages. The virtue of a MAWN is the flexibility this provides; however, the challenge for reliability analyses is also brought about by this unique feature. The variability and volatility of the MAWN configuration makes typical reliability methods (e.g. reliability block diagram) inappropriate because no single structure or configuration represents all manifestations of a MAWN. For this reason, new methods are being developed to analyze the reliability of this new networking technology. New published methods adapt to this feature by treating the configuration probabilistically or by inclusion of embedded mobility models. This paper joins both methods together and expands upon these works by modifying the problem formulation to address the reliability analysis of a cluster-based MAWN. The cluster-based MAWN is deployed in applications with constraints on networking resources such as bandwidth and energy. This paper presents the problem's formulation, a discussion of applicable reliability metrics for the MAWN, and illustration of a Monte Carlo simulation method through the analysis of several example networks
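
    A Monte Carlo estimate of two-terminal reliability, of the kind the abstract above mentions, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' formulation: each link is assumed to exist independently with a fixed probability, and the names two_terminal_reliability, connected and link_prob are hypothetical.

```python
import random


def connected(nodes, links, source, target):
    """Depth-first search: is target reachable from source over the given links?"""
    adj = {n: [] for n in nodes}
    for u, v in links:
        adj[u].append(v)
        adj[v].append(u)
    seen, stack = {source}, [source]
    while stack:
        cur = stack.pop()
        if cur == target:
            return True
        for nb in adj[cur]:
            if nb not in seen:
                seen.add(nb)
                stack.append(nb)
    return False


def two_terminal_reliability(nodes, link_prob, source, target, trials=20000, seed=0):
    """Estimate P(source can reach target) when each link is up independently.

    link_prob maps a (u, v) link to its assumed probability of existing.
    """
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    hits = 0
    for _ in range(trials):
        # Sample one network configuration, then test connectivity in it.
        up = [link for link, p in link_prob.items() if rng.random() < p]
        hits += connected(nodes, up, source, target)
    return hits / trials


# Hypothetical cluster-based topology: member m1 -> head H1 -> head H2 -> member m2.
example_nodes = ["m1", "H1", "H2", "m2"]
example_links = {("m1", "H1"): 0.9, ("H1", "H2"): 0.9, ("H2", "m2"): 0.9}
estimate = two_terminal_reliability(example_nodes, example_links, "m1", "m2")
```

    For this series path the exact value is 0.9 cubed, or 0.729, which gives a quick check on the estimate. A mobility model would replace the fixed link probabilities with probabilities derived from node positions.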

  12. Clustering structures of large proteins using multifractal analyses based on a 6-letter model and hydrophobicity scale of amino acids

    International Nuclear Information System (INIS)

    Yang Jianyi; Yu Zuguo; Anh, Vo

    2009-01-01

    The Schneider and Wrede hydrophobicity scale of amino acids and the 6-letter model of protein are proposed to study the relationship between the primary structure and the secondary structural classification of proteins. Two kinds of multifractal analyses are performed on the two measures obtained from these two kinds of data on large proteins. Nine parameters from the multifractal analyses are considered to construct the parameter spaces. Each protein is represented by one point in these spaces. A procedure is proposed to separate large proteins in the α, β, α + β and α/β structural classes in these parameter spaces. Fisher's linear discriminant algorithm is used to assess our clustering accuracy on the 49 selected large proteins. Numerical results indicate that the discriminant accuracies are satisfactory. In particular, they reach 100.00% and 84.21% in separating the α proteins from the {β, α + β, α/β} proteins in a parameter space; 92.86% and 86.96% in separating the β proteins from the {α + β, α/β} proteins in another parameter space; 91.67% and 83.33% in separating the α/β proteins from the α + β proteins in the last parameter space.

  13. APPECT: An Approximate Backbone-Based Clustering Algorithm for Tags

    DEFF Research Database (Denmark)

    Zong, Yu; Xu, Guandong; Jin, Pin

    2011-01-01

    ...Agglomerative Clustering on tagging data, which possess the inherent drawbacks, such as the sensitivity of initialization. In this paper, we instead make use of the approximate backbone of tag clustering results to find out better tag clusters. In particular, we propose an APProximate backbonE-based Clustering algorithm for Tags (APPECT). The main steps of APPECT are: (1) we execute the K-means algorithm on a tag similarity matrix for M times and collect a set of tag clustering results Z={C1,C2,…,Cm}; (2) we form the approximate backbone of Z by executing a greedy search; (3) we fix the approximate backbone as the initial tag clustering result and then assign the rest tags into the corresponding clusters based on the similarity. Experimental results on three real world datasets namely MedWorm, MovieLens and Dmoz demonstrate the effectiveness and the superiority of the proposed method against the traditional...
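
    The backbone idea in this record can be illustrated with a small sketch: given the labels produced by several clustering runs, keep the groups of tags that end up together in every run. This is not the APPECT algorithm; it swaps the greedy search named in the abstract for an exact intersection, and the function name backbone_groups and the hard-coded runs are hypothetical.

```python
from collections import defaultdict


def backbone_groups(runs, min_size=2):
    """Group items that share a cluster in every run.

    runs is a list of label lists, one per clustering run, all of equal length.
    Two items land in the same group only if their labels agree in each run,
    so the result does not depend on how a run happens to number its clusters.
    """
    groups = defaultdict(list)
    for item, signature in enumerate(zip(*runs)):
        groups[signature].append(item)
    return sorted(g for g in groups.values() if len(g) >= min_size)


# Hypothetical labels for five tags from three k-means runs.
example_runs = [
    [0, 0, 1, 1, 2],
    [1, 1, 0, 0, 0],
    [2, 2, 0, 1, 1],
]
stable = backbone_groups(example_runs)
```

    Only tags 0 and 1 stay together across all three runs here, so they form the stable core. The remaining tags would then be assigned to clusters by similarity, as step (3) of the record describes.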

  14. Analyses on the formation of atmospheric particles and stabilized sulphuric acid clusters

    Energy Technology Data Exchange (ETDEWEB)

    Paasonen, P.

    2012-11-01

    Aerosol particles have various effects on our life. They affect the visibility and have diverse health effects, but are also applied in various applications, from drug inhalators to pesticides. Additionally, aerosol particles have manifold effects on the Earth's radiation budget and thus on the climate. The strength of the aerosol climate effect is one of the factors causing major uncertainties in the global climate models predicting the future climate change. Aerosol particles are emitted to the atmosphere from various anthropogenic and biogenic sources, but they are also formed from precursor vapours in many parts of the world in a process called atmospheric new particle formation (NPF). The uncertainties in aerosol climate effect are partly due to the current lack of knowledge of the mechanisms governing the atmospheric NPF. It is known that gas phase sulphuric acid most certainly plays an important role in atmospheric NPF. Other vapours are also needed in NPF, but the exact roles or even identities of these vapours are currently not exactly known. In this thesis I present some of the recent advancements in understanding of the atmospheric NPF in terms of the roles of the participating vapours and the meteorological conditions. Since direct measurements of new particle formation rate in the initial size scale of the formed particles (below 2 nm) are so far infrequent in both spatial and temporal scales, indirect methods are needed. The work presented on the following pages approaches the NPF from two directions: by analysing the observed formation rates of particles after they have grown to sizes measurable with widely applied instruments (2 nm or larger), and by measuring and modelling the initial sulphuric acid cluster formation. The obtained results can be summarized as follows. (1) The observed atmospheric new particle formation rates are typically connected with sulphuric acid concentration to the power close to two. (2) Also other compounds, most

  15. Cost/Performance Ratio Achieved by Using a Commodity-Based Cluster

    Science.gov (United States)

    Lopez, Isaac

    2001-01-01

    Researchers at the NASA Glenn Research Center acquired a commodity cluster based on Intel Corporation processors to compare its performance with a traditional UNIX cluster in the execution of aeropropulsion applications. Since the cost differential of the clusters was significant, a cost/performance ratio was calculated. After executing a propulsion application on both clusters, the researchers demonstrated a 9.4 cost/performance ratio in favor of the Intel-based cluster. These researchers utilize the Aeroshark cluster as one of the primary testbeds for developing NPSS parallel application codes and system software. The Aeroshark cluster provides 64 Intel Pentium II 400-MHz processors, housed in 32 nodes. Recently, APNASA, a code developed by a Government/industry team for the design and analysis of turbomachinery systems, was used for a simulation on Glenn's Aeroshark cluster.

  16. Centroid based clustering of high throughput sequencing reads based on n-mer counts.

    Science.gov (United States)

    Solovyov, Alexander; Lipkin, W Ian

    2013-09-08

    Many problems in computational biology require alignment-free sequence comparisons. One of the common tasks involving sequence comparison is sequence clustering. Here we apply methods of alignment-free comparison (in particular, comparison using sequence composition) to the challenge of sequence clustering. We study several centroid based algorithms for clustering sequences based on word counts. Study of their performance shows that using the k-means algorithm with or without data whitening is efficient from the computational point of view. A higher clustering accuracy can be achieved using the soft expectation maximization method, whereby each sequence is attributed to each cluster with a specific probability. We implement an open source tool for alignment-free clustering. It is publicly available from GitHub: https://github.com/luscinius/afcluster. We show the utility of alignment-free sequence clustering for high throughput sequencing analysis despite its limitations. In particular, it allows one to perform assembly with reduced resources and a minimal loss of quality. The major factor affecting performance of alignment-free read clustering is the length of the read.
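
    The word-count representation behind this kind of alignment-free clustering can be sketched in a few lines. The code below is an illustration only and does not reproduce the afcluster tool: the function names, the DNA alphabet, the word length and the squared Euclidean distance are all assumptions chosen to keep the example small.

```python
from collections import Counter
from itertools import product


def nmer_vector(seq, n=2, alphabet="ACGT"):
    """Return normalised n-mer frequencies of seq, in a fixed word order.

    Windows containing characters outside the alphabet (for example N)
    are not counted.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + n] for i in range(len(seq) - n + 1))
    words = ["".join(p) for p in product(alphabet, repeat=n)]
    total = sum(counts[w] for w in words)
    return [counts[w] / total if total else 0.0 for w in words]


def squared_distance(u, v):
    """Squared Euclidean distance between two frequency vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def nearest_centroid(vec, centroids):
    """Index of the centroid closest to vec (the assignment step of k-means)."""
    return min(range(len(centroids)), key=lambda i: squared_distance(vec, centroids[i]))


# Hypothetical centroids built from two very different compositions.
example_centroids = [nmer_vector("AAAAAAAA"), nmer_vector("GCGCGCGC")]
example_cluster = nearest_centroid(nmer_vector("AAAATAAA"), example_centroids)
```

    A full k-means loop would alternate this assignment step with recomputing each centroid as the mean of its assigned vectors.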

  17. Parameterization and Observability Analysis of Scalable Battery Clusters for Onboard Thermal Management

    Directory of Open Access Journals (Sweden)

    Lin Xinfan

    2013-03-01

    online parameterization and an adaptive observer are designed for a cylindrical battery. The single-cell thermal model is then scaled up to create a battery cluster model in order to study the temperature pattern of the cluster. The modeled thermal interconnections between cells include cell-to-cell heat conduction and convection to the surrounding coolant flow. An observability analysis is performed on the cluster before a closed-loop observer is designed for the pack. Based on the analysis, guidelines are derived for determining the minimum number of required sensors and their exact locations, so as to ensure the observability of all thermal states.

  18. Defining objective clusters for rabies virus sequences using affinity propagation clustering.

    Directory of Open Access Journals (Sweden)

    Susanne Fischer

    2018-01-01

    Rabies is caused by lyssaviruses, and is one of the oldest known zoonoses. In recent years, more than 21,000 nucleotide sequences of rabies viruses (RABV), from the prototype species rabies lyssavirus, have been deposited in public databases. Subsequent phylogenetic analyses in combination with metadata suggest geographic distributions of RABV. However, these analyses experience technical difficulties in defining verifiable criteria for cluster allocations in phylogenetic trees, inviting a more rational approach. Therefore, we applied a relatively new mathematical clustering algorithm named 'affinity propagation clustering' (AP) to propose a standardized sub-species classification utilizing full-genome RABV sequences. Because AP has the advantage that it is computationally fast and works for any meaningful measure of similarity between data samples, it has previously been applied successfully in bioinformatics, for analysis of microarray and gene expression data; however, cluster analysis of sequences is still in its infancy. Existing (516) and original (46) full genome RABV sequences were used to demonstrate the application of AP for RABV clustering. On a global scale, AP proposed four clusters, i.e. New World cluster, Arctic/Arctic-like, Cosmopolitan, and Asian as previously assigned by phylogenetic studies. By combining AP with established phylogenetic analyses, it is possible to resolve phylogenetic relationships between verifiably determined clusters and sequences. This workflow will be useful in confirming cluster distributions in a uniform transparent manner, not only for RABV, but also for other comparative sequence analyses.

  19. Cluster-based spectrum sensing for cognitive radios with imperfect channel to cluster-head

    KAUST Repository

    Ben Ghorbel, Mahdi

    2012-04-01

    Spectrum sensing is considered as the first and main step for cognitive radio systems to achieve an efficient use of spectrum. Cooperation and clustering among cognitive radio users are two techniques that can be employed with spectrum sensing in order to improve the sensing performance by reducing miss-detection and false alarm. In this paper, within the framework of a clustering-based cooperative spectrum sensing scheme, we study the effect of errors in transmitting the local decisions from the secondary users to the cluster heads (or the fusion center), while considering non-identical channel conditions between the secondary users. Closed-form expressions for the global probabilities of detection and false alarm at the cluster head are derived. © 2012 IEEE.

  20. Cluster-based spectrum sensing for cognitive radios with imperfect channel to cluster-head

    KAUST Repository

    Ben Ghorbel, Mahdi; Nam, Haewoon; Alouini, Mohamed-Slim

    2012-01-01

    Spectrum sensing is considered as the first and main step for cognitive radio systems to achieve an efficient use of spectrum. Cooperation and clustering among cognitive radio users are two techniques that can be employed with spectrum sensing in order to improve the sensing performance by reducing miss-detection and false alarm. In this paper, within the framework of a clustering-based cooperative spectrum sensing scheme, we study the effect of errors in transmitting the local decisions from the secondary users to the cluster heads (or the fusion center), while considering non-identical channel conditions between the secondary users. Closed-form expressions for the global probabilities of detection and false alarm at the cluster head are derived. © 2012 IEEE.
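
    The effect of an imperfect reporting channel can be illustrated with a simple model. The abstract does not give the derived closed-form expressions or name the fusion rule, so the sketch below is not the authors' result: it assumes an OR fusion rule at the cluster head and a binary symmetric reporting channel with a per-user bit error probability, and the function names are hypothetical.

```python
from math import prod


def received_prob(p, pe):
    """Probability the cluster head reads a 1 from one user.

    p is the probability the user decides 1 locally (detection or false alarm),
    pe is the assumed bit-flip probability on that user's reporting channel.
    """
    return p * (1 - pe) + (1 - p) * pe


def or_fusion(probs, errors):
    """Global probability under an OR rule: at least one received report is 1."""
    return 1 - prod(1 - received_prob(p, e) for p, e in zip(probs, errors))


# Hypothetical two-user cluster with non-identical detection probabilities.
q_detect_ideal = or_fusion([0.9, 0.8], [0.0, 0.0])     # error-free reporting
q_detect_noisy = or_fusion([0.9, 0.8], [0.1, 0.1])     # 10% reporting errors
q_false_noisy = or_fusion([0.05, 0.05], [0.1, 0.1])    # false alarm, same channel
```

    In this toy setting the reporting errors lower the global detection probability slightly but raise the global false alarm probability considerably, since a flipped 0 is enough to trigger the OR rule.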

  1. Graph-based clustering and data visualization algorithms

    CERN Document Server

    Vathy-Fogarassy, Ágnes

    2013-01-01

    This work presents a data visualization technique that combines graph-based topology representation and dimensionality reduction methods to visualize the intrinsic data structure in a low-dimensional vector space. The application of graphs in clustering and visualization has several advantages. A graph of important edges (where edges characterize relations and weights represent similarities or distances) provides a compact representation of the entire complex data set. This text describes clustering and visualization methods that are able to utilize information hidden in these graphs, based on

  2. Energy Aware Cluster-Based Routing in Flying Ad-Hoc Networks.

    Science.gov (United States)

    Aadil, Farhan; Raza, Ali; Khan, Muhammad Fahad; Maqsood, Muazzam; Mehmood, Irfan; Rho, Seungmin

    2018-05-03

    Flying ad-hoc networks (FANETs) are a very vibrant research area nowadays. They have many military and civil applications. Limited battery energy and the high mobility of micro unmanned aerial vehicles (UAVs) represent their two main problems, i.e., short flight time and inefficient routing. In this paper, we try to address both of these problems by means of efficient clustering. First, we adjust the transmission power of the UAVs by anticipating their operational requirements. Optimal transmission range will have minimum packet loss ratio (PLR) and better link quality, which ultimately save the energy consumed during communication. Second, we use a variant of the K-Means Density clustering algorithm for selection of cluster heads. Optimal cluster heads enhance the cluster lifetime and reduce the routing overhead. The proposed model outperforms state-of-the-art artificial intelligence techniques such as the Ant Colony Optimization-based clustering algorithm and the Grey Wolf Optimization-based clustering algorithm. The performance of the proposed algorithm is evaluated in terms of number of clusters, cluster building time, cluster lifetime and energy consumption.

  3. Energy Aware Cluster-Based Routing in Flying Ad-Hoc Networks

    Directory of Open Access Journals (Sweden)

    Farhan Aadil

    2018-05-01

    Flying ad-hoc networks (FANETs) are a very vibrant research area nowadays. They have many military and civil applications. Limited battery energy and the high mobility of micro unmanned aerial vehicles (UAVs) represent their two main problems, i.e., short flight time and inefficient routing. In this paper, we try to address both of these problems by means of efficient clustering. First, we adjust the transmission power of the UAVs by anticipating their operational requirements. Optimal transmission range will have minimum packet loss ratio (PLR) and better link quality, which ultimately save the energy consumed during communication. Second, we use a variant of the K-Means Density clustering algorithm for selection of cluster heads. Optimal cluster heads enhance the cluster lifetime and reduce the routing overhead. The proposed model outperforms state-of-the-art artificial intelligence techniques such as the Ant Colony Optimization-based clustering algorithm and the Grey Wolf Optimization-based clustering algorithm. The performance of the proposed algorithm is evaluated in terms of number of clusters, cluster building time, cluster lifetime and energy consumption.

  4. An incremental DPMM-based method for trajectory clustering, modeling, and retrieval.

    Science.gov (United States)

    Hu, Weiming; Li, Xi; Tian, Guodong; Maybank, Stephen; Zhang, Zhongfei

    2013-05-01

    Trajectory analysis is the basis for many applications, such as indexing of motion events in videos, activity recognition, and surveillance. In this paper, the Dirichlet process mixture model (DPMM) is applied to trajectory clustering, modeling, and retrieval. We propose an incremental version of a DPMM-based clustering algorithm and apply it to cluster trajectories. An appropriate number of trajectory clusters is determined automatically. When trajectories belonging to new clusters arrive, the new clusters can be identified online and added to the model without any retraining using the previous data. A time-sensitive Dirichlet process mixture model (tDPMM) is applied to each trajectory cluster for learning the trajectory pattern which represents the time-series characteristics of the trajectories in the cluster. Then, a parameterized index is constructed for each cluster. A novel likelihood estimation algorithm for the tDPMM is proposed, and a trajectory-based video retrieval model is developed. The tDPMM-based probabilistic matching method and the DPMM-based model growing method are combined to make the retrieval model scalable and adaptable. Experimental comparisons with state-of-the-art algorithms demonstrate the effectiveness of our algorithm.

  5. Nonuniform Sparse Data Clustering Cascade Algorithm Based on Dynamic Cumulative Entropy

    Directory of Open Access Journals (Sweden)

    Ning Li

    2016-01-01

    A small amount of prior knowledge and randomly chosen initial cluster centers have a direct impact on the accuracy of iterative clustering algorithms. In this paper we propose a new algorithm that computes the initial cluster centers for k-means clustering and the best number of clusters with little prior knowledge, and optimizes the clustering result. It constructs the Euclidean distance control factor based on aggregation density sparse degree to select the initial cluster center of nonuniform sparse data and obtains initial data clusters by multidimensional diffusion density distribution. A multiobjective clustering approach based on dynamic cumulative entropy is adopted to optimize the initial data clusters and the best number of clusters. The experimental results show that the newly proposed algorithm performs well in obtaining the initial cluster centers for the k-means algorithm and effectively improves the clustering accuracy of nonuniform sparse data by about 5%.

  6. Density-Based Clustering with Geographical Background Constraints Using a Semantic Expression Model

    Directory of Open Access Journals (Sweden)

    Qingyun Du

    2016-05-01

    A semantics-based method for density-based clustering with constraints imposed by geographical background knowledge is proposed. In this paper, we apply an ontological approach to the DBSCAN (Density-Based Geospatial Clustering of Applications with Noise) algorithm in the form of knowledge representation for constraint clustering. When used in the process of clustering geographic information, semantic reasoning based on a defined ontology and its relationships is primarily intended to overcome the lack of knowledge of the relevant geospatial data. Better constraints on the geographical knowledge yield more reasonable clustering results. This article uses an ontology to describe the four types of semantic constraints for geographical backgrounds: “No Constraints”, “Constraints”, “Cannot-Link Constraints”, and “Must-Link Constraints”. This paper also reports the implementation of a prototype clustering program. Based on the proposed approach, DBSCAN can be applied with both obstacle and non-obstacle constraints as a semi-supervised clustering algorithm and the clustering results are displayed on a digital map.

  7. Ontology-based topic clustering for online discussion data

    Science.gov (United States)

    Wang, Yongheng; Cao, Kening; Zhang, Xiaoming

    2013-03-01

    With the rapid development of online communities, mining and extracting quality knowledge from online discussions becomes very important for the industrial and marketing sector, as well as for e-commerce applications and government. Most of the existing techniques model a discussion as a social network of users represented by a user-based graph without considering the content of the discussion. In this paper we propose a new multilayered model to analyze online discussions. The user-based and message-based representations are combined in this model. A novel clustering method based on frequent concept sets is used to cluster the original online discussion network into topic space. A domain ontology is used to improve the clustering accuracy. Parallel methods are also used to make the algorithms scalable to very large data sets. Our experimental study shows that the model and algorithms are effective when analyzing large scale online discussion data.

  8. Result Diversification Based on Query-Specific Cluster Ranking

    NARCIS (Netherlands)

    J. He (Jiyin); E. Meij; M. de Rijke (Maarten)

    2011-01-01

    Result diversification is a retrieval strategy for dealing with ambiguous or multi-faceted queries by providing documents that cover as many facets of the query as possible. We propose a result diversification framework based on query-specific clustering and cluster ranking,

  9. Microgrids Real-Time Pricing Based on Clustering Techniques

    Directory of Open Access Journals (Sweden)

    Hao Liu

    2018-05-01

    Microgrids are spreading widely in electricity markets worldwide. Besides the security and reliability concerns for these microgrids, their operators need to address consumers’ pricing. Considering the growth of smart grids and smart meter facilities, it is expected that microgrids will have some level of flexibility to determine real-time pricing for at least some consumers. As such, the key challenge is finding an optimal pricing model for consumers. This paper, accordingly, proposes a new pricing scheme in which microgrids are able to deploy clustering techniques in order to understand their consumers’ load profiles and then assign real-time prices based on their load profile patterns. An improved weighted fuzzy average k-means is proposed to cluster the load curves of consumers into an optimal number of clusters, through which the load profile of each cluster is determined. Having obtained the load profile of each cluster, real-time prices are given to each cluster, which is the best price given to all consumers in that cluster.

  10. Likelihood-based inference for clustered line transect data

    DEFF Research Database (Denmark)

    Waagepetersen, Rasmus Plenge; Schweder, Tore

    The uncertainty in estimation of spatial animal density from line transect surveys depends on the degree of spatial clustering in the animal population. To quantify the clustering we model line transect data as independent thinnings of spatial shot-noise Cox processes. Likelihood-based inference...

  11. Result diversification based on query-specific cluster ranking

    NARCIS (Netherlands)

    He, J.; Meij, E.; de Rijke, M.

    2011-01-01

    Result diversification is a retrieval strategy for dealing with ambiguous or multi-faceted queries by providing documents that cover as many facets of the query as possible. We propose a result diversification framework based on query-specific clustering and cluster ranking, in which diversification

  12. Likelihood-based inference for clustered line transect data

    DEFF Research Database (Denmark)

    Waagepetersen, Rasmus; Schweder, Tore

    2006-01-01

    The uncertainty in estimation of spatial animal density from line transect surveys depends on the degree of spatial clustering in the animal population. To quantify the clustering we model line transect data as independent thinnings of spatial shot-noise Cox processes. Likelihood-based inference...

  13. Analyzing Dynamic Probabilistic Risk Assessment Data through Topology-Based Clustering

    Energy Technology Data Exchange (ETDEWEB)

    Diego Mandelli; Dan Maljovec; Bei Wang; Valerio Pascucci; Peer-Timo Bremer

    2013-09-01

    We investigate the use of a topology-based clustering technique on the data generated by dynamic event tree methodologies. The clustering technique we use is a domain-partitioning algorithm based on topological structures known as the Morse-Smale complex, which partitions the data points into clusters based on their uniform gradient flow behavior. We perform both end state analysis and transient analysis to classify the set of nuclear scenarios. We demonstrate our methodology on a dataset generated for a sodium-cooled fast reactor during an aircraft crash scenario. The simulation tracks the temperature of the reactor as well as the time for a recovery team to fix the passive cooling system. Combined with clustering results obtained previously through mean shift methodology, we present the user with complementary views of the data that help illuminate key features that may be otherwise hidden using a single methodology. By clustering the data, the number of relevant test cases to be selected for further analysis can be drastically reduced by selecting a representative from each cluster. Identifying the similarities of simulations within a cluster can also aid in the drawing of important conclusions with respect to safety analysis.

  14. Fuzzy Weight Cluster-Based Routing Algorithm for Wireless Sensor Networks

    Directory of Open Access Journals (Sweden)

    Teng Gao

    2015-01-01

    Cluster-based protocols are an important class of routing protocols in wireless sensor networks. However, due to the uneven distribution of cluster heads in classical clustering algorithms, some nodes may run out of energy too early, which makes these algorithms unsuitable for large-scale wireless sensor networks. In this paper, a distributed clustering algorithm based on fuzzy weighted attributes is put forward to ensure both energy efficiency and extensibility. On the premise of a comprehensive consideration of all attributes, the corresponding weight of each parameter is assigned by using the direct method of fuzzy engineering theory. Then, each node works out its property value. These property values are mapped to the time axis, and a timer triggers the broadcast of cluster heads. At the same time, the radio coverage method is adopted in order to avoid collisions and to ensure the symmetrical distribution of cluster heads. The aggregated data are forwarded to the sink node over multiple hops. The simulation results demonstrate that the clustering algorithm based on fuzzy weighted attributes has a longer life expectancy and better extensibility than LEACH-like algorithms.
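
    The timer-driven election described in this record can be sketched roughly as follows. This is an illustration under stated assumptions rather than the published algorithm: the property value is taken to be a plain weighted sum of attributes scaled to the range 0 to 1, the timer is assumed to be shorter for higher values, and the names elect_heads, weights and radius are hypothetical.

```python
from math import dist


def elect_heads(nodes, weights, radius, t_max=1.0):
    """Pick cluster heads in order of timer expiry.

    nodes maps a name to {"pos": (x, y), "attrs": [values in 0..1]}.
    A node whose timer fires becomes a head unless an earlier head is
    already within its radio coverage radius.
    """
    def score(name):
        return sum(w * a for w, a in zip(weights, nodes[name]["attrs"]))

    # Higher property value means a shorter timer, so that node fires first.
    firing_order = sorted(nodes, key=lambda n: (t_max * (1 - score(n)), n))
    heads = []
    for name in firing_order:
        covered = any(dist(nodes[name]["pos"], nodes[h]["pos"]) <= radius for h in heads)
        if not covered:
            heads.append(name)
    return heads


# Hypothetical network: two groups of two nodes, attributes = [energy, connectivity].
example_nodes = {
    "a": {"pos": (0, 0), "attrs": [0.9, 0.5]},
    "b": {"pos": (1, 0), "attrs": [0.5, 0.5]},
    "c": {"pos": (10, 0), "attrs": [0.2, 0.9]},
    "d": {"pos": (11, 0), "attrs": [0.8, 0.1]},
}
example_heads = elect_heads(example_nodes, weights=[0.6, 0.4], radius=3)
```

    Suppressing nodes that lie within the coverage radius of an existing head is what spreads the heads evenly over the field in this sketch.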

  15. Integration K-Means Clustering Method and Elbow Method For Identification of The Best Customer Profile Cluster

    Science.gov (United States)

    Syakur, M. A.; Khotimah, B. K.; Rochman, E. M. S.; Satoto, B. D.

    2018-04-01

    Clustering is a data mining technique used to analyse large and varied data sets. It groups data into clusters so that the members of each cluster are as similar as possible to one another and different from the members of other clusters. SMEs in Indonesia have a variety of customers, but they have no mapping of these customers and so do not know which customers are loyal. Customer mapping groups customer profiles to support SME analysis and policy in the production of goods, especially batik sales. The researchers combine the K-Means method with the elbow method to make k-means more efficient and effective when processing large amounts of data. K-Means clustering is a local optimization method that is sensitive to the choice of the initial cluster centres, so a poor choice of initial centres leads to high errors and poor cluster results. The K-means algorithm also has difficulty determining the best number of clusters, so the elbow method is used to find the best number of clusters for K-means. According to the results, the elbow method can produce the same number of clusters K for different amounts of data. The number of clusters determined with the elbow method becomes the default for the characterization process in the case study. The k-means measurements yield the best clusters based on SSE values on 500 clusters of batik visitors. The results show that the sharpest decrease occurs at K = 3, so this K is taken as the cut-off point for the best clustering.
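
    The elbow method mentioned in this record can be sketched as follows: run k-means for several values of K, record the sum of squared errors (SSE), and pick the K where the curve bends most sharply. The code is a minimal illustration, not the authors' implementation; the farthest-first seeding, the second-difference elbow criterion, the function names and the toy points are all assumptions.

```python
def sq(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_sse(points, k, iters=50):
    """Run Lloyd's k-means and return the final SSE.

    Seeding is deterministic farthest-first, an assumption made here so that
    the example is reproducible.
    """
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq(p, c) for c in centers)))
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for p in points:
            idx = min(range(k), key=lambda i: sq(p, centers[i]))
            clusters[idx].append(p)
        # Recompute each centre as the mean of its members (keep it if empty).
        new = [
            tuple(sum(c) / len(m) for c in zip(*m)) if m else centers[i]
            for i, m in enumerate(clusters)
        ]
        if new == centers:
            break
        centers = new
    return sum(min(sq(p, c) for c in centers) for p in points)


def elbow(sse):
    """Given {k: SSE} for consecutive k, return the k with the sharpest bend."""
    ks = sorted(sse)
    return max(ks[1:-1], key=lambda k: (sse[k - 1] - sse[k]) - (sse[k] - sse[k + 1]))


# Hypothetical data: three tight groups of three points each.
example_points = [(0, 0), (0, 1), (1, 0), (10, 0), (10, 1), (11, 0), (0, 10), (0, 11), (1, 10)]
example_sse = {k: kmeans_sse(example_points, k) for k in range(1, 6)}
best_k = elbow(example_sse)
```

    On these toy points the SSE drops steeply up to K = 3 and flattens afterwards, so the second-difference criterion selects 3. Real data rarely bends this cleanly, and the elbow is then usually read off a plot.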

  16. Toeplitz Inverse Covariance-Based Clustering of Multivariate Time Series Data

    Science.gov (United States)

    Hallac, David; Vare, Sagar; Boyd, Stephen; Leskovec, Jure

    2018-01-01

    Subsequence clustering of multivariate time series is a useful tool for discovering repeated patterns in temporal data. Once these patterns have been discovered, seemingly complicated datasets can be interpreted as a temporal sequence of only a small number of states, or clusters. For example, raw sensor data from a fitness-tracking application can be expressed as a timeline of a select few actions (i.e., walking, sitting, running). However, discovering these patterns is challenging because it requires simultaneous segmentation and clustering of the time series. Furthermore, interpreting the resulting clusters is difficult, especially when the data is high-dimensional. Here we propose a new method of model-based clustering, which we call Toeplitz Inverse Covariance-based Clustering (TICC). Each cluster in the TICC method is defined by a correlation network, or Markov random field (MRF), characterizing the interdependencies between different observations in a typical subsequence of that cluster. Based on this graphical representation, TICC simultaneously segments and clusters the time series data. We solve the TICC problem through alternating minimization, using a variation of the expectation maximization (EM) algorithm. We derive closed-form solutions to efficiently solve the two resulting subproblems in a scalable way, through dynamic programming and the alternating direction method of multipliers (ADMM), respectively. We validate our approach by comparing TICC to several state-of-the-art baselines in a series of synthetic experiments, and we then demonstrate on an automobile sensor dataset how TICC can be used to learn interpretable clusters in real-world scenarios. PMID:29770257

  17. Fatigue Feature Extraction Analysis based on a K-Means Clustering Approach

    Directory of Open Access Journals (Sweden)

    M.F.M. Yunoh

    2015-06-01

    This paper focuses on clustering analysis using a K-means approach for fatigue feature dataset extraction. The aim of this study is to group the dataset as closely as possible (homogeneity) for the scattered dataset. Kurtosis, the wavelet-based energy coefficient and fatigue damage are calculated for all segments after the extraction process using the wavelet transform, and are used as input data for the K-means clustering approach. K-means clustering calculates the average distance of each group from the centroid and gives the objective function values. Based on the results, the maximum value of the objective function, 11.58, is seen with two centroid clusters. The minimum objective function value, 8.06, is found with five centroid clusters. Because the objective function is lowest when the number of clusters equals five, this is taken as the best clustering for the dataset.

  18. Operational Numerical Weather Prediction systems based on Linux cluster architectures

    International Nuclear Information System (INIS)

    Pasqui, M.; Baldi, M.; Gozzini, B.; Maracchi, G.; Giuliani, G.; Montagnani, S.

    2005-01-01

    Progress in weather forecasting and atmospheric science has always been closely linked to the improvement of computing technology. In order to have more accurate weather forecasts and climate predictions, more powerful computing resources are needed, in addition to more complex and better-performing numerical models. To meet such large computing demands, powerful workstations or massive parallel systems have been used. In the last few years, parallel architectures based on the Linux operating system have been introduced and have become popular, representing real high performance-low cost systems. In this work the Linux cluster experience achieved at the Laboratory for Meteorology and Environmental Analysis (LaMMA-CNR-IBIMET) is described, and tips and performance are analysed.

  19. Splitting Strip Detector Clusters in Dense Environments

    CERN Document Server

    Nachman, Benjamin Philip; The ATLAS collaboration

    2018-01-01

    Tracking in high density environments, particularly in high energy jets, plays an important role in many physics analyses at the LHC. In such environments, there is significant degradation of track reconstruction performance. Between runs 1 and 2, ATLAS implemented an algorithm that splits pixel clusters originating from multiple charged particles, using charge information, resulting in the recovery of much of the lost efficiency. However, no attempt was made in prior work to split merged clusters in the Semi Conductor Tracker (SCT), which does not measure charge information. In spite of the lack of charge information in SCT, a cluster-splitting algorithm has been developed in this work. It is based primarily on the difference between the observed cluster width and the expected cluster width, which is derived from track incidence angle. The performance of this algorithm is found to be competitive with the existing pixel cluster splitting based on track information.

  20. Investigating the usefulness of a cluster-based trend analysis to detect visual field progression in patients with open-angle glaucoma.

    Science.gov (United States)

    Aoki, Shuichiro; Murata, Hiroshi; Fujino, Yuri; Matsuura, Masato; Miki, Atsuya; Tanito, Masaki; Mizoue, Shiro; Mori, Kazuhiko; Suzuki, Katsuyoshi; Yamashita, Takehiro; Kashiwagi, Kenji; Hirasawa, Kazunori; Shoji, Nobuyuki; Asaoka, Ryo

    2017-12-01

    To investigate the usefulness of the Octopus (Haag-Streit) EyeSuite's cluster trend analysis in glaucoma. Ten visual fields (VFs) with the Humphrey Field Analyzer (Carl Zeiss Meditec), spanning 7.7 years on average were obtained from 728 eyes of 475 primary open angle glaucoma patients. Mean total deviation (mTD) trend analysis and EyeSuite's cluster trend analysis were performed on various series of VFs (from 1st to 10th: VF1-10 to 6th to 10th: VF6-10). The results of the cluster-based trend analysis, based on different lengths of VF series, were compared against mTD trend analysis. Cluster-based trend analysis and mTD trend analysis results were significantly associated in all clusters and with all lengths of VF series. Between 21.2% and 45.9% (depending on VF series length and location) of clusters were deemed to progress when the mTD trend analysis suggested no progression. On the other hand, 4.8% of eyes were observed to progress using the mTD trend analysis when cluster trend analysis suggested no progression in any two (or more) clusters. Whole field trend analysis can miss local VF progression. Cluster trend analysis appears as robust as mTD trend analysis and useful to assess both sectorial and whole field progression. Cluster-based trend analyses, in particular the definition of two or more progressing cluster, may help clinicians to detect glaucomatous progression in a timelier manner than using a whole field trend analysis, without significantly compromising specificity.

  1. Clustering-based classification of road traffic accidents using hierarchical clustering and artificial neural networks.

    Science.gov (United States)

    Taamneh, Madhar; Taamneh, Salah; Alkheder, Sharaf

    2017-09-01

    Artificial neural networks (ANNs) have been widely used in predicting the severity of road traffic crashes. All available information about previously occurred accidents is typically used for building a single prediction model (i.e., classifier). Too little attention has been paid to the differences between these accidents, which in most cases leads to less accurate predictors. Hierarchical clustering is a well-known clustering method that seeks to group data by creating a hierarchy of clusters. Using hierarchical clustering and ANNs, a clustering-based classification approach for predicting the injury severity of road traffic accidents was proposed. About 6000 road accidents that occurred over a six-year period from 2008 to 2013 in Abu Dhabi were used throughout this study. In order to reduce the amount of variation in the data, hierarchical clustering was applied to the data set to organize it into six different forms, each with a different number of clusters (i.e., from 1 to 6 clusters). Two ANN models were subsequently built for each cluster of accidents in each generated form. The first model was built and validated using all accidents (training set), whereas only 66% of the accidents were used to build the second model, and the remaining 34% were used to test it (percentage split). Finally, the weighted average accuracy was computed for each type of model in each form of data. The results show that when testing the models using the training set, clustering prior to classification achieves 11%-16% more accuracy than without clustering, while the percentage split achieves 2%-5% more accuracy. The results also suggest that partitioning the accidents into six clusters achieves the best accuracy if both types of models are taken into account.
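
    The abstract does not say how the weighted average accuracy is weighted. A minimal sketch, assuming each per-cluster model's accuracy is weighted by the number of accidents in its cluster, could look as follows; the function name and the numbers are hypothetical and are not taken from the study.

```python
def weighted_average_accuracy(cluster_sizes, cluster_accuracies):
    """Average per-cluster accuracies, weighting each by its cluster size.

    Assumption: weights are cluster sizes; the abstract does not state this.
    """
    total = sum(cluster_sizes)
    return sum(n * a for n, a in zip(cluster_sizes, cluster_accuracies)) / total


# Hypothetical example: two clusters with 100 and 300 accidents.
overall = weighted_average_accuracy([100, 300], [0.9, 0.7])
```

    With these made-up inputs the larger cluster dominates, so the overall figure sits closer to 0.7 than to 0.9, which is why a size-weighted average is a natural way to compare clustered and unclustered models on equal footing.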

  2. Cluster Based Hierarchical Routing Protocol for Wireless Sensor Network

    OpenAIRE

    Rashed, Md. Golam; Kabir, M. Hasnat; Rahim, Muhammad Sajjadur; Ullah, Shaikh Enayet

    2012-01-01

    The efficient use of the energy source in a sensor node is the most desirable criterion for prolonging the lifetime of a wireless sensor network. In this paper, we propose a two-layer hierarchical routing protocol called Cluster Based Hierarchical Routing Protocol (CBHRP). We introduce a new concept called the head-set, which consists of one active cluster head and some other associate cluster heads within a cluster. The head-set members are responsible for control and management of the network. Results show that t...

  3. Cluster analysis of European Y-chromosomal STR haplotypes using the discrete Laplace method

    DEFF Research Database (Denmark)

    Andersen, Mikkel Meyer; Eriksen, Poul Svante; Morling, Niels

    2014-01-01

    The European Y-chromosomal short tandem repeat (STR) haplotype distribution has previously been analysed in various ways. Here, we introduce a new way of analysing population substructure using a new method based on clustering within the discrete Laplace exponential family that models the probability distribution of the Y-STR haplotypes. Creating a consistent statistical model of the haplotypes enables us to perform a wide range of analyses. Previously, haplotype frequency estimation using the discrete Laplace method has been validated. In this paper we investigate how the discrete Laplace method can be used for cluster analysis to further validate the discrete Laplace method. A very important practical fact is that the calculations can be performed on a normal computer. We identified two sub-clusters of the Eastern and Western European Y-STR haplotypes similar to results of previous...

  4. Analyzing the factors affecting network lifetime cluster-based wireless sensor network

    International Nuclear Information System (INIS)

    Malik, A.S.; Qureshi, A.

    2010-01-01

    Cluster-based wireless sensor networks enable the efficient utilization of the limited energy resources of the deployed sensor nodes and hence prolong the node as well as network lifetime. Low Energy Adaptive Clustering Hierarchy (LEACH) is one of the most promising clustering protocols proposed for wireless sensor networks. This paper provides the energy utilization and lifetime analysis for cluster-based wireless sensor networks based upon the LEACH protocol. Simulation results identify some important factors that induce unbalanced energy utilization between the sensor nodes and hence affect the network lifetime in these types of networks. These results highlight the need for a standardized, adaptive and distributed clustering technique that can increase the network lifetime by further balancing the energy utilization among sensor nodes. (author)

  5. A Cluster-Based Dual-Adaptive Topology Control Approach in Wireless Sensor Networks.

    Science.gov (United States)

    Gui, Jinsong; Zhou, Kai; Xiong, Naixue

    2016-09-25

    Multi-Input Multi-Output (MIMO) can improve wireless network performance. Sensors are usually single-antenna devices due to the high hardware complexity and cost, so several sensors are used to form virtual MIMO array, which is a desirable approach to efficiently take advantage of MIMO gains. Also, in large Wireless Sensor Networks (WSNs), clustering can improve the network scalability, which is an effective topology control approach. The existing virtual MIMO-based clustering schemes do not either fully explore the benefits of MIMO or adaptively determine the clustering ranges. Also, clustering mechanism needs to be further improved to enhance the cluster structure life. In this paper, we propose an improved clustering scheme for virtual MIMO-based topology construction (ICV-MIMO), which can determine adaptively not only the inter-cluster transmission modes but also the clustering ranges. Through the rational division of cluster head function and the optimization of cluster head selection criteria and information exchange process, the ICV-MIMO scheme effectively reduces the network energy consumption and improves the lifetime of the cluster structure when compared with the existing typical virtual MIMO-based scheme. Moreover, the message overhead and time complexity are still in the same order of magnitude.

  6. A fast density-based clustering algorithm for real-time Internet of Things stream.

    Science.gov (United States)

    Amini, Amineh; Saboohi, Hadi; Wah, Teh Ying; Herawan, Tutut

    2014-01-01

    Data streams are continuously generated over time from Internet of Things (IoT) devices. The faster all of this data is analyzed, its hidden trends and patterns discovered, and new strategies created, the faster action can be taken, creating greater value for organizations. Density-based method is a prominent class in clustering data streams. It has the ability to detect arbitrary shape clusters, to handle outlier, and it does not need the number of clusters in advance. Therefore, density-based clustering algorithm is a proper choice for clustering IoT streams. Recently, several density-based algorithms have been proposed for clustering data streams. However, density-based clustering in limited time is still a challenging issue. In this paper, we propose a density-based clustering algorithm for IoT streams. The method has fast processing time to be applicable in real-time application of IoT devices. Experimental results show that the proposed approach obtains high quality results with low computation time on real and synthetic datasets.
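
    To make the density-based idea concrete (arbitrary-shape clusters, outlier handling, no preset number of clusters), here is a minimal sketch of the classic batch DBSCAN procedure. It is a swapped-in textbook technique, not the streaming algorithm proposed in the paper, and the parameter values and toy points are assumptions chosen for illustration.

```python
def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    NOISE, UNSEEN = -1, None
    labels = [UNSEEN] * len(points)

    def neighbors(i):
        # All points within distance eps of point i (including i itself).
        return [j for j, q in enumerate(points)
                if sum((a - b) ** 2 for a, b in zip(points[i], q)) <= eps ** 2]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not UNSEEN:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        labels[i] = cluster
        queue = list(seeds)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but does not expand
            if labels[j] is not UNSEEN:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:  # core point: keep expanding the cluster
                queue.extend(nbrs)
        cluster += 1
    return labels


# Hypothetical toy data: two dense groups and one isolated outlier.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
```

    The number of clusters falls out of the density parameters rather than being supplied up front. A streaming variant would additionally have to summarize and age out old points, which this sketch does not attempt.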

  7. Cluster Ensemble-Based Image Segmentation

    Directory of Open Access Journals (Sweden)

    Xiaoru Wang

    2013-07-01

    Full Text Available Image segmentation is the foundation of computer vision applications. In this paper, we propose a new cluster ensemble-based image segmentation algorithm, which overcomes several problems of traditional methods. We make two main contributions in this paper. First, we introduce the cluster ensemble concept to fuse the segmentation results from different types of visual features effectively, which can deliver a better final result and achieve a much more stable performance for broad categories of images. Second, we exploit the PageRank idea from Internet applications and apply it to the image segmentation task. This can improve the final segmentation results by combining the spatial information of the image and the semantic similarity of regions. Our experiments on four public image databases validate the superiority of our algorithm over conventional single type of feature or multiple types of features-based algorithms, since our algorithm can fuse multiple types of features effectively for better segmentation results. Moreover, our method is also proved to be very competitive in comparison with other state-of-the-art segmentation algorithms.

  8. Two Zn coordination polymers with meso-helical chains based on mononuclear or dinuclear cluster units

    Energy Technology Data Exchange (ETDEWEB)

    Qin, Ling, E-mail: qinling@hfut.edu.cn [Department of Chemical Engineering and Food Processing, Xuancheng Campus, Hefei University of Technology, Xuancheng 242000, Anhui (China); Jiangsu Engineering Technology Research Center of Environmental Cleaning Materials (CEM), School of Environmental Sciences and Engineering, Nanjing University of Information Science and Technology (China); State Key Laboratory of Coordination Chemistry, School of Chemistry and Chemical Engineering, Nanjing National Laboratory of Microstructures, Nanjing University, Nanjing 210093 (China); Qiao, Wen-Cheng; Zuo, Wei-Juan; Zeng, Si-Ying; Mei, Cao; Liu, Chang-Jiang [Department of Chemical Engineering and Food Processing, Xuancheng Campus, Hefei University of Technology, Xuancheng 242000, Anhui (China)

    2016-07-15

    Two zinc coordination polymers {[Zn2(TPPBDA)(oba)2]·DMF·1.5H2O}n (1), {[Zn(TPPBDA)1/2(tpdc)]·DMF}n (2) have been synthesized by zinc metal salt, nanosized tetradentate pyridine ligand with flexible or rigid V-shaped carboxylate co-ligands. These complexes were characterized by elemental analyses and X-ray single-crystal diffraction analyses. Compound 1 is a 2-fold interpenetrated 3D framework with [Zn2(CO2)4] clusters. Compound 2 can be defined as a five folded interpenetrating bbf topology with mononuclear Zn2+. These mononuclear or dinuclear cluster units are linked by mix-ligands, resulting in various degrees of interpenetration. In addition, the photoluminescent properties for TPPBDA ligand under different state and coordination polymers have been investigated in detail.

  9. The anterior hypothalamus in cluster headache.

    Science.gov (United States)

    Arkink, Enrico B; Schmitz, Nicole; Schoonman, Guus G; van Vliet, Jorine A; Haan, Joost; van Buchem, Mark A; Ferrari, Michel D; Kruit, Mark C

    2017-10-01

    Objective To evaluate the presence, localization, and specificity of structural hypothalamic and whole brain changes in cluster headache and chronic paroxysmal hemicrania (CPH). Methods We compared T1-weighted magnetic resonance images of subjects with cluster headache (episodic n = 24; chronic n = 23; probable n = 14), CPH ( n = 9), migraine (with aura n = 14; without aura n = 19), and no headache ( n = 48). We applied whole brain voxel-based morphometry (VBM) using two complementary methods to analyze structural changes in the hypothalamus: region-of-interest analyses in whole brain VBM, and manual segmentation of the hypothalamus to calculate volumes. We used both conservative VBM thresholds, correcting for multiple comparisons, and less conservative thresholds for exploratory purposes. Results Using region-of-interest VBM analyses mirrored to the headache side, we found enlargement ( p cluster headache compared to controls, and in all participants with episodic or chronic cluster headache taken together compared to migraineurs. After manual segmentation, hypothalamic volume (mean±SD) was larger ( p cluster headache compared to controls (1.72 ± 0.15 ml) and migraineurs (1.68 ± 0.19 ml). Similar but non-significant trends were observed for participants with probable cluster headache (1.82 ± 0.19 ml; p = 0.07) and CPH (1.79 ± 0.20 ml; p = 0.15). Increased hypothalamic volume was primarily explained by bilateral enlargement of the anterior hypothalamus. Exploratory whole brain VBM analyses showed widespread changes in pain-modulating areas in all subjects with headache. Interpretation The anterior hypothalamus is enlarged in episodic and chronic cluster headache and possibly also in probable cluster headache or CPH, but not in migraine.

  10. X-ray and optical substructures of the DAFT/FADA survey clusters

    Science.gov (United States)

    Guennou, L.; Durret, F.; Adami, C.; Lima Neto, G. B.

    2013-04-01

    We have undertaken the DAFT/FADA survey with the double aim of setting constraints on dark energy based on weak lensing tomography and of obtaining homogeneous and high quality data for a sample of 91 massive clusters in the redshift range 0.4-0.9 for which there were HST archive data. We have analysed the XMM-Newton data available for 42 of these clusters to derive their X-ray temperatures and luminosities and search for substructures. Out of these, a spatial analysis was possible for 30 clusters, but only 23 had deep enough X-ray data for a really robust analysis. This study was coupled with a dynamical analysis for the 26 clusters having at least 30 spectroscopic galaxy redshifts in the cluster range. Altogether, the X-ray sample of 23 clusters and the optical sample of 26 clusters have 14 clusters in common. We present preliminary results on the coupled X-ray and dynamical analyses of these 14 clusters.

  11. XML documents cluster research based on frequent subpatterns

    Science.gov (United States)

    Ding, Tienan; Li, Wei; Li, Xiongfei

    2015-12-01

    XML data is widely used for information exchange on the Internet, and XML document clustering is a hot research topic. In the XML document clustering process, measuring the differences between two XML documents is time-consuming and reduces the efficiency of clustering. This paper proposes an XML document clustering method based on frequent patterns of the XML document dataset. It first proposes a coding tree structure for encoding XML documents, which translates frequent pattern mining from XML documents into frequent pattern mining from strings. It then applies the cosine similarity calculation method and the cohesive hierarchical clustering method to the XML document dataset using the frequent patterns. Because frequent patterns are subsets of the original XML document data, the time consumed by the XML document similarity measure is reduced. Experiments on a synthetic dataset and real datasets show that the method is efficient.
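
    The similarity step mentioned in this abstract can be sketched as follows, assuming each document is represented as a binary vector that marks which frequent patterns it contains. That representation, the document names, and the helper names are assumptions for illustration, and the sketch shows only the first merge decision of a bottom-up hierarchical clustering, not the paper's full method.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_u * norm_v)


def most_similar_pair(vectors):
    """Return the pair of document names with the highest cosine similarity."""
    return max(combinations(sorted(vectors), 2),
               key=lambda pair: cosine_similarity(vectors[pair[0]], vectors[pair[1]]))


# Hypothetical documents encoded over four frequent patterns (1 = pattern present).
docs = {
    "a": [1, 1, 0, 0],
    "b": [1, 1, 1, 0],
    "c": [0, 0, 1, 1],
}
first_merge = most_similar_pair(docs)
```

    Comparing short pattern vectors instead of whole document trees is what makes the similarity measure cheap, which matches the efficiency argument in the abstract.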

  12. Local Community Detection Algorithm Based on Minimal Cluster

    Directory of Open Access Journals (Sweden)

    Yong Zhou

    2016-01-01

    In order to discover the structure of local communities more effectively, this paper puts forward a new local community detection algorithm based on the minimal cluster. Most local community detection algorithms begin from one node. The agglomeration ability of a single node must be less than that of multiple nodes, so the community extension in this algorithm no longer begins from the initial node alone but from a node cluster that contains this initial node and whose nodes are relatively densely connected with each other. The algorithm mainly includes two phases. First it detects the minimal cluster, and then it finds the local community extended from the minimal cluster. Experimental results show that the quality of the local community detected by our algorithm is much better than that of other algorithms in both real and simulated networks.

  13. Macroeconomic Dimensions in the Clusterization Processes: Lithuanian Biomass Cluster Case

    Directory of Open Access Journals (Sweden)

    Navickas Valentinas

    2017-03-01

    The increasing significance of future production systems will demand work organized on a collaborative rather than a competitive basis, with concentrated resources and expertise that help to reach a common goal. One form of collaboration among medium-sized business organizations is work in clusters. Clusterization as a phenomenon has been known for quite a long time, but it has mostly been studied for its benefits at the micro and medium levels. The evaluation of clusterization processes in macroeconomic dimensions has been comparatively little investigated. In this article, therefore, clusterization processes are analysed with a focus on macroeconomic factors. The authors analyse the influence of clusterization on a country's macroeconomic growth; they apply a structural research methodology to evaluate the macroeconomic influence of clusterization and propose that clusterization processes benefit macroeconomic analysis. The theoretical model of clusterization processes was validated by referring to a biomass cluster case. Because the biomass cluster case is a new phenomenon, there are currently no other scientific approaches to it. The authors' research shows that clusterization allows the achievement of a large positive shift in macroeconomics, which leads to the creation of high added value, faster national economic growth, and improved social conditions.

  14. Fault-tolerant measurement-based quantum computing with continuous-variable cluster states.

    Science.gov (United States)

    Menicucci, Nicolas C

    2014-03-28

    A long-standing open question about Gaussian continuous-variable cluster states is whether they enable fault-tolerant measurement-based quantum computation. The answer is yes. Initial squeezing in the cluster above a threshold value of 20.5 dB ensures that errors from finite squeezing acting on encoded qubits are below the fault-tolerance threshold of known qubit-based error-correcting codes. By concatenating with one of these codes and using ancilla-based error correction, fault-tolerant measurement-based quantum computation of theoretically indefinite length is possible with finitely squeezed cluster states.

  15. A Cluster-Based Dual-Adaptive Topology Control Approach in Wireless Sensor Networks

    Science.gov (United States)

    Gui, Jinsong; Zhou, Kai; Xiong, Naixue

    2016-01-01

    Multi-Input Multi-Output (MIMO) can improve wireless network performance. Sensors are usually single-antenna devices due to the high hardware complexity and cost, so several sensors are used to form virtual MIMO array, which is a desirable approach to efficiently take advantage of MIMO gains. Also, in large Wireless Sensor Networks (WSNs), clustering can improve the network scalability, which is an effective topology control approach. The existing virtual MIMO-based clustering schemes do not either fully explore the benefits of MIMO or adaptively determine the clustering ranges. Also, clustering mechanism needs to be further improved to enhance the cluster structure life. In this paper, we propose an improved clustering scheme for virtual MIMO-based topology construction (ICV-MIMO), which can determine adaptively not only the inter-cluster transmission modes but also the clustering ranges. Through the rational division of cluster head function and the optimization of cluster head selection criteria and information exchange process, the ICV-MIMO scheme effectively reduces the network energy consumption and improves the lifetime of the cluster structure when compared with the existing typical virtual MIMO-based scheme. Moreover, the message overhead and time complexity are still in the same order of magnitude. PMID:27681731

  16. A Cluster-Based Dual-Adaptive Topology Control Approach in Wireless Sensor Networks

    Directory of Open Access Journals (Sweden)

    Jinsong Gui

    2016-09-01

    Multi-Input Multi-Output (MIMO) can improve wireless network performance. Sensors are usually single-antenna devices due to the high hardware complexity and cost, so several sensors are used to form virtual MIMO array, which is a desirable approach to efficiently take advantage of MIMO gains. Also, in large Wireless Sensor Networks (WSNs), clustering can improve the network scalability, which is an effective topology control approach. The existing virtual MIMO-based clustering schemes do not either fully explore the benefits of MIMO or adaptively determine the clustering ranges. Also, clustering mechanism needs to be further improved to enhance the cluster structure life. In this paper, we propose an improved clustering scheme for virtual MIMO-based topology construction (ICV-MIMO), which can determine adaptively not only the inter-cluster transmission modes but also the clustering ranges. Through the rational division of cluster head function and the optimization of cluster head selection criteria and information exchange process, the ICV-MIMO scheme effectively reduces the network energy consumption and improves the lifetime of the cluster structure when compared with the existing typical virtual MIMO-based scheme. Moreover, the message overhead and time complexity are still in the same order of magnitude.

  17. An Energy Centric Cluster-Based Routing Protocol for Wireless Sensor Networks.

    Science.gov (United States)

    Hosen, A S M Sanwar; Cho, Gi Hwan

    2018-05-11

    Clustering is an effective way to prolong the lifetime of a wireless sensor network (WSN). The common approach is to elect cluster heads to take on routing and control duties, and to periodically rotate the cluster-head role to distribute energy consumption among nodes. However, a significant amount of energy is dissipated by control message overhead, which results in a shorter network lifetime. This paper proposes an energy-centric cluster-based routing mechanism for WSNs. To begin with, cluster heads are elected based on the higher ranks of the nodes. The rank is defined by residual energy and average distance from the member nodes. Along with its data aggregation and data forwarding roles, a cluster head acts as a caretaker for the cluster-head election in the next round, in which the rank information is piggybacked on the local data sent during intra-cluster communication. This reduces the number of control messages for the cluster-head election as well as for the cluster formation. Simulation results show that our proposed protocol reduces energy consumption among nodes and achieves a significant improvement in the network lifetime.

  18. Carbon based nanostructures: diamond clusters structured with nanotubes

    Directory of Open Access Journals (Sweden)

    O.A. Shenderova

    2003-01-01

    The feasibility of designing composites from carbon nanotubes and nanodiamond clusters is discussed based on atomistic simulations. Depending on nanotube size and morphology, some types of open nanotubes can be chemically connected with different facets of diamond clusters. The geometrical relation between different types of nanotubes and different diamond facets for the construction of mechanically stable composites with all bonds saturated is summarized. Potential applications of the suggested nanostructures are briefly discussed based on calculations of their electronic properties using an environment-dependent self-consistent tight-binding approach.

  19. Communication Base Station Log Analysis Based on Hierarchical Clustering

    Directory of Open Access Journals (Sweden)

    Zhang Shao-Hua

    2017-01-01

    Full Text Available Communication base stations generate massive amounts of data every day, and these base station logs are valuable for mining business circles. This paper uses data mining technology and a hierarchical clustering algorithm to delineate the business circles of base stations from their recorded data. By analyzing the data of different business circles through feature extraction and comparing the characteristics of the business circle categories, operators can choose a suitable area for commercial marketing.

  20. A Fast Density-Based Clustering Algorithm for Real-Time Internet of Things Stream

    Science.gov (United States)

    Ying Wah, Teh

    2014-01-01

    Data streams are continuously generated over time from Internet of Things (IoT) devices. The faster all of this data is analyzed, its hidden trends and patterns discovered, and new strategies created, the faster action can be taken, creating greater value for organizations. Density-based methods are a prominent class of algorithms for clustering data streams. They can detect clusters of arbitrary shape, handle outliers, and do not need the number of clusters in advance. Therefore, a density-based clustering algorithm is a suitable choice for clustering IoT streams. Recently, several density-based algorithms have been proposed for clustering data streams. However, density-based clustering in limited time is still a challenging issue. In this paper, we propose a density-based clustering algorithm for IoT streams. The method has a processing time fast enough for real-time applications of IoT devices. Experimental results show that the proposed approach obtains high quality results with low computation time on real and synthetic datasets. PMID:25110753

  1. Variable selection in multivariate calibration based on clustering of variable concept.

    Science.gov (United States)

    Farrokhnia, Maryam; Karimi, Sadegh

    2016-01-01

    Recently we proposed a new variable selection algorithm for classification problems, based on the clustering of variables concept (CLoVA). Here the same idea is applied to a regression problem, and the results are compared with conventional variable selection strategies for PLS. The basic idea behind the clustering of variables is that the instrument channels are grouped into clusters via clustering algorithms. Then, the spectral data of each cluster are subjected to PLS regression. Different real data sets (Cargill corn, Biscuit dough, ACE QSAR, Soy, and Tablet) have been used to evaluate the influence of the clustering of variables on the prediction performance of PLS. In almost all cases, the statistical parameters, especially the prediction error, show the superiority of CLoVA-PLS over other variable selection strategies. Finally, synergy clustering of variables (sCLoVA-PLS), which uses a combination of clusters, is proposed as an efficient modification of the CLoVA algorithm. The obtained statistical parameters indicate that variable clustering can separate the useful part from the redundant ones, and that a stable model can then be reached based on the informative clusters. Copyright © 2015 Elsevier B.V. All rights reserved.

  2. Heterologous Reconstitution of the Intact Geodin Gene Cluster in Aspergillus nidulans through a Simple and Versatile PCR Based Approach

    DEFF Research Database (Denmark)

    Nielsen, Morten Thrane; Nielsen, Jakob Blæsbjerg; Anyaogu, Dianna Chinyere

    2013-01-01

    was transferred in a two step procedure to an expression platform in A. nidulans. The individual cluster fragments were generated by PCR and assembled via efficient USER fusion prior to transformation and integration via re-iterative gene targeting. A total of 13 open reading frames contained in 25 kb of DNA were...... of solid methodology for genetic manipulation of most species severely hampers pathway characterization. Here we present a simple PCR based approach for heterologous reconstitution of intact gene clusters. Specifically, the putative gene cluster responsible for geodin production from Aspergillus terreus...... successfully transferred between the two species enabling geodin synthesis in A. nidulans. Subsequently, functions of three genes in the cluster were validated by genetic and chemical analyses. Specifically, ATEG_08451 (gedC) encodes a polyketide synthase, ATEG_08453 (gedR) encodes a transcription factor...

  3. Price Formation Based on Particle-Cluster Aggregation

    Science.gov (United States)

    Wang, Shijun; Zhang, Changshui

    In the present work, we propose a microscopic model of financial markets based on particle-cluster aggregation on a two-dimensional small-world information network in order to simulate the dynamics of the stock markets. "Stylized facts" of the financial market time series, such as fat-tail distribution of returns, volatility clustering and multifractality, are observed in the model. The results of the model agree with empirical data taken from historical records of the daily closures of the NYSE composite index.

  4. Short-Term Wind Power Forecasting Based on Clustering Pre-Calculated CFD Method

    Directory of Open Access Journals (Sweden)

    Yimei Wang

    2018-04-01

    Full Text Available To meet the increasing wind power forecasting (WPF) demands of newly built wind farms without historical data, physical WPF methods are widely used. The computational fluid dynamics (CFD) pre-calculated flow fields (CPFF)-based WPF is a promising physical approach, which can balance well the competing demands of computational efficiency and accuracy. To enhance its adaptability for wind farms in complex terrain, a WPF method combining wind turbine clustering with CPFF is proposed for the first time, in which the wind turbines in the wind farm are clustered and a forecast is made for each cluster. K-means, hierarchical agglomerative and spectral analysis methods are used to establish the wind turbine clustering models. The Silhouette Coefficient, Calinski-Harabasz index and within-between index are proposed as criteria to evaluate the effectiveness of the established clustering models. Based on different clustering methods and schemes, various clustering databases are built for clustering pre-calculated CFD (CPCC)-based short-term WPF. For the wind farm case studied, the clustering evaluation criteria show that hierarchical agglomerative clustering gives reasonable results, spectral clustering is better and K-means gives the best performance. The WPF results produced by the different clustering databases in turn confirm the effectiveness of the three evaluation criteria. The newly developed CPCC model has a much higher WPF accuracy than the CPFF model without clustering techniques, on both temporal and spatial scales. The research provides support for both the development and improvement of short-term physical WPF systems.
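
    Of the evaluation criteria named in this abstract, the Silhouette Coefficient is the simplest to illustrate. The following is a minimal pure-Python sketch of the standard definition, s = (b - a) / max(a, b) averaged over all points; the function name and the use of Euclidean distance on coordinate tuples are assumptions, and the abstract does not say which features the authors clustered on.

```python
from math import dist


def silhouette_score(points, labels):
    """Mean silhouette over all points; assumes at least two distinct labels."""
    labels = list(labels)
    scores = []
    for i, p in enumerate(points):
        own = labels[i]
        same = [dist(p, q) for j, q in enumerate(points)
                if labels[j] == own and j != i]
        if not same:
            # Common convention: a singleton cluster scores 0.
            scores.append(0.0)
            continue
        a = sum(same) / len(same)  # mean distance within the own cluster
        b = min(                   # mean distance to the nearest other cluster
            sum(dist(p, q) for j, q in enumerate(points) if labels[j] == other)
            / labels.count(other)
            for other in set(labels) if other != own
        )
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)
```

    Values near 1 indicate compact, well-separated clusters, while negative values suggest that points sit closer to another cluster than to their own.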

  5. Spanning Tree Based Attribute Clustering

    DEFF Research Database (Denmark)

    Zeng, Yifeng; Jorge, Cordero Hernandez

    2009-01-01

    Attribute clustering has been previously employed to detect statistical dependence between subsets of variables. We propose a novel attribute clustering algorithm motivated by research of complex networks, called the Star Discovery algorithm. The algorithm partitions and indirectly discards...... inconsistent edges from a maximum spanning tree by starting appropriate initial modes, therefore generating stable clusters. It discovers sound clusters through simple graph operations and achieves significant computational savings. We compare the Star Discovery algorithm against earlier attribute clustering...

  6. Galaxy clusters in the cosmic web

    Science.gov (United States)

    Acebrón, A.; Durret, F.; Martinet, N.; Adami, C.; Guennou, L.

    2014-12-01

    Simulations of large scale structure formation in the universe predict that matter is essentially distributed along filaments at the intersection of which lie galaxy clusters. We have analysed 9 clusters in the redshift range 0.4DAFT/FADA survey, which combines deep large field multi-band imaging and spectroscopic data, in order to detect filaments and/or structures around these clusters. Based on colour-magnitude diagrams, we have selected the galaxies likely to be in the cluster redshift range and studied their spatial distribution. We detect a number of structures and filaments around several clusters, proving that colour-magnitude diagrams are a reliable method to detect filaments around galaxy clusters. Since this method excludes blue (spiral) galaxies at the cluster redshift, we also apply the LePhare software to compute photometric redshifts from BVRIZ images to select galaxy cluster members and study their spatial distribution. We then find that, if only galaxies classified as early-type by LePhare are considered, we obtain the same distribution as with a red sequence selection, while taking into account late-type galaxies just pollutes the background level and deteriorates our detections. The photometric redshift based method therefore does not provide any additional information.

  7. Fuzzy clustering-based segmented attenuation correction in whole-body PET

    CERN Document Server

    Zaidi, H; Boudraa, A; Slosman, DO

    2001-01-01

    Segmented-based attenuation correction is now a widely accepted technique to reduce noise contribution of measured attenuation correction. In this paper, we present a new method for segmenting transmission images in positron emission tomography. This reduces the noise on the correction maps while still correcting for differing attenuation coefficients of specific tissues. Based on the Fuzzy C-Means (FCM) algorithm, the method segments the PET transmission images into a given number of clusters to extract specific areas of differing attenuation such as air, the lungs and soft tissue, preceded by a median filtering procedure. The reconstructed transmission image voxels are therefore segmented into populations of uniform attenuation based on the human anatomy. The clustering procedure starts with an over-specified number of clusters followed by a merging process to group clusters with similar properties and remove some undesired substructures using anatomical knowledge. The method is unsupervised, adaptive and a...
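
    The Fuzzy C-Means step at the core of this method assigns each voxel a degree of membership in every cluster rather than a hard label. A minimal sketch of the standard FCM membership update for scalar values is given below; the function name, the scalar (one-dimensional) setting, and the fuzzifier default m = 2 are assumptions for illustration, not details taken from the paper.

```python
def fcm_memberships(x, centers, m=2.0):
    """Standard FCM membership of scalar x in each cluster center."""
    distances = [abs(x - c) for c in centers]
    zeros = distances.count(0.0)
    if zeros:
        # x coincides with a center: give it full membership there.
        return [1.0 / zeros if d == 0.0 else 0.0 for d in distances]
    exponent = 2.0 / (m - 1.0)
    # u_i = 1 / sum_k (d_i / d_k)^(2 / (m - 1)); memberships sum to 1.
    return [1.0 / sum((d_i / d_k) ** exponent for d_k in distances)
            for d_i in distances]
```

    A value halfway between two centers receives equal membership in both, which is what lets a later merging step reason about ambiguous voxels.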

  8. Spike sorting using locality preserving projection with gap statistics and landmark-based spectral clustering.

    Science.gov (United States)

    Nguyen, Thanh; Khosravi, Abbas; Creighton, Douglas; Nahavandi, Saeid

    2014-12-30

    Understanding neural functions requires knowledge from analysing electrophysiological data. The process of assigning spikes of a multichannel signal into clusters, called spike sorting, is one of the important problems in such analysis. There have been various automated spike sorting techniques with both advantages and disadvantages regarding accuracy and computational costs. Therefore, developing spike sorting methods that are highly accurate and computationally inexpensive is always a challenge in biomedical engineering practice. An automatic unsupervised spike sorting method is proposed in this paper. The method uses features extracted by the locality preserving projection (LPP) algorithm. These features afterwards serve as inputs for the landmark-based spectral clustering (LSC) method. Gap statistics (GS) is employed to evaluate the number of clusters before the LSC can be performed. The proposed LPP-LSC is a highly accurate and computationally inexpensive spike sorting approach. LPP spike features are very discriminative, thereby boosting the performance of clustering methods. Furthermore, the LSC method exhibits its efficiency when integrated with the cluster evaluator GS. The proposed method's accuracy is approximately 13% superior to that of the benchmark combination of wavelet transformation and superparamagnetic clustering (WT-SPC). Additionally, LPP-LSC computing time is six times less than that of the WT-SPC. LPP-LSC demonstrates a win-win spike sorting solution meeting both accuracy and computational cost criteria. LPP and LSC are linear algorithms that help reduce computational burden, and thus their combination can be applied to real-time spike analysis. Copyright © 2014 Elsevier B.V. All rights reserved.

  9. ENERGY OPTIMIZATION IN CLUSTER BASED WIRELESS SENSOR NETWORKS

    Directory of Open Access Journals (Sweden)

    T. SHANKAR

    2014-04-01

    Full Text Available Wireless sensor networks (WSNs) are made up of sensor nodes which are usually battery-operated devices, and hence energy saving of sensor nodes is a major design issue. To prolong the network's lifetime, minimization of energy consumption should be implemented at all layers of the network protocol stack, from the physical to the application layer, including cross-layer optimization. Optimizing energy consumption is the main concern in designing and planning the operation of a WSN. Clustering is one of the methods used to extend the lifetime of the network by applying data aggregation and balancing energy consumption among the sensor nodes of the network. This paper proposes new versions of the Low Energy Adaptive Clustering Hierarchy (LEACH) protocol, called Advanced Optimized Low Energy Adaptive Clustering Hierarchy (AOLEACH), Optimal Deterministic Low Energy Adaptive Clustering Hierarchy (ODLEACH), and Varying Probability Distance Low Energy Adaptive Clustering Hierarchy (VPDL), in combination with the Shuffled Frog Leap Algorithm (SFLA), which enables selecting the best adaptive cluster heads using an improved threshold energy distribution compared to the LEACH protocol and rotating the cluster head position for uniform energy dissipation based on energy levels. The proposed algorithm extends the lifetime of the network by increasing the first node death (FND) time and the number of alive nodes.

  10. Application of a clustering-based peak alignment algorithm to analyze various DNA fingerprinting data.

    Science.gov (United States)

    Ishii, Satoshi; Kadota, Koji; Senoo, Keishi

    2009-09-01

    DNA fingerprinting analyses such as amplified ribosomal DNA restriction analysis (ARDRA), repetitive extragenic palindromic PCR (rep-PCR), ribosomal intergenic spacer analysis (RISA), and denaturing gradient gel electrophoresis (DGGE) are frequently used in various fields of microbiology. The major difficulty in DNA fingerprinting data analysis is the alignment of multiple peak sets. We report here an R program for a clustering-based peak alignment algorithm, and its application to analyze various DNA fingerprinting data, such as ARDRA, rep-PCR, RISA, and DGGE data. The results obtained by our clustering algorithm and by BioNumerics software showed high similarity. Since several R packages have been established to statistically analyze various biological data, the distance matrix obtained by our R program can be used for subsequent statistical analyses, some of which were not previously performed but are useful in DNA fingerprinting studies.

  11. KM-FCM: A fuzzy clustering optimization algorithm based on Mahalanobis distance

    Directory of Open Access Journals (Sweden)

    Zhiwen ZU

    2018-04-01

    Full Text Available The traditional fuzzy clustering algorithm uses Euclidean distance as the similarity criterion, which is a disadvantage when processing multidimensional data. To address this, Mahalanobis distance is used instead of the traditional Euclidean distance, and the optimization of a fuzzy clustering algorithm based on Mahalanobis distance is studied to improve clustering quality. By initializing the cluster means with a heuristic search algorithm combined with the k-means algorithm, and by using a validity function that automatically adjusts the optimal number of clusters, an optimization algorithm, KM-FCM, is proposed. The new algorithm is compared with the FCM, FCM-M and M-FCM algorithms on three standard data sets. The experimental results show that the KM-FCM algorithm is effective. It has higher clustering accuracy than FCM, FCM-M and M-FCM, and it clusters high-dimensional data well. It has a global optimization effect, and the number of clusters does not need to be set in advance. The new algorithm provides a reference for the optimization of fuzzy clustering algorithms based on Mahalanobis distance.
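
    The motivation for replacing Euclidean with Mahalanobis distance can be seen in a small two-dimensional example. The sketch below computes d(x) = sqrt((x - mu)^T S^{-1} (x - mu)) with a hand-inverted 2x2 covariance matrix; the function name and the restriction to two dimensions are illustrative assumptions and say nothing about how KM-FCM itself is implemented.

```python
def mahalanobis_2d(x, mean, cov):
    """Mahalanobis distance of 2-D point x from mean under covariance cov."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    # Closed-form inverse of a 2x2 matrix.
    inv = ((d / det, -b / det), (-c / det, a / det))
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    q = (dx * (inv[0][0] * dx + inv[0][1] * dy)
         + dy * (inv[1][0] * dx + inv[1][1] * dy))
    return q ** 0.5
```

    For a cluster with variance 4 along x and 1 along y, the points (2, 0) and (0, 2) are both at Euclidean distance 2 from the origin, but their Mahalanobis distances are 1 and 2: the measure accounts for the cluster being stretched along x.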

  12. Seminal Quality Prediction Using Clustering-Based Decision Forests

    Directory of Open Access Journals (Sweden)

    Hong Wang

    2014-08-01

    Full Text Available Prediction of seminal quality with statistical learning tools is an emerging methodology in decision support systems in biomedical engineering and is very useful in the early diagnosis of seminal patients and the selection of semen donor candidates. However, as is common in medical diagnosis, seminal quality prediction faces the class imbalance problem. In this paper, we propose a novel supervised ensemble learning approach, namely Clustering-Based Decision Forests, to tackle the unbalanced class learning problem in seminal quality prediction. Experimental results on a real fertility diagnosis dataset show that Clustering-Based Decision Forests outperforms decision trees, Support Vector Machines, random forests, multilayer perceptron neural networks and logistic regression by a noticeable margin. Clustering-Based Decision Forests can also be used to evaluate variable importance; the top five factors that may affect semen concentration identified in this study are age, serious trauma, sitting time, the season when the semen sample is produced, and high fevers in the last year. The findings could be helpful in explaining seminal concentration problems in infertile males or in pre-screening semen donor candidates.

  13. A similarity based agglomerative clustering algorithm in networks

    Science.gov (United States)

    Liu, Zhiyuan; Wang, Xiujuan; Ma, Yinghong

    2018-04-01

    The detection of clusters is beneficial for understanding the organization and functions of networks. Clusters, or communities, are usually groups of nodes that are densely interconnected but sparsely linked with other clusters. To identify communities, an efficient and effective agglomerative community detection algorithm based on node similarity is proposed. The proposed method first calculates the similarity between each pair of nodes and forms pre-partitions according to the principle that each node is in the same community as its most similar neighbor. It then checks whether each partition satisfies a community criterion. Pre-partitions that do not satisfy the criterion are merged with the partitions that exert the greatest attraction on them, until there are no further changes. To measure the attraction ability of a partition, we propose an attraction index based on the importance of the linked nodes in the network. Therefore, our proposed method can better exploit the nodes' properties and the network's structure. To test the performance of our algorithm, both synthetic and empirical networks of different scales are tested. Simulation results show that the proposed algorithm can obtain superior clustering results compared with six other widely used community detection algorithms.
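
    The pre-partition step, in which each node joins the community of its most similar neighbor, can be sketched with a union-find structure. The abstract does not define the similarity measure, so Jaccard similarity of closed neighborhoods is used here purely as a stand-in; the function name and the tie-breaking rule (smallest neighbor wins) are also assumptions. The later merging step driven by the attraction index is not shown.

```python
def pre_partition(adj):
    """adj: dict node -> set of neighbours (undirected). Returns sorted groups."""
    def similarity(u, v):
        # Assumed measure: Jaccard similarity of closed neighbourhoods.
        nu, nv = adj[u] | {u}, adj[v] | {v}
        return len(nu & nv) / len(nu | nv)

    parent = {u: u for u in adj}

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]  # path halving
            u = parent[u]
        return u

    for u in adj:
        if adj[u]:
            # Most similar neighbour; ties go to the smallest neighbour.
            best = max(sorted(adj[u]), key=lambda v: similarity(u, v))
            parent[find(u)] = find(best)

    groups = {}
    for u in adj:
        groups.setdefault(find(u), set()).add(u)
    return sorted(sorted(g) for g in groups.values())
```

    On two triangles joined by a single edge, this sketch returns the two triangles as separate pre-partitions, because the bridge endpoints are more similar to their triangle mates than to each other.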

  14. Persistent Spatial Clusters of Prescribed Antimicrobials among Danish Pig Farms - A Register-Based Study

    DEFF Research Database (Denmark)

    Fertner, Mette Ely; Sanchez, Javier; Boklund, Anette

    2015-01-01

    The emergence of pathogens resistant to antimicrobials has prompted political initiatives targeting a reduction in the use of veterinary antimicrobials in Denmark, especially for pigs. This study elucidates the tendency of pig farms with a significantly higher antimicrobial use to remain...... in clusters in certain geographical regions of Denmark. Animal Daily Doses/100 pigs/day were calculated for all three age groups of pigs (weaners, finishers and sows) for each quarter during 2012-13 in 6,143 commercial indoor pig producing farms. The data were split into four time periods of six months....... Repeated spatial cluster analyses were performed to identify persistent clusters, i.e. areas included in a significant cluster throughout all four time periods. Antimicrobials prescribed for weaners did not result in any persistent clusters. In contrast, antimicrobial use in finishers clustered...

  15. Nearest Neighbor Networks: clustering expression data based on gene neighborhoods

    Directory of Open Access Journals (Sweden)

    Olszewski Kellen L

    2007-07-01

    Full Text Available Abstract Background The availability of microarrays measuring thousands of genes simultaneously across hundreds of biological conditions represents an opportunity to understand both individual biological pathways and the integrated workings of the cell. However, translating this amount of data into biological insight remains a daunting task. An important initial step in the analysis of microarray data is clustering of genes with similar behavior. A number of classical techniques are commonly used to perform this task, particularly hierarchical and K-means clustering, and many novel approaches have been suggested recently. While these approaches are useful, they are not without drawbacks; these methods can find clusters in purely random data, and even clusters enriched for biological functions can be skewed towards a small number of processes (e.g. ribosomes). Results We developed Nearest Neighbor Networks (NNN), a graph-based algorithm to generate clusters of genes with similar expression profiles. This method produces clusters based on overlapping cliques within an interaction network generated from mutual nearest neighborhoods. This focus on nearest neighbors rather than on absolute distance measures allows us to capture clusters with high connectivity even when they are spatially separated, and requiring mutual nearest neighbors allows genes with no sufficiently similar partners to remain unclustered. We compared the clusters generated by NNN with those generated by eight other clustering methods. NNN was particularly successful at generating functionally coherent clusters with high precision, and these clusters generally represented a much broader selection of biological processes than those recovered by other methods. Conclusion The Nearest Neighbor Networks algorithm is a valuable clustering method that effectively groups genes that are likely to be functionally related. It is particularly attractive due to its simplicity, its success in the
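
    The mutual-nearest-neighbor idea behind NNN can be illustrated with a small sketch that builds only the interaction network: two genes are linked when each appears in the other's list of k nearest neighbors. Euclidean distance on expression profiles, the function name, and the parameter `k` are assumptions for illustration; the clique-finding stage that NNN applies to this network is not shown.

```python
from math import dist


def mutual_knn_edges(profiles, k):
    """profiles: dict gene -> tuple of expression values. Returns a set of edges."""
    names = list(profiles)
    knn = {}
    for g in names:
        others = sorted((h for h in names if h != g),
                        key=lambda h: dist(profiles[g], profiles[h]))
        knn[g] = set(others[:k])
    # Keep an edge only when the neighbour relation holds in both directions.
    return {frozenset((g, h)) for g in names for h in knn[g] if g in knn[h]}
```

    A gene whose nearest neighbor prefers some other gene ends up with no edges, which matches the abstract's point that genes without sufficiently similar partners can remain unclustered.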

  16. A Cluster-Based Secure Active Network Environment

    Institute of Scientific and Technical Information of China (English)

    CHEN Xiao-lin; ZHOU Jing-yang; DAI Han; LU Sang-lu; CHEN Gui-hai

    2005-01-01

    We introduce a cluster-based secure active network environment (CSANE) which separates the processing of IP packets from that of active packets in active routers. In this environment, the active code authorized or trusted by privileged users is executed in the secure execution environment (EE) of the active router, while other code is executed in the secure EE of the nodes in the distributed shared memory (DSM) cluster. With the support of a multi-process Java virtual machine and KeyNote, untrusted active packets are controlled so that they consume resources securely. The DSM consistency management allows active packets to be processed in parallel in the DSM cluster as if they were processed one by one in ANTS (Active Network Transport System). We demonstrate that CSANE has good security and scalability while imposing only minor changes on traditional routers.

  17. What Makes Clusters Decline?

    DEFF Research Database (Denmark)

    Østergaard, Christian Richter; Park, Eun Kyung

    2015-01-01

    Most studies on regional clusters focus on identifying factors and processes that make clusters grow. However, sometimes technologies and market conditions suddenly shift, and clusters decline. This paper analyses the process of decline of the wireless communication cluster in Denmark. The longit...... but being quick to withdraw in times of crisis....

  18. A Clustering-Based Automatic Transfer Function Design for Volume Visualization

    Directory of Open Access Journals (Sweden)

    Tianjin Zhang

    2016-01-01

    Full Text Available The two-dimensional transfer functions (TFs) designed based on the intensity-gradient magnitude (IGM) histogram are effective tools for the visualization and exploration of 3D volume data. However, traditional design methods usually depend on repeated trial and error. We propose a novel method for the automatic generation of transfer functions by performing the affinity propagation (AP) clustering algorithm on the IGM histogram. Compared with previous clustering algorithms that were employed in volume visualization, the AP clustering algorithm has much faster convergence speed and can achieve more accurate clustering results. In order to obtain meaningful clustering results, we introduce two similarity measurements: IGM similarity and spatial similarity. These two similarity measurements can effectively bring the voxels of the same tissue together and differentiate the voxels of different tissues so that the generated TFs can assign different optical properties to different tissues. Before performing the clustering algorithm on the IGM histogram, we propose to remove noisy voxels based on the spatial information of voxels. Our method does not require users to input the number of clusters, and the classification and visualization process is automatic and efficient. Experiments on various datasets demonstrate the effectiveness of the proposed method.

  19. A Cyber-Attack Detection Model Based on Multivariate Analyses

    Science.gov (United States)

    Sakai, Yuto; Rinsaka, Koichiro; Dohi, Tadashi

    In the present paper, we propose a novel cyber-attack detection model that applies two multivariate-analysis methods to the audit data observed on a host machine. The statistical techniques used here are the well-known Hayashi's quantification method IV and the cluster analysis method. We quantify the observed qualitative audit event sequences via the quantification method IV, and collect similar audit event sequences in the same groups based on the cluster analysis. It is shown in simulation experiments that our model can improve the cyber-attack detection accuracy in some realistic cases where both normal and attack activities are intermingled.

  20. DCE: A Distributed Energy-Efficient Clustering Protocol for Wireless Sensor Network Based on Double-Phase Cluster-Head Election.

    Science.gov (United States)

    Han, Ruisong; Yang, Wei; Wang, Yipeng; You, Kaiming

    2017-05-01

    Clustering is an effective technique used to reduce energy consumption and extend the lifetime of wireless sensor networks (WSNs). The energy heterogeneity of WSNs should be considered when designing clustering protocols. We propose and evaluate a novel distributed energy-efficient clustering protocol called DCE for heterogeneous wireless sensor networks, based on a Double-phase Cluster-head Election scheme. In DCE, the procedure of cluster head election is divided into two phases. In the first phase, tentative cluster heads are elected with probabilities that are decided by the relative levels of initial and residual energy. Then, in the second phase, the tentative cluster heads are replaced by their cluster members to form the final set of cluster heads if any member in their cluster has more residual energy. Employing two phases for cluster-head election ensures that the nodes with more energy have a higher chance to be cluster heads. Energy consumption is well-distributed in the proposed protocol, and the simulation results show that DCE achieves longer stability periods than other typical clustering protocols in heterogeneous scenarios.
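
    The two election phases described in this abstract can be sketched as below. The abstract states only that phase-one probabilities depend on the relative levels of initial and residual energy; the specific probability formula, the `base_prob` parameter, and both function names are hypothetical and should not be read as the DCE specification.

```python
import random


def tentative_heads(nodes, base_prob, rng):
    """Phase 1. nodes: dict name -> (initial_energy, residual_energy), energies > 0."""
    mean_initial = sum(e0 for e0, _ in nodes.values()) / len(nodes)
    heads = set()
    for name, (e0, residual) in nodes.items():
        # Hypothetical weighting: favour nodes with high initial energy
        # that have also retained a large share of it.
        p = min(1.0, base_prob * (e0 / mean_initial) * (residual / e0))
        if rng.random() < p:
            heads.add(name)
    return heads


def final_head(cluster, tentative, residual):
    """Phase 2: hand over to the member with the most residual energy, if any has more."""
    best = max(cluster, key=lambda n: residual[n])
    return best if residual[best] > residual[tentative] else tentative
```

    Phase two is deterministic, so a tentative head elected by chance in phase one is replaced whenever a better-supplied member exists in its cluster.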

  1. Semantic based cluster content discovery in description first clustering algorithm

    International Nuclear Information System (INIS)

    Khan, M.W.; Asif, H.M.S.

    2017-01-01

    In the field of data analytics, grouping similar documents in textual data is a serious problem. A lot of work has been done in this field and many algorithms have been proposed. One category of algorithms first groups the documents on the basis of similarity and then assigns meaningful labels to those groups. Description-first clustering algorithms belong to the category in which a meaningful description is deduced first and then relevant documents are assigned to that description. LINGO (Label Induction Grouping Algorithm) is a description-first clustering algorithm used for the automatic grouping of documents obtained from search results. It uses LSI (Latent Semantic Indexing), an IR (Information Retrieval) technique, for the induction of meaningful cluster labels and VSM (Vector Space Model) for cluster content discovery. In this paper we present LINGO using LSI during both the cluster label induction and cluster content discovery phases. Finally, we compare the results obtained from the algorithm when it uses VSM and when it uses latent semantic analysis during the cluster content discovery phase. (author)
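
    The VSM-based cluster content discovery phase, in which documents are assigned to labels that were induced first, can be sketched with cosine similarity over term-count vectors. Raw term counts (rather than tf-idf weights), whitespace tokenization, the threshold value, and the function names are all simplifying assumptions; this is not the LINGO implementation.

```python
from collections import Counter
from math import sqrt


def cosine(text_a, text_b):
    """Cosine similarity of two texts under a bag-of-words term-count model."""
    ca, cb = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = (sqrt(sum(v * v for v in ca.values()))
            * sqrt(sum(v * v for v in cb.values())))
    return dot / norm if norm else 0.0


def assign_documents(docs, labels, threshold=0.2):
    """Description-first step: attach each document to every label it matches."""
    return {label: [d for d in docs if cosine(d, label) >= threshold]
            for label in labels}
```

    Because assignment is per label, a document can appear under several labels or under none, which reflects the overlapping clusters typical of search-result grouping.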

  2. Simulation-based marginal likelihood for cluster strong lensing cosmology

    Science.gov (United States)

    Killedar, M.; Borgani, S.; Fabjan, D.; Dolag, K.; Granato, G.; Meneghetti, M.; Planelles, S.; Ragone-Figueroa, C.

    2018-01-01

    Comparisons between observed and predicted strong lensing properties of galaxy clusters have been routinely used to claim either tension or consistency with Λ cold dark matter cosmology. However, standard approaches to such cosmological tests are unable to quantify the preference for one cosmology over another. We advocate approximating the relevant Bayes factor using a marginal likelihood that is based on the following summary statistic: the posterior probability distribution function for the parameters of the scaling relation between Einstein radii and cluster mass, α and β. We demonstrate, for the first time, a method of estimating the marginal likelihood using the X-ray selected z > 0.5 Massive Cluster Survey clusters as a case in point and employing both N-body and hydrodynamic simulations of clusters. We investigate the uncertainty in this estimate and consequential ability to compare competing cosmologies, which arises from incomplete descriptions of baryonic processes, discrepancies in cluster selection criteria, redshift distribution and dynamical state. The relation between triaxial cluster masses at various overdensities provides a promising alternative to the strong lensing test.

  3. Collaborative Filtering Based on Sequential Extraction of User-Item Clusters

    Science.gov (United States)

    Honda, Katsuhiro; Notsu, Akira; Ichihashi, Hidetomo

    Collaborative filtering is a computational realization of “word-of-mouth” in a network community, in which the items preferred by “neighbors” are recommended. This paper proposes a new item-selection model for extracting user-item clusters from rectangular relation matrices, in which mutual relations between users and items are denoted in an alternative process of “liking or not”. A technique for sequential co-cluster extraction from rectangular relational data is given by combining the structural balancing-based user-item clustering method with a sequential fuzzy cluster extraction approach. Then, the technique is applied to the collaborative filtering problem, in which some items may be shared by several user clusters.

  4. INFRARED HIGH-RESOLUTION INTEGRATED LIGHT SPECTRAL ANALYSES OF M31 GLOBULAR CLUSTERS FROM APOGEE

    Energy Technology Data Exchange (ETDEWEB)

    Sakari, Charli M. [Department of Astronomy, University of Washington, Seattle WA 98195-1580 (United States)]; Shetrone, Matthew D. [McDonald Observatory, University of Texas at Austin, HC75 Box 1337-MCD, Fort Davis, TX 79734 (United States)]; Schiavon, Ricardo P. [Gemini Observatory, 670 N. A’Ohoku Place, Hilo, HI 96720 (United States)]; Bizyaev, Dmitry; Pan, Kaike [Apache Point Observatory and New Mexico State University, P.O. Box 59, Sunspot, NM, 88349-0059 (United States)]; Prieto, Carlos Allende; García-Hernández, Domingo Aníbal [Instituto de Astrofísica de Canarias (IAC), Vía Láctea s/n, E-38205 La Laguna, Tenerife (Spain)]; Beers, Timothy C. [Department of Physics and JINA Center for the Evolution of the Elements, University of Notre Dame, Notre Dame, IN 46556 (United States)]; Caldwell, Nelson [Harvard-Smithsonian Center for Astrophysics, 60 Garden Street, Cambridge, MA 02138 (United States)]; Lucatello, Sara [INAF Osservatorio Astronomico di Padova, Vicolo dell’Osservatorio 5, I-35122 Padova (Italy)]; Majewski, Steven; O’Connell, Robert W. [Dept. of Astronomy, University of Virginia, Charlottesville, VA 22904-4325 (United States)]; Strader, Jay, E-mail: sakaricm@u.washington.edu [Department of Physics and Astronomy, Michigan State University, East Lansing, MI 48824 (United States)]

    2016-10-01

    Chemical abundances are presented for 25 M31 globular clusters (GCs), based on moderately high resolution (R = 22,500) H-band integrated light (IL) spectra from the Apache Point Observatory Galactic Evolution Experiment (APOGEE). Infrared (IR) spectra offer lines from new elements, lines of different strengths, and lines at higher excitation potentials compared to the optical. Integrated abundances of C, N, and O are derived from CO, CN, and OH molecular features, while Fe, Na, Mg, Al, Si, K, Ca, and Ti abundances are derived from atomic features. These abundances are compared to previous results from the optical, demonstrating the validity and value of IR IL analyses. The CNO abundances are consistent with typical tip of the red giant branch stellar abundances but are systematically offset from optical Lick index abundances. With a few exceptions, the other abundances agree between the optical and the IR within the 1σ uncertainties. The first integrated K abundances are also presented and demonstrate that K tracks the α elements. The combination of IR and optical abundances allows better determinations of GC properties and enables probes of the multiple populations in extragalactic GCs. In particular, the integrated effects of the Na/O anticorrelation can be directly examined for the first time.

  5. Deletion and Gene Expression Analyses Define the Paxilline Biosynthetic Gene Cluster in Penicillium paxilli

    Directory of Open Access Journals (Sweden)

    Emily J. Parker

    2013-08-01

    The indole-diterpene paxilline is an abundant secondary metabolite synthesized by Penicillium paxilli. In total, 21 genes have been identified at the PAX locus, of which six have been previously confirmed to have a functional role in paxilline biosynthesis. A combination of bioinformatics, gene expression and targeted gene replacement analyses was used to define the boundaries of the PAX gene cluster. Targeted gene replacement identified seven genes, paxG, paxA, paxM, paxB, paxC, paxP and paxQ, that were all required for paxilline production, with one additional gene, paxD, required for regular prenylation of the indole ring post paxilline synthesis. The two putative transcription factors, PP104 and PP105, were not co-regulated with the pax genes and, based on targeted gene replacement, including the double knockout, did not have a role in paxilline production. The indole dimethylallyl transferases involved in prenylation of indole-diterpenes such as paxilline or lolitrem B fall into two disparate clades that are not supported by prenylation type (e.g., regular or reverse). This paper provides insight into the P. paxilli indole-diterpene locus and reviews the recent advances identified in paxilline biosynthesis.

  6. Exploring spatial evolution of economic clusters: A case study of Beijing

    Science.gov (United States)

    Yang, Zhenshan; Sliuzas, Richard; Cai, Jianming; Ottens, Henk F. L.

    2012-10-01

    Identifying economic clusters and analysing their changing spatial patterns is important for understanding urban economic space dynamics. Previous studies, however, suffer from limitations as a consequence of using fixed geographical areas and not combining functional and spatial dynamics. The paper presents an approach, based on local spatial statistics and the case of Beijing, to understand the spatial clustering of industries that are functionally interconnected by common or complementary patterns of demand or supply relations. Using register data of business establishments, it identifies economic clusters and analyses their pattern based on postcodes at different time slices during the period 1983-2002. The study shows how the advanced services occupy the urban centre and key sub centres. The Information and Communication Technology (ICT) cluster is mainly concentrated in the north part of the city and circles the urban centre, and the main manufacturing clusters have evolved in the key sub centres. This type of outcome improves understanding of urban-economic dynamics, which can support spatial and economic planning.

  7. INTERNATIONAL BEHAVIOUR AND PERFORMANCE BASED ROMANIAN ENTREPRENEURIAL AND TRADITIONAL FIRM CLUSTERS

    Directory of Open Access Journals (Sweden)

    FEDER Emoke - Szidonia

    2015-07-01

    The micro, small and medium-sized firms (SMEs) present a key interest at European level due to their potential positive influence on regional, national and firm level competitiveness. At a certain moment in time, internationalisation became an expected and even unavoidable strategy in firms’ future development, growth and evolution. From a theoretical perspective, an integrative, complementary approach is adopted concerning the dominant paradigm of stage models from incremental internationalisation theory and the emergent paradigm of international entrepreneurship theory. Several researchers call for empirical testing of different theoretical frameworks and international firms. Therefore, the first aim of the quantitative study is to empirically prove the existence of various clusters based on internationalisation behaviour configurations, such as sporadic and traditional international firms, born-again global and born global firms, within the framework of Romanian SMEs. Secondly, within the research framework the study proposes to assess different distinguishing internationalisation behavioural characteristics and patterns for the delimited clusters, in terms of foreign market scope, internationalisation pace and rhythm, initial and current entry modes, international product portfolio and commitment. Thirdly, the differential influence and contribution of internationalisation cluster membership and patterns is analysed on firm level international business performance, measured by internationalisation degree and by financial and marketing measures. The framework was tested on a transversal sample consisting of 140 Romanian internationalised SMEs. Findings are especially useful for entrepreneurs and SME managers presenting various decisional possibilities and options on internationalisation behaviours and performance. These emphasize the importance of internationalisation scope, pace, object and opportunity seeking, along with positive influence on performance, indifferent

  8. Non-Hierarchical Clustering as a method to analyse an open-ended ...

    African Journals Online (AJOL)

    Apple

    Keywords: algebraic thinking; cluster analysis; mathematics education; quantitative analysis. Introduction. Extensive ..... C1, C2 and C3 represent the three centroids of the three clusters formed. .... 6ALd. All these strategies are algebraic and 'high- ... 1995), of the didactical aspects related to teaching .... Brazil, 18-23 July.

  9. Information Clustering Based on Fuzzy Multisets.

    Science.gov (United States)

    Miyamoto, Sadaaki

    2003-01-01

    Proposes a fuzzy multiset model for information clustering with application to information retrieval on the World Wide Web. Highlights include search engines; term clustering; document clustering; algorithms for calculating cluster centers; theoretical properties concerning clustering algorithms; and examples to show how the algorithms work.…

  10. Parallel Density-Based Clustering for Discovery of Ionospheric Phenomena

    Science.gov (United States)

    Pankratius, V.; Gowanlock, M.; Blair, D. M.

    2015-12-01

    Ionospheric total electron content maps derived from global networks of dual-frequency GPS receivers can reveal a plethora of ionospheric features in real-time and are key to space weather studies and natural hazard monitoring. However, growing data volumes from expanding sensor networks are making manual exploratory studies challenging. As the community is heading towards Big Data ionospheric science, automation and Computer-Aided Discovery become indispensable tools for scientists. One problem of machine learning methods is that they require domain-specific adaptations in order to be effective and useful for scientists. Addressing this problem, our Computer-Aided Discovery approach allows scientists to express various physical models as well as perturbation ranges for parameters. The search space is explored through an automated system and parallel processing of batched workloads, which finds corresponding matches and similarities in empirical data. We discuss density-based clustering as a particular method we employ in this process. Specifically, we adapt Density-Based Spatial Clustering of Applications with Noise (DBSCAN). This algorithm groups geospatial data points based on density. Clusters of points can be of arbitrary shape, and the number of clusters is not predetermined by the algorithm; only two input parameters need to be specified: (1) a distance threshold, (2) a minimum number of points within that threshold. We discuss an implementation of DBSCAN for batched workloads that is amenable to parallelization on manycore architectures such as Intel's Xeon Phi accelerator with 60+ general-purpose cores. This manycore parallelization can cluster large volumes of ionospheric total electron content data quickly. Potential applications for cluster detection include the visualization, tracing, and examination of traveling ionospheric disturbances or other propagating phenomena. Acknowledgments. We acknowledge support from NSF ACI-1442997 (PI V. Pankratius).
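
    The two DBSCAN parameters described above (a distance threshold and a minimum number of points within it) can be illustrated with a minimal serial sketch. This is the textbook algorithm with a brute-force neighbour search, not the batched manycore implementation the abstract refers to; the names `dbscan`, `region_query`, `eps` and `min_pts` are chosen here for illustration.

```python
from math import dist  # Euclidean distance, Python 3.8+

NOISE = -1  # label for points that belong to no cluster


def region_query(points, i, eps):
    """Return indices of all points within eps of points[i] (i included)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Label every point with a cluster id (0, 1, ...) or NOISE."""
    labels = [None] * len(points)
    next_id = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already assigned to a cluster or marked as noise
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        # points[i] is a core point: start a new cluster and grow it.
        labels[i] = next_id
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = next_id  # border point: join, but do not expand
                continue
            if labels[j] is not None:
                continue
            labels[j] = next_id
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is also a core point
        next_id += 1
    return labels


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
    print(dbscan(pts, eps=1.5, min_pts=3))
```

    Note that the number of clusters is never passed in; it emerges from the density of the data, and the isolated point is reported as noise.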

  11. Voxel-based clustered imaging by multiparameter diffusion tensor images for glioma grading.

    Science.gov (United States)

    Inano, Rika; Oishi, Naoya; Kunieda, Takeharu; Arakawa, Yoshiki; Yamao, Yukihiro; Shibata, Sumiya; Kikuchi, Takayuki; Fukuyama, Hidenao; Miyamoto, Susumu

    2014-01-01

    Gliomas are the most common intra-axial primary brain tumour; therefore, predicting glioma grade would influence therapeutic strategies. Although several methods based on single or multiple parameters from diagnostic images exist, a definitive method for pre-operatively determining glioma grade remains unknown. We aimed to develop an unsupervised method using multiple parameters from pre-operative diffusion tensor images for obtaining a clustered image that could enable visual grading of gliomas. Fourteen patients with low-grade gliomas and 19 with high-grade gliomas underwent diffusion tensor imaging and three-dimensional T1-weighted magnetic resonance imaging before tumour resection. Seven features including diffusion-weighted imaging, fractional anisotropy, first eigenvalue, second eigenvalue, third eigenvalue, mean diffusivity and raw T2 signal with no diffusion weighting, were extracted as multiple parameters from diffusion tensor imaging. We developed a two-level clustering approach for a self-organizing map followed by the K-means algorithm to enable unsupervised clustering of a large number of input vectors with the seven features for the whole brain. The vectors were grouped by the self-organizing map as protoclusters, which were classified into the smaller number of clusters by K-means to make a voxel-based diffusion tensor-based clustered image. Furthermore, we also determined if the diffusion tensor-based clustered image was really helpful for predicting pre-operative glioma grade in a supervised manner. The ratio of each class in the diffusion tensor-based clustered images was calculated from the regions of interest manually traced on the diffusion tensor imaging space, and the common logarithmic ratio scales were calculated. We then applied support vector machine as a classifier for distinguishing between low- and high-grade gliomas. Consequently, the sensitivity, specificity, accuracy and area under the curve of receiver operating characteristic

  12. Tracing common origins of Genomic Islands in prokaryotes based on genome signature analyses.

    Science.gov (United States)

    van Passel, Mark Wj

    2011-09-01

    Horizontal gene transfer constitutes a powerful and innovative force in evolution, but often little is known about the actual origins of transferred genes. Sequence alignments are generally of limited use in tracking the original donor, since still only a small fraction of the total genetic diversity is thought to be uncovered. Alternatively, approaches based on similarities in the genome-specific relative oligonucleotide frequencies do not require alignments. Even though the exact origins of horizontally transferred genes may still not be established using these compositional analyses, they do suggest that compositionally very similar regions are likely to have had a common origin. These analyses have shown that up to a third of large acquired gene clusters that reside in the same genome are compositionally very similar, indicative of a shared origin. This brings us closer to uncovering the original donors of horizontally transferred genes, and could help in elucidating possible regulatory interactions between previously unlinked sequences.
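
    An alignment-free comparison of the kind described above can be sketched with one common compositional measure, the dinucleotide relative abundance (observed dinucleotide frequency divided by the product of its two mononucleotide frequencies). The abstract does not state which oligonucleotide length or distance was used, so the choice of dinucleotides, the mean-absolute-difference distance, and the function names below are assumptions; strand symmetrization is omitted for brevity.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"
DINUCS = ["".join(p) for p in product(BASES, repeat=2)]


def dinucleotide_signature(seq):
    """Map each dinucleotide XY to f_XY / (f_X * f_Y) for one sequence."""
    seq = seq.upper()
    mono = Counter(seq)
    di = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    n_mono = sum(mono[b] for b in BASES)
    n_di = sum(di[d] for d in DINUCS)
    if n_mono == 0 or n_di == 0:
        raise ValueError("sequence needs at least two adjacent A/C/G/T bases")
    signature = {}
    for d in DINUCS:
        fx = mono[d[0]] / n_mono
        fy = mono[d[1]] / n_mono
        fxy = di[d] / n_di
        # Undefined ratios (a base that never occurs) are set to 0.
        signature[d] = fxy / (fx * fy) if fx and fy else 0.0
    return signature


def signature_distance(seq_a, seq_b):
    """Mean absolute difference between two dinucleotide signatures."""
    sa = dinucleotide_signature(seq_a)
    sb = dinucleotide_signature(seq_b)
    return sum(abs(sa[d] - sb[d]) for d in DINUCS) / len(DINUCS)


if __name__ == "__main__":
    island = "ACGTACGTACGT"
    other = "AAAACCCCGGGGTTTT"
    print(signature_distance(island, island))  # identical composition
    print(signature_distance(island, other))   # differing composition
```

    A small distance between two regions of the same genome would be read, in the spirit of the abstract, as a hint of shared origin rather than proof of it.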

  13. An Adaptive Sweep-Circle Spatial Clustering Algorithm Based on Gestalt

    Directory of Open Access Journals (Sweden)

    Qingming Zhan

    2017-08-01

    An adaptive spatial clustering (ASC) algorithm is proposed in the present study, which employs sweep-circle techniques and a dynamic threshold setting based on the Gestalt theory to detect spatial clusters. The proposed algorithm can automatically discover clusters in one pass, rather than through the modification of the initial model (for example, a minimal spanning tree, Delaunay triangulation, or Voronoi diagram). It can quickly identify arbitrarily-shaped clusters while adapting efficiently to non-homogeneous density characteristics of spatial data, without the need for prior knowledge or parameters. The proposed algorithm is also ideal for use in data streaming technology with dynamic characteristics flowing in the form of spatial clustering in large data sets.

  14. Cross-layer cluster-based energy-efficient protocol for wireless sensor networks.

    Science.gov (United States)

    Mammu, Aboobeker Sidhik Koyamparambil; Hernandez-Jayo, Unai; Sainz, Nekane; de la Iglesia, Idoia

    2015-04-09

    Recent developments in electronics and wireless communications have enabled the improvement of low-power and low-cost wireless sensor networks (WSNs). One of the most important challenges in WSNs is to increase the network lifetime due to the limited energy capacity of the network nodes. Another major challenge in WSNs is the hot spots that emerge as locations under heavy traffic load. Nodes in such areas quickly drain energy resources, leading to disconnection in network services. In such an environment, cross-layer cluster-based energy-efficient algorithms (CCBE) can prolong the network lifetime and energy efficiency. CCBE is based on clustering the nodes to different hexagonal structures. A hexagonal cluster consists of cluster members (CMs) and a cluster head (CH). The CHs are selected from the CMs based on nodes near the optimal CH distance and the residual energy of the nodes. Additionally, the optimal CH distance that links to optimal energy consumption is derived. To balance the energy consumption and the traffic load in the network, the CHs are rotated among all CMs. In WSNs, energy is mostly consumed during transmission and reception. Transmission collisions can further decrease the energy efficiency. These collisions can be avoided by using a contention-free protocol during the transmission period. Additionally, the CH allocates slots to the CMs based on their residual energy to increase sleep time. Furthermore, the energy consumption of CH can be further reduced by data aggregation. In this paper, we propose a data aggregation level based on the residual energy of CH and a cost-aware decision scheme for the fusion of data. Performance results show that the CCBE scheme performs better in terms of network lifetime, energy consumption and throughput compared to low-energy adaptive clustering hierarchy (LEACH) and hybrid energy-efficient distributed clustering (HEED).

  15. Management of Energy Consumption on Cluster Based Routing Protocol for MANET

    Science.gov (United States)

    Hosseini-Seno, Seyed-Amin; Wan, Tat-Chee; Budiarto, Rahmat; Yamada, Masashi

    The usage of light-weight mobile devices is increasing rapidly, leading to demand for more telecommunication services. Consequently, mobile ad hoc networks and their applications have become feasible with the proliferation of light-weight mobile devices. Many protocols have been developed to handle service discovery and routing in ad hoc networks. However, the majority of them did not consider one critical aspect of this type of network, which is the limited energy available in each node. Cluster Based Routing Protocol (CBRP) is a robust/scalable routing protocol for Mobile Ad hoc Networks (MANETs) and superior to existing protocols such as Ad hoc On-demand Distance Vector (AODV) in terms of throughput and overhead. Therefore, based on this strength, methods to increase the efficiency of energy usage are incorporated into CBRP in this work. In order to increase the stability (in terms of lifetime) of the network and to decrease the energy consumption of inter-cluster gateway nodes, an Enhanced Gateway Cluster Based Routing Protocol (EGCBRP) is proposed. Three methods have been introduced by EGCBRP as enhancements to the CBRP: improving the election of cluster heads (CHs) in CBRP, which is based on the maximum available energy level; implementing load balancing for inter-cluster traffic using multiple gateways; and implementing a sleep state for gateway nodes to further save energy. Furthermore, we propose an Energy Efficient Cluster Based Routing Protocol (EECBRP) which extends the EGCBRP sleep state concept to all idle member nodes, excluding the active nodes in all clusters. The experiment results show that the EGCBRP decreases the overall energy consumption of the gateway nodes by up to 10% and the EECBRP reduces the energy consumption of the member nodes by up to 60%, both of which in turn contribute to stabilizing the network.

  16. Cross-Layer Cluster-Based Energy-Efficient Protocol for Wireless Sensor Networks

    Directory of Open Access Journals (Sweden)

    Aboobeker Sidhik Koyamparambil Mammu

    2015-04-01

    Recent developments in electronics and wireless communications have enabled the improvement of low-power and low-cost wireless sensor networks (WSNs). One of the most important challenges in WSNs is to increase the network lifetime due to the limited energy capacity of the network nodes. Another major challenge in WSNs is the hot spots that emerge as locations under heavy traffic load. Nodes in such areas quickly drain energy resources, leading to disconnection in network services. In such an environment, cross-layer cluster-based energy-efficient algorithms (CCBE) can prolong the network lifetime and energy efficiency. CCBE is based on clustering the nodes to different hexagonal structures. A hexagonal cluster consists of cluster members (CMs) and a cluster head (CH). The CHs are selected from the CMs based on nodes near the optimal CH distance and the residual energy of the nodes. Additionally, the optimal CH distance that links to optimal energy consumption is derived. To balance the energy consumption and the traffic load in the network, the CHs are rotated among all CMs. In WSNs, energy is mostly consumed during transmission and reception. Transmission collisions can further decrease the energy efficiency. These collisions can be avoided by using a contention-free protocol during the transmission period. Additionally, the CH allocates slots to the CMs based on their residual energy to increase sleep time. Furthermore, the energy consumption of CH can be further reduced by data aggregation. In this paper, we propose a data aggregation level based on the residual energy of CH and a cost-aware decision scheme for the fusion of data. Performance results show that the CCBE scheme performs better in terms of network lifetime, energy consumption and throughput compared to low-energy adaptive clustering hierarchy (LEACH) and hybrid energy-efficient distributed clustering (HEED).

  17. Construction and application of Red5 cluster based on OpenStack

    Science.gov (United States)

    Wang, Jiaqing; Song, Jianxin

    2017-08-01

    With the application and development of cloud computing technology in various fields, the resource utilization rate of data centers has improved markedly, and systems built on cloud computing platforms have also gained in scalability and stability. In the traditional approach, Red5 cluster resource utilization is low and system stability is poor. This paper uses the efficient resource allocation capability of cloud computing to build a Red5 server cluster based on OpenStack. Multimedia applications can be published to the Red5 cloud server cluster. The system not only achieves flexible provisioning of computing resources, but also greatly improves the stability and service efficiency of the cluster.

  18. Distributed Similarity based Clustering and Compressed Forwarding for wireless sensor networks.

    Science.gov (United States)

    Arunraja, Muruganantham; Malathi, Veluchamy; Sakthivel, Erulappan

    2015-11-01

    Wireless sensor networks are engaged in various data gathering applications. The major bottleneck in wireless data gathering systems is the finite energy of sensor nodes. By conserving the on-board energy, the life span of a wireless sensor network can be considerably extended. Since data communication is the dominant energy-consuming activity of a wireless sensor network, data reduction can serve better in conserving the nodal energy. Spatial and temporal correlation among the sensor data is exploited to reduce the data communications. Data-similar cluster formation is an effective way to exploit spatial correlation among the neighboring sensors. Sending only a subset of data and estimating the rest from this subset is the contemporary way of exploiting temporal correlation. In Distributed Similarity based Clustering and Compressed Forwarding for wireless sensor networks, we construct data-similar iso-clusters with minimal communication overhead. The intra-cluster communication is reduced using an adaptive-normalized least mean squares based dual prediction framework. The cluster head reduces the inter-cluster data payload using a lossless compressive forwarding technique. The proposed work achieves significant data reduction in both the intra-cluster and the inter-cluster communications, with the optimal data accuracy of collected data. Copyright © 2015 ISA. Published by Elsevier Ltd. All rights reserved.
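
    The idea of a dual prediction framework can be sketched as follows: the sensor and the cluster head run the same normalized least mean squares (NLMS) predictor, and the sensor transmits a reading only when the shared prediction misses it by more than a tolerance. This is a minimal single-process simulation under assumptions made here (filter order, step size `mu`, the fixed `threshold` rule, and all names); the abstract does not give these details.

```python
def nlms_dual_prediction(readings, order=3, mu=0.5, threshold=0.5, eps=1e-6):
    """Simulate sensor-side suppression with a shared NLMS predictor.

    Returns (reconstructed, sent): the series as the cluster head would
    see it, and the number of readings that had to be transmitted.
    """
    weights = [0.0] * order   # filter taps, identical on both sides
    history = [0.0] * order   # most recent values known to both sides
    reconstructed = []
    sent = 0
    for x in readings:
        prediction = sum(w * h for w, h in zip(weights, history))
        error = x - prediction
        if abs(error) > threshold:
            # Prediction too far off: transmit x. Both sides now know the
            # error, so both apply the same NLMS weight update.
            sent += 1
            norm = eps + sum(h * h for h in history)
            weights = [w + mu * error * h / norm
                       for w, h in zip(weights, history)]
            value = x
        else:
            # Prediction is good enough: nothing is sent, and both sides
            # use the predicted value in place of the reading.
            value = prediction
        reconstructed.append(value)
        history = [value] + history[:-1]
    return reconstructed, sent


if __name__ == "__main__":
    temps = [20.0] * 50
    recon, sent = nlms_dual_prediction(temps)
    print(f"transmitted {sent} of {len(temps)} readings")
```

    Because weights change only on transmitted samples, both ends stay synchronized without extra messages, and every reconstructed value is within `threshold` of the true reading by construction.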

  19. Multi scales based sparse matrix spectral clustering image segmentation

    Science.gov (United States)

    Liu, Zhongmin; Chen, Zhicai; Li, Zhanming; Hu, Wenjin

    2018-04-01

    In image segmentation, spectral clustering algorithms have to adopt an appropriate scaling parameter to calculate the similarity matrix between the pixels, which may have a great impact on the clustering result. Moreover, when the number of data instances is large, the computational complexity and memory use of the algorithm greatly increase. To solve these two problems, we propose a new spectral clustering image segmentation algorithm based on multiple scales and a sparse matrix. We first devise a new feature extraction method, then extract the features of the image on different scales, and finally use the feature information to construct a sparse similarity matrix, which improves the operation efficiency. Compared with the traditional spectral clustering algorithm, image segmentation experimental results show that our algorithm has better accuracy and robustness.

  20. Agent-based method for distributed clustering of textual information

    Science.gov (United States)

    Potok, Thomas E [Oak Ridge, TN; Reed, Joel W [Knoxville, TN; Elmore, Mark T [Oak Ridge, TN; Treadwell, Jim N [Louisville, TN

    2010-09-28

    A computer method and system for storing, retrieving and displaying information has a multiplexing agent (20) that calculates a new document vector (25) for a new document (21) to be added to the system and transmits the new document vector (25) to master cluster agents (22) and cluster agents (23) for evaluation. These agents (22, 23) perform the evaluation and return values upstream to the multiplexing agent (20) based on the similarity of the document to documents stored under their control. The multiplexing agent (20) then sends the document (21) and the document vector (25) to the master cluster agent (22), which then forwards it to a cluster agent (23) or creates a new cluster agent (23) to manage the document (21). The system also searches for stored documents according to a search query having at least one term and identifying the documents found in the search, and displays the documents in a clustering display (80) of similarity so as to indicate similarity of the documents to each other.
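
    The routing step described above (compute a document vector, ask each cluster agent how similar it is, then hand the document to the best match or create a new agent) can be sketched in a few lines. Everything concrete here is an assumption for illustration: term-frequency vectors, cosine similarity, the `min_similarity` cut-off, and the names `ClusterAgent` and `route`; the master cluster agent layer is collapsed into a plain list.

```python
from collections import Counter
from math import sqrt


def doc_vector(text):
    """Term-frequency vector of a document (whitespace tokenization)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ClusterAgent:
    """Holds the documents of one cluster and their summed term vector."""

    def __init__(self):
        self.docs = []
        self.centroid = Counter()

    def evaluate(self, vector):
        return cosine(vector, self.centroid)

    def add(self, text, vector):
        self.docs.append(text)
        self.centroid.update(vector)


def route(agents, text, min_similarity=0.3):
    """Send a new document to the most similar agent, or create one."""
    vector = doc_vector(text)
    best = max(agents, key=lambda agent: agent.evaluate(vector), default=None)
    if best is None or best.evaluate(vector) < min_similarity:
        best = ClusterAgent()
        agents.append(best)
    best.add(text, vector)
    return best


if __name__ == "__main__":
    agents = []
    for doc in ["solar wind plasma", "solar wind storm", "protein folding enzyme"]:
        route(agents, doc)
    print([agent.docs for agent in agents])
```

    In a distributed setting each `evaluate` call would run on a separate agent in parallel, which is the point of broadcasting the vector rather than the document itself.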

  1. A novel clustering algorithm based on quantum games

    International Nuclear Information System (INIS)

    Li Qiang; He Yan; Jiang Jingping

    2009-01-01

    Enormous successes have been made by quantum algorithms during the last decade. In this paper, we combine the quantum game with the problem of data clustering, and then develop a quantum-game-based clustering algorithm, in which data points in a dataset are considered as players who can make decisions and implement quantum strategies in quantum games. After each round of a quantum game, each player's expected payoff is calculated. Later, he uses a link-removing-and-rewiring (LRR) function to change his neighbors and adjust the strength of links connecting to them in order to maximize his payoff. Further, algorithms are discussed and analyzed in two cases of strategies, two payoff matrixes and two LRR functions. Consequently, the simulation results have demonstrated that data points in datasets are clustered reasonably and efficiently, and the clustering algorithms have fast rates of convergence. Moreover, the comparison with other algorithms also provides an indication of the effectiveness of the proposed approach.

  2. Green Clustering Implementation Based on DPS-MOPSO

    Directory of Open Access Journals (Sweden)

    Yang Lu

    2014-01-01

    A green clustering implementation is proposed as the first method in the framework of an energy-efficient strategy for centralized enterprise high-density WLANs. Traditionally, to maintain the network coverage, all of the APs within the WLAN have to be powered on. Nevertheless, the new algorithm can power off a large proportion of APs while maintaining the same coverage as the always-on counterpart. The proposed algorithm is composed of two parallel and concurrent procedures, which are the faster procedure based on K-means and the more accurate procedure based on Dynamic Population Size Multiple Objective Particle Swarm Optimization (DPS-MOPSO). To implement green clustering efficiently and accurately, dynamic population size and mutational operators are introduced as complements for the classical MOPSO. In addition to the function of AP selection, the new green clustering algorithm has another new function as the reference and guidance for AP deployment. This paper also presents simulations in scenarios modeled with the ray-tracing method and the FDTD technique, and the results show that from about 67% up to 90% of energy consumption can be saved while the original network coverage is maintained during periods when few users are online or when the traffic load is low.

  3. A quasiparticle-based multi-reference coupled-cluster method.

    Science.gov (United States)

    Rolik, Zoltán; Kállay, Mihály

    2014-10-07

    The purpose of this paper is to introduce a quasiparticle-based multi-reference coupled-cluster (MRCC) approach. The quasiparticles are introduced via a unitary transformation which allows us to represent a complete active space reference function and other elements of an orthonormal multi-reference (MR) basis in a determinant-like form. The quasiparticle creation and annihilation operators satisfy the fermion anti-commutation relations. On the basis of these quasiparticles, a generalization of the normal-ordered operator products for the MR case can be introduced as an alternative to the approach of Mukherjee and Kutzelnigg [Recent Prog. Many-Body Theor. 4, 127 (1995); Mukherjee and Kutzelnigg, J. Chem. Phys. 107, 432 (1997)]. Based on the new normal ordering any quasiparticle-based theory can be formulated using the well-known diagram techniques. Beyond the general quasiparticle framework we also present a possible realization of the unitary transformation. The suggested transformation has an exponential form where the parameters, holding exclusively active indices, are defined in a form similar to the wave operator of the unitary coupled-cluster approach. The definition of our quasiparticle-based MRCC approach strictly follows the form of the single-reference coupled-cluster method and retains several of its beneficial properties. Test results for small systems are presented using a pilot implementation of the new approach and compared to those obtained by other MR methods.

  4. Formal And Informal Macro-Regional Transport Clusters As A Primary Step In The Design And Implementation Of Cluster-Based Strategies

    Directory of Open Access Journals (Sweden)

    Nežerenko Olga

    2015-09-01

    The aim of the study is the identification of a formal macro-regional transport and logistics cluster and its development trends on a macro-regional level in 2007-2011 by means of hierarchical cluster analysis. The central approach of the study is based on two concepts: (1) the concept of formal and informal macro-regions, and (2) the concept of clustering, which is based on the similarities shared by the countries of a macro-region and tightly related to the concept of macro-region. The authors seek to answer the question whether the formation of a formal transport cluster could provide the BSR a stable competitive position in the global transportation and logistics market.

  5. Intracluster age gradients in numerous young stellar clusters

    Science.gov (United States)

    Getman, K. V.; Feigelson, E. D.; Kuhn, M. A.; Bate, M. R.; Broos, P. S.; Garmire, G. P.

    2018-05-01

    The pace and pattern of star formation leading to rich young stellar clusters is quite uncertain. In this context, we analyse the spatial distribution of ages within 19 young (median t ≲ 3 Myr on the Siess et al. time-scale), morphologically simple, isolated, and relatively rich stellar clusters. Our analysis is based on young stellar object (YSO) samples from the Massive Young Star-Forming Complex Study in Infrared and X-ray and Star Formation in Nearby Clouds surveys, and a new estimator of pre-main sequence (PMS) stellar ages, AgeJX, derived from X-ray and near-infrared photometric data. Median cluster ages are computed within four annular subregions of the clusters. We confirm and extend the earlier result of Getman et al. (2014): 80 per cent of the clusters show age trends where stars in cluster cores are younger than in outer regions. Our cluster stacking analyses establish the existence of an age gradient to high statistical significance in several ways. Time-scales vary with the choice of PMS evolutionary model; the inferred median age gradient across the studied clusters ranges from 0.75 to 1.5 Myr pc⁻¹. The empirical finding reported in the present study - late or continuing formation of stars in the cores of star clusters with older stars dispersed in the outer regions - has a strong foundation with other observational studies and with the astrophysical models like the global hierarchical collapse model of Vázquez-Semadeni et al.

  6. Adaptive density trajectory cluster based on time and space distance

    Science.gov (United States)

    Liu, Fagui; Zhang, Zhijie

    2017-10-01

    Several open problems remain in trajectory clustering for discovering regularities in mobile behavior, such as computing the distance between sub-trajectories, setting parameter values in the clustering algorithm, and the uncertainty/boundary problem of the data set. This paper therefore defines a method for calculating the distance between sub-trajectories based on time and space. The purpose of this distance calculation is to clearly reveal the differences between moving trajectories and to improve the accuracy of the clustering algorithm. In addition, a novel adaptive density trajectory clustering algorithm is proposed, in which the cluster radius is computed from the density of the data distribution. Cluster centers and their number are selected automatically by a certain strategy, and the uncertainty/boundary problem of the data set is addressed by a purpose-designed weighted rough c-means. Experimental results demonstrate that the proposed algorithm can perform fuzzy trajectory clustering effectively on the basis of the time and space distance, and can adaptively obtain the optimal cluster centers and rich clustering information for mining the features of mobile behavior in mobile and social networks.

  7. Analysing the spatial patterns of livestock anthrax in Kazakhstan in relation to environmental factors: a comparison of local (Gi*) and morphology cluster statistics

    Directory of Open Access Journals (Sweden)

    Ian T. Kracalik

    2012-11-01

    We compared a local clustering and a cluster morphology statistic using anthrax outbreaks in large (cattle) and small (sheep and goats) domestic ruminants across Kazakhstan. The Getis-Ord (Gi*) statistic and a multidirectional optimal ecotope algorithm (AMOEBA) were compared using 1st, 2nd and 3rd order Rook contiguity matrices. Multivariate statistical tests were used to evaluate the environmental signatures between clusters and non-clusters from the AMOEBA and Gi* tests. A logistic regression was used to define a risk surface for anthrax outbreaks and to compare agreement between clustering methodologies. Tests revealed differences in the spatial distribution of clusters as well as the total number of clusters in large ruminants for AMOEBA (n = 149) and for small ruminants (n = 9). In contrast, Gi* revealed fewer large ruminant clusters (n = 122) and more small ruminant clusters (n = 61). Significant environmental differences were found between groups using the Kruskal-Wallis and Mann-Whitney U tests. Logistic regression was used to model the presence/absence of anthrax outbreaks and define a risk surface for large ruminants to compare with cluster analyses. The model predicted 32.2% of the landscape as high risk. Approximately 75% of AMOEBA clusters corresponded to predicted high risk, compared with ~64% of Gi* clusters. In general, AMOEBA predicted more irregularly shaped clusters of outbreaks in both livestock groups, while Gi* tended to predict larger, circular clusters. Here we provide an evaluation of both tests and a discussion of the use of each to detect environmental conditions associated with anthrax outbreak clusters in domestic livestock. These findings illustrate important differences in spatial statistical methods for defining local clusters and highlight the importance of selecting appropriate levels of data aggregation.
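
    For readers unfamiliar with the local statistic named above, the following is a minimal sketch of the standard Getis-Ord Gi* z-score with binary weights that include the focal cell. The one-dimensional chain of cells, the outbreak counts and the neighbour lists are made up for illustration; the study itself uses Rook contiguity matrices on real outbreak data, and nothing here reproduces its results.

```python
import math


def getis_ord_gi_star(values, neighbours):
    """Gi* z-scores with binary weights; each cell counts as its own neighbour."""
    n = len(values)
    mean = sum(values) / n
    # Global standard deviation (population form used by Gi*).
    # Assumes the values are not all identical, otherwise s is zero.
    s = math.sqrt(sum(v * v for v in values) / n - mean * mean)
    scores = []
    for i in range(n):
        members = set(neighbours[i]) | {i}
        w_sum = len(members)  # sum of binary weights (equals sum of squared weights)
        local = sum(values[j] for j in members)
        denom = s * math.sqrt((n * w_sum - w_sum ** 2) / (n - 1))
        scores.append((local - mean * w_sum) / denom)
    return scores


# Hypothetical outbreak counts along a chain of 8 cells; the last three are "hot".
values = [1, 1, 1, 1, 1, 9, 10, 9]
neighbours = [[j for j in (i - 1, i + 1) if 0 <= j < len(values)]
              for i in range(len(values))]
scores = getis_ord_gi_star(values, neighbours)
```

    Cells with large positive scores would be flagged as hot spots and cells with large negative scores as cold spots; the contiguity order controls how wide each neighbourhood is.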

  8. Personalized PageRank Clustering: A graph clustering algorithm based on random walks

    Science.gov (United States)

    A. Tabrizi, Shayan; Shakery, Azadeh; Asadpour, Masoud; Abbasi, Maziar; Tavallaie, Mohammad Ali

    2013-11-01

    Graph clustering has been an essential part in many methods and thus its accuracy has a significant effect on many applications. In addition, exponential growth of real-world graphs such as social networks, biological networks and electrical circuits demands clustering algorithms with nearly-linear time and space complexity. In this paper we propose Personalized PageRank Clustering (PPC) that employs the inherent cluster exploratory property of random walks to reveal the clusters of a given graph. We combine random walks and modularity to precisely and efficiently reveal the clusters of a graph. PPC is a top-down algorithm so it can reveal inherent clusters of a graph more accurately than other nearly-linear approaches that are mainly bottom-up. It also gives a hierarchy of clusters that is useful in many applications. PPC has a linear time and space complexity and has been superior to most of the available clustering algorithms on many datasets. Furthermore, its top-down approach makes it a flexible solution for clustering problems with different requirements.
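
    The random-walk ingredient mentioned above can be illustrated with a plain personalized PageRank computed by power iteration. This is only a sketch of that building block, not the PPC algorithm itself (which also uses modularity and a top-down hierarchy); the toy graph, the restart probability `alpha` and the function name are assumptions made for the example.

```python
def personalized_pagerank(adj, seed, alpha=0.15, iters=100):
    """Power iteration for PageRank personalized to a single seed node.

    adj   -- dict mapping each node to a list of its neighbours
    seed  -- node the random walk restarts from
    alpha -- restart probability (assumed value, not taken from the paper)
    """
    rank = {node: 0.0 for node in adj}
    rank[seed] = 1.0
    for _ in range(iters):
        nxt = {node: 0.0 for node in adj}
        for node, nbrs in adj.items():
            if not nbrs:
                # Dangling node: send its walking mass back to the seed.
                nxt[seed] += (1 - alpha) * rank[node]
                continue
            share = (1 - alpha) * rank[node] / len(nbrs)
            for nb in nbrs:
                nxt[nb] += share
        nxt[seed] += alpha  # restart mass (total rank always sums to 1)
        rank = nxt
    return rank


# Two triangles joined by the single edge 2-3.
graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
         3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
r = personalized_pagerank(graph, seed=0)
```

    In this toy graph the walk started at node 0 keeps most of its probability mass inside the triangle {0, 1, 2}, which is the cluster-exploring behaviour that random-walk clustering methods rely on.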

  9. Novel Clustering Method Based on K-Medoids and Mobility Metric

    Directory of Open Access Journals (Sweden)

    Y. Hamzaoui

    2018-06-01

    The structure and constraints of MANETs negatively affect QoS performance; moreover, the main routing protocols proposed generally operate with flat routing. This structure therefore gives poor QoS results when the network becomes larger and denser. To solve this problem, we use one of the most popular methods, clustering. The present paper falls within the framework of research to improve QoS in MANETs. In this paper, we propose a new clustering algorithm based on a new mobility metric and K-Medoids to distribute the nodes into several clusters. Intuitively, our algorithm can give good results in terms of cluster stability and can also extend the lifetime of the cluster head.

  10. Clustering economies based on multiple criteria decision making techniques

    Directory of Open Access Journals (Sweden)

    Mansour Momeni

    2011-10-01

    One of the primary concerns of many countries is to determine the important factors affecting economic growth. In this paper, we study factors such as unemployment rate, inflation rate, population growth, average annual income, etc. to cluster different countries. The proposed model uses the analytical hierarchy process (AHP) to prioritize the criteria and then uses a K-means technique to cluster 59 countries into four groups based on the ranked criteria. The first group includes countries with high standards such as Germany and Japan. In the second cluster, there are some developing countries with relatively good economic growth such as Saudi Arabia and Iran. The third cluster belongs to countries with faster rates of growth compared with the countries located in the second group such as China, India and Mexico. Finally, the fourth cluster includes countries with relatively very low rates of growth such as Jordan, Mali, Niger, etc.
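
    The AHP prioritization step can be sketched as follows: criterion weights are taken as the normalized principal eigenvector of a pairwise comparison matrix, approximated here by power iteration. The three criteria and the comparison values are hypothetical and are not taken from the paper.

```python
def ahp_weights(pairwise, iters=100):
    """Approximate the AHP priority vector (principal eigenvector) by power iteration."""
    n = len(pairwise)
    w = [1.0 / n] * n
    for _ in range(iters):
        nxt = [sum(pairwise[i][j] * w[j] for j in range(n)) for i in range(n)]
        total = sum(nxt)
        w = [v / total for v in nxt]  # keep the weights summing to 1
    return w


# Hypothetical judgments for three criteria, e.g. unemployment, inflation, income.
# Entry [i][j] states how much more important criterion i is than criterion j.
pairwise = [
    [1.0, 2.0, 6.0],
    [1 / 2, 1.0, 3.0],
    [1 / 6, 1 / 3, 1.0],
]
weights = ahp_weights(pairwise)  # approximately [0.6, 0.3, 0.1] for this consistent matrix
```

    One plausible way to combine the two steps, assumed here rather than stated in the abstract, is to scale each standardized criterion by its weight before running K-means, so that higher-priority criteria contribute more to the distance between countries.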

  11. A New Swarm Intelligence Approach for Clustering Based on Krill Herd with Elitism Strategy

    Directory of Open Access Journals (Sweden)

    Zhi-Yong Li

    2015-10-01

    As one of the most popular and well-recognized clustering methods, the fuzzy C-means (FCM) clustering algorithm is the basis of other fuzzy clustering analysis methods in both theory and application. However, the FCM algorithm is essentially a local search optimization algorithm, so it may sometimes fail to find the global optimum. To overcome this disadvantage of the FCM algorithm, a new version of the krill herd (KH) algorithm with an elitism strategy, called KHE, is proposed to solve the clustering problem. The elitism strategy has a strong ability to prevent the krill population from degrading. In addition, well-selected parameters are used in the KHE method instead of parameters originating from nature. Through an array of simulation experiments, the results show that KHE is indeed a good choice for solving general benchmark problems and fuzzy clustering analyses.
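
    To make the "local search" remark concrete, here is a minimal sketch of the standard FCM alternating updates on one-dimensional data. It is the baseline FCM only, not the KHE method; the data, the initial centers and the fuzzifier `m = 2` are assumptions chosen for illustration.

```python
def fcm_1d(points, centers, m=2.0, iters=50):
    """Fuzzy C-means on 1-D data; returns (centers, memberships)."""
    power = 2.0 / (m - 1.0)
    memberships = []
    for _ in range(iters):
        # Membership update: u_ik = 1 / sum_j (d_ik / d_jk)^(2 / (m - 1)).
        memberships = []
        for x in points:
            dists = [abs(x - c) for c in centers]
            if 0.0 in dists:
                # The point sits exactly on a center: full membership there.
                hit = dists.index(0.0)
                row = [1.0 if i == hit else 0.0 for i in range(len(centers))]
            else:
                row = [1.0 / sum((d_i / d_j) ** power for d_j in dists)
                       for d_i in dists]
            memberships.append(row)
        # Center update: mean of the points weighted by u^m.
        centers = [
            sum(u[i] ** m * x for u, x in zip(memberships, points))
            / sum(u[i] ** m for u in memberships)
            for i in range(len(centers))
        ]
    return centers, memberships


points = [0.8, 1.0, 1.2, 4.8, 5.0, 5.2]
centers, memberships = fcm_1d(points, centers=[0.0, 6.0])
```

    Because each iteration only improves on the current centers, the outcome depends on where the centers start; that dependence is what a global search such as a krill-herd variant is meant to reduce.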

  12. Thermodynamically accessible titanium clusters Ti_N, N = 2-32.

    Science.gov (United States)

    Lazauskas, Tomas; Sokol, Alexey A; Buckeridge, John; Catlow, C Richard A; Escher, Susanne G E T; Farrow, Matthew R; Mora-Fonz, David; Blum, Volker W; Phaahla, Tshegofatso M; Chauke, Hasani R; Ngoepe, Phuti E; Woodley, Scott M

    2018-05-10

    We have performed a genetic algorithm search on the tight-binding interatomic potential energy surface (PES) for small Ti_N (N = 2-32) clusters. The low energy candidate clusters were further refined using density functional theory (DFT) calculations with the PBEsol exchange-correlation functional and evaluated with the PBEsol0 hybrid functional. The resulting clusters were analysed in terms of their structural features, growth mechanism and surface area. The results suggest a growth mechanism that is based on forming coordination centres by interpenetrating icosahedra, icositetrahedra and Frank-Kasper polyhedra. We identify centres of coordination, which act as centres of bulk nucleation in medium sized clusters and determine the morphological features of the cluster.

  13. Unsupervised active learning based on hierarchical graph-theoretic clustering.

    Science.gov (United States)

    Hu, Weiming; Hu, Wei; Xie, Nianhua; Maybank, Steve

    2009-10-01

    Most existing active learning approaches are supervised. Supervised active learning has the following problems: inefficiency in dealing with the semantic gap between the distribution of samples in the feature space and their labels, lack of ability in selecting new samples that belong to new categories that have not yet appeared in the training samples, and lack of adaptability to changes in the semantic interpretation of sample categories. To tackle these problems, we propose an unsupervised active learning framework based on hierarchical graph-theoretic clustering. In the framework, two promising graph-theoretic clustering algorithms, namely, dominant-set clustering and spectral clustering, are combined in a hierarchical fashion. Our framework has some advantages, such as ease of implementation, flexibility in architecture, and adaptability to changes in the labeling. Evaluations on data sets for network intrusion detection, image classification, and video classification have demonstrated that our active learning framework can effectively reduce the workload of manual classification while maintaining a high accuracy of automatic classification. It is shown that, overall, our framework outperforms the support-vector-machine-based supervised active learning, particularly in terms of dealing much more efficiently with new samples whose categories have not yet appeared in the training samples.

  14. A robust approach based on Weibull distribution for clustering gene expression data

    Directory of Open Access Journals (Sweden)

    Gong Binsheng

    2011-05-01

    Background: Clustering is a widely used technique for the analysis of gene expression data. Most clustering methods group genes based on distances, while few methods group genes according to the similarities of the distributions of the gene expression levels. Furthermore, as biological annotation resources have accumulated, an increasing number of genes have been annotated into functional categories. As a result, evaluating the performance of clustering methods in terms of the functional consistency of the resulting clusters is of great interest. Results: In this paper, we propose the WDCM (Weibull Distribution-based Clustering Method), a robust approach for clustering gene expression data, in which the expression values of individual genes are treated as random variables following unique Weibull distributions. Our WDCM is based on the concept that genes with similar expression profiles have similar distribution parameters, and thus the genes are clustered via the Weibull distribution parameters. We used the WDCM to cluster three cancer gene expression data sets from lung cancer, B-cell follicular lymphoma and bladder carcinoma and obtained well-clustered results. We compared the performance of WDCM with k-means and Self Organizing Map (SOM) using functional annotation information given by the Gene Ontology (GO). The results showed that the functional annotation ratios of WDCM are higher than those of the other methods. We also used the external measure Adjusted Rand Index to validate the performance of the WDCM. The comparative results demonstrate that the WDCM provides better clustering performance than the k-means and SOM algorithms. The merit of the proposed WDCM is that it can be applied to cluster incomplete gene expression data without imputing the missing values. Moreover, the robustness of WDCM is also evaluated on the incomplete data sets. Conclusions: The results demonstrate that our WDCM produces clusters

  15. A Cluster-based Approach Towards Detecting and Modeling Network Dictionary Attacks

    Directory of Open Access Journals (Sweden)

    A. Tajari Siahmarzkooh

    2016-12-01

    In this paper, we provide an approach to detect network dictionary attacks using a data set collected as flows, from which a clustered graph is derived. These flows provide an aggregated view of the network traffic in which the packets exchanged in the network are considered, so that more internally connected nodes are clustered together. We show that dictionary attacks can be detected through parameters such as the number and the weight of clusters in time series and their evolution over time. Additionally, a Markov model based on the average weight of clusters is created. Finally, by means of our suggested model, we demonstrate that artificial clusters of the flows are created for normal and malicious traffic. The results of the proposed approach on the CAIDA 2007 data set suggest a high accuracy for the model and, therefore, it provides a proper method for detecting dictionary attacks.

  16. PROSPECTS OF THE REGIONAL INTEGRATION POLICY BASED ON CLUSTER FORMATION

    Directory of Open Access Journals (Sweden)

    Elena Tsepilova

    2018-01-01

    The purpose of this article is to develop the theoretical foundations of regional integration policy and to determine its prospects on the basis of cluster formation. The authors use research methods such as systematization, comparative and complex analysis, synthesis, and the statistical method. Within the framework of the research, the concept of regional integration policy is specified, and its integration core, the cluster, is identified. The authors work out an algorithm of regional clustering, which will ensure the growth of the economy and tax income. Measures are proposed to optimize the organizational mechanism of interaction between the participants of the territorial cluster and the authorities, which make it possible to ensure the effective functioning of clusters, including taxation clusters. Based on the results of studying the existing methods for assessing the effectiveness of cluster policy, the authors propose their own approach to evaluating the consequences of implementing the regional integration policy, according to which the list of quantitative and qualitative indicators is defined. The present article systematizes the experience and results of the cluster policy of certain European countries, which made it possible to determine the prospects and synergetic effect of the development of clusters as an integration foundation of regional policy in the Russian Federation. The authors analyze the activity of cluster formations using the example of the Rostov region, a leader in creating conditions for cluster policy development in the Southern Federal District. Eleven clusters and cluster initiatives are developing in this region. As a result, the authors propose measures to support the existing clusters and to create new ones.

  17. TRUSTWORTHY OPTIMIZED CLUSTERING BASED TARGET DETECTION AND TRACKING FOR WIRELESS SENSOR NETWORK

    Directory of Open Access Journals (Sweden)

    C. Jehan

    2016-06-01

    In this paper, an efficient approach is proposed to address the problem of target tracking in a wireless sensor network (WSN). The problem is tackled using an adaptive dynamic clustering scheme for tracking the target. It is a specific problem in object tracking. The proposed adaptive dynamic clustering target tracking scheme uses three steps for target tracking. The first step deals with the identification of clusters and cluster heads using OGSAFCM. Here, kernel fuzzy c-means (KFCM) and the gravitational search algorithm (GSA) are combined to create clusters. At first, the oppositional gravitational search algorithm (OGSA) is used to optimize the initial clustering center, and then the KFCM algorithm is used to guide the classification and the cluster formation process. In the OGSA, the concept of opposition-based population initialization is introduced into the basic GSA to improve the convergence profile. The identified clusters are changed dynamically. The second step deals with the data transmission to the cluster heads. The third step deals with the transmission of aggregated data to the base station as well as the detection of the target. The experimental results show that the proposed scheme identifies the target effectively and efficiently. As a result, the tracking error is minimized.

  18. Dynamic Characteristics Analysis and Stabilization of PV-Based Multiple Microgrid Clusters

    DEFF Research Database (Denmark)

    Zhao, Zhuoli; Yang, Ping; Wang, Yuewu

    2018-01-01

    As the penetration of PV generation increases, there is a growing operational demand on PV systems to participate in microgrid frequency regulation. It is expected that future distribution systems will consist of multiple microgrid clusters. However, interconnecting PV microgrids may lead to system... -based multiple microgrid clusters. A detailed small-signal model for PV-based microgrid clusters considering local adaptive dynamic droop control mechanism of the voltage-source PV system is developed. The complete dynamic model is then used to assess and compare the dynamic characteristics of the single... microgrid and interconnected microgrids. In order to enhance system stability of the PV microgrid clusters, a tie-line flow and stabilization strategy is proposed to suppress the introduced interarea and local oscillations. Robustly selecting of the key control parameters is transformed to a multiobjective...

  19. A novel approach to dynamic livelihood clustering

    DEFF Research Database (Denmark)

    Walelign, Solomon Zena; Pouliot, Mariéve; Larsen, Helle Overgaard

    -wave panel dataset from 427 households in three locations of Nepal, we proposed an approach that combines households’ income and assets to identify different livelihood strategy clusters. Based on a Latent Markov Model we identify seven distinct livelihood strategies and analyse households’ movements between...

  20. An improved initialization center k-means clustering algorithm based on distance and density

    Science.gov (United States)

    Duan, Yanling; Liu, Qun; Xia, Shuyin

    2018-04-01

    To address the problem that the k-means algorithm chooses its initial cluster centers at random, so that clustering results are affected by outlier samples and are unstable across repeated runs, a center initialization method based on larger distance and higher density is proposed. The reciprocal of the weighted average distance is used to represent the sample density, and the data samples with larger distance and higher density are selected as the initial cluster centers to optimize the clustering results. A clustering evaluation method based on distance and density is then designed to verify the feasibility and practicality of the algorithm. Experimental results on UCI data sets show that the algorithm has a certain stability and practicality.
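
    One possible reading of this initialization idea is sketched below. Density is taken as the reciprocal of a point's plain (unweighted) average distance to the other points, and each further center maximizes density times distance to the nearest center already chosen. Both choices, the function name and the sample points are assumptions for illustration; the paper's exact weighting and selection rule are not given in the abstract.

```python
import math


def init_centers(points, k):
    """Pick k initial centers that are both dense and far from each other."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # Density: reciprocal of the average distance to all other points.
    # Assumes the points are not all identical, otherwise a row sum is zero.
    density = [(n - 1) / sum(row) for row in dist]
    chosen = [max(range(n), key=lambda i: density[i])]
    while len(chosen) < k:
        def score(i):
            # High density and far from every center chosen so far.
            return density[i] * min(dist[i][c] for c in chosen)
        candidates = [i for i in range(n) if i not in chosen]
        chosen.append(max(candidates, key=score))
    return [points[i] for i in chosen]


# Two tight groups plus one isolated outlier at (30, 0).
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (30, 0)]
centers = init_centers(points, k=2)
```

    In this example the outlier is far from everything but has low density, so it is not selected; a purely distance-based rule would be more likely to pick it.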

  1. Water quality assessment with hierarchical cluster analysis based on Mahalanobis distance.

    Science.gov (United States)

    Du, Xiangjun; Shao, Fengjing; Wu, Shunyao; Zhang, Hanlin; Xu, Si

    2017-07-01

    Water quality assessment is crucial for assessment of marine eutrophication, prediction of harmful algal blooms, and environment protection. Previous studies have developed many numeric modeling methods and data driven approaches for water quality assessment. The cluster analysis, an approach widely used for grouping data, has also been employed. However, there are complex correlations between water quality variables, which play important roles in water quality assessment but have always been overlooked. In this paper, we analyze correlations between water quality variables and propose an alternative method for water quality assessment with hierarchical cluster analysis based on Mahalanobis distance. Further, we cluster water quality data collected from coastal water of the Bohai Sea and North Yellow Sea of China, and apply the clustering results to evaluate its water quality. To evaluate the validity, we also cluster the water quality data with cluster analysis based on Euclidean distance, which is widely adopted by previous studies. The results show that our method is more suitable for water quality assessment with many correlated water quality variables. To our knowledge, it is the first attempt to apply Mahalanobis distance for coastal water quality assessment.
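
    The effect of correlated variables on the distance measure can be seen in a small two-variable sketch. The five samples stand for two hypothetical, strongly correlated water quality variables; they are invented for illustration and are not data from the study.

```python
import math


def covariance_2d(data):
    """Sample covariance terms (sxx, sxy, syy) of a list of (x, y) pairs."""
    n = len(data)
    mx = sum(x for x, _ in data) / n
    my = sum(y for _, y in data) / n
    sxx = sum((x - mx) ** 2 for x, _ in data) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in data) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in data) / (n - 1)
    return sxx, sxy, syy


def mahalanobis_2d(p, q, cov):
    """Mahalanobis distance between p and q for a 2x2 covariance matrix."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = p[0] - q[0], p[1] - q[1]
    # Quadratic form d^T S^{-1} d with the 2x2 inverse written out.
    return math.sqrt((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)


samples = [(1, 1.1), (2, 1.9), (3, 3.2), (4, 3.8), (5, 5.1)]
cov = covariance_2d(samples)

origin = (3, 3)
along = (4, 4)    # moves with the correlation between the two variables
against = (4, 2)  # moves against it

d_along = mahalanobis_2d(origin, along, cov)
d_against = mahalanobis_2d(origin, against, cov)
```

    Both test points are equally far from the origin in Euclidean terms, yet the point that breaks the correlation pattern is much farther in Mahalanobis terms. A hierarchical clustering built on such distances therefore treats departures from the usual joint behaviour of the variables as large differences.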

  2. Inhomogeneity of epidemic spreading with entropy-based infected clusters.

    Science.gov (United States)

    Wen-Jie, Zhou; Xing-Yuan, Wang

    2013-12-01

    Considering the difference in the sizes of the infected clusters in the dynamic complex networks, the normalized entropy based on infected clusters (δ*) is proposed to characterize the inhomogeneity of epidemic spreading. δ* gives information on the variability of the infected clusters in the system. We investigate the variation in the inhomogeneity of the distribution of the epidemic with the absolute velocity v of moving agent, the infection density ρ, and the interaction radius r. By comparing δ* in the dynamic networks with δH* in homogeneous mode, the simulation experiments show that the inhomogeneity of epidemic spreading becomes smaller with the increase of v, ρ, r.
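
    The abstract does not give the formula for δ*. As an illustration only, the sketch below uses one common normalization, the Shannon entropy of the cluster-size shares divided by the logarithm of the number of clusters; this is an assumption and may differ from the paper's definition.

```python
import math


def normalized_size_entropy(sizes):
    """Shannon entropy of cluster-size shares, scaled to the range [0, 1].

    sizes -- sizes of the (non-empty) infected clusters
    """
    if len(sizes) < 2:
        return 0.0
    total = sum(sizes)
    shares = [s / total for s in sizes if s > 0]
    entropy = -sum(p * math.log(p) for p in shares)
    return entropy / math.log(len(sizes))


even = normalized_size_entropy([5, 5, 5, 5])     # clusters of equal size
skewed = normalized_size_entropy([17, 1, 1, 1])  # one dominant cluster
```

    Under this assumed definition, equal-sized clusters give a value of 1 and a single dominant cluster pushes the value towards 0, so the measure summarizes how unevenly the infection is spread over clusters.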

  3. Ionized-cluster source based on high-pressure corona discharge

    International Nuclear Information System (INIS)

    Lokuliyanage, K.; Huber, D.; Zappa, F.; Scheier, P.

    2006-01-01

    Full text: It has been demonstrated that energetic beams of large clusters, with thousands of atoms, can be a powerful tool for surface modification. Normally, ionized cluster beams are obtained by electron impact on neutral beams produced in a supersonic expansion. At the University of Innsbruck we are pursuing the realization of a high-current cluster ion source based on the corona discharge. The idea in the present case is that the ionization should occur prior to the supersonic expansion, thus superseding the need for subsequent electron impact. In this contribution we present the project of our source in its initial stage. The intensity distribution of cluster sizes as a function of the source parameters, such as input pressure, temperature and gap voltage, is investigated with the aid of a custom-built time-of-flight mass spectrometer. (author)

  4. COMPARISON AND EVALUATION OF CLUSTER BASED IMAGE SEGMENTATION TECHNIQUES

    OpenAIRE

    Hetangi D. Mehta*, Daxa Vekariya, Pratixa Badelia

    2017-01-01

    Image segmentation is the classification of an image into different groups. Numerous algorithms using different approaches have been proposed for image segmentation. A major challenge in segmentation evaluation comes from the fundamental conflict between generality and objectivity. A review is done on different types of clustering methods used for image segmentation. Also a methodology is proposed to classify and quantify different clustering algorithms based on their consistency in different...

  5. A Coupled User Clustering Algorithm Based on Mixed Data for Web-Based Learning Systems

    Directory of Open Access Journals (Sweden)

    Ke Niu

    2015-01-01

    In traditional Web-based learning systems, a few user clustering algorithms have been introduced to address insufficient analysis of learning behaviors and insufficient personalized study guidance. When analyzing behaviors with these algorithms, researchers generally focus on continuous data but easily neglect discrete data, both of which are generated from online learning actions. Moreover, there are implicit coupled interactions among the data, but these are frequently ignored in the introduced algorithms. Therefore, a mass of significant information that can positively affect clustering accuracy is neglected. To solve the above issues, we propose a coupled user clustering algorithm for Web-based learning systems that takes into account both discrete and continuous data, as well as intracoupled and intercoupled interactions of the data. The experimental results in this paper demonstrate the superior performance of the proposed algorithm.

  6. Model-based Clustering of Categorical Time Series with Multinomial Logit Classification

    Science.gov (United States)

    Frühwirth-Schnatter, Sylvia; Pamminger, Christoph; Winter-Ebmer, Rudolf; Weber, Andrea

    2010-09-01

    A common problem in many areas of applied statistics is to identify groups of similar time series in a panel of time series. However, distance-based clustering methods cannot easily be extended to time series data, where an appropriate distance-measure is rather difficult to define, particularly for discrete-valued time series. Markov chain clustering, proposed by Pamminger and Frühwirth-Schnatter [6], is an approach for clustering discrete-valued time series obtained by observing a categorical variable with several states. This model-based clustering method is based on finite mixtures of first-order time-homogeneous Markov chain models. In order to further explain group membership we present an extension to the approach of Pamminger and Frühwirth-Schnatter [6] by formulating a probabilistic model for the latent group indicators within the Bayesian classification rule by using a multinomial logit model. The parameters are estimated for a fixed number of clusters within a Bayesian framework using a Markov chain Monte Carlo (MCMC) sampling scheme representing a (full) Gibbs-type sampler which involves only draws from standard distributions. Finally, an application to a panel of Austrian wage mobility data is presented which leads to an interesting segmentation of the Austrian labour market.
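
    The building block of such a mixture, a first-order time-homogeneous Markov chain for a categorical series, can be sketched as follows. The code only estimates a transition matrix from counts and scores a new series against two candidate groups; it is not the authors' Gibbs sampler or multinomial logit extension. The series, the pseudo-count `prior` and the function names are assumptions made for illustration.

```python
import math


def transition_matrix(series, n_states, prior=1.0):
    """Row-normalized transition counts; `prior` is an assumed pseudo-count."""
    counts = [[prior] * n_states for _ in range(n_states)]
    for a, b in zip(series, series[1:]):
        counts[a][b] += 1
    return [[c / sum(row) for c in row] for row in counts]


def log_likelihood(series, matrix):
    """Log-probability of the observed transitions under a transition matrix."""
    return sum(math.log(matrix[a][b]) for a, b in zip(series, series[1:]))


# Two hypothetical groups: one that tends to stay in its state, one that alternates.
sticky = transition_matrix([0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0], n_states=2)
flipping = transition_matrix([0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1], n_states=2)

new_series = [1, 1, 1, 0, 0, 0]
ll_sticky = log_likelihood(new_series, sticky)
ll_flipping = log_likelihood(new_series, flipping)
```

    Comparing the two log-likelihoods shows which group-specific chain explains the new series better, which is the kind of evidence a mixture model weighs when allocating a series to a cluster.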

  7. Predictor-Year Subspace Clustering Based Ensemble Prediction of Indian Summer Monsoon

    Directory of Open Access Journals (Sweden)

    Moumita Saha

    2016-01-01

    Forecasting the Indian summer monsoon is a challenging task due to its complex and nonlinear behavior. A large number of global climatic variables, with interaction patterns that vary over the years, influence the monsoon. Various statistical and neural prediction models have been proposed for forecasting the monsoon, but many of them fail to capture variability over years. The skill of the predictor variables of the monsoon also evolves over time. In this article, we propose a joint clustering of monsoon years and predictors for understanding and predicting the monsoon. This is achieved by a subspace clustering algorithm. It groups the years based on the prevailing global climatic condition using a statistical clustering technique, and subsequently, for each such group, it identifies significant climatic predictor variables which assist in better prediction. A prediction model is designed for each individual cluster using a random forest of regression trees. Prediction of the aggregate and regional monsoon is attempted. A mean absolute error of 5.2% is obtained for forecasting the aggregate Indian summer monsoon. Errors in predicting the regional monsoons are also comparable, given the high variation of regional precipitation. The proposed joint-clustering-based ensemble model is observed to be superior to existing monsoon prediction models, and it also surpasses general non-clustering-based prediction models.

  8. Coordinate-Based Clustering Method for Indoor Fingerprinting Localization in Dense Cluttered Environments

    Directory of Open Access Journals (Sweden)

    Wen Liu

    2016-12-01

    Indoor positioning technologies have boomed recently because of the growing commercial interest in indoor location-based services (ILBS). Due to the absence of satellite signals from the Global Navigation Satellite System (GNSS), various technologies have been proposed for indoor applications. Among them, Wi-Fi fingerprinting has been attracting much interest from researchers because of its pervasive deployment, flexibility and robustness to dense cluttered indoor environments. One challenge, however, is the deployment of Access Points (AP), which has a significant influence on the system positioning accuracy. This paper concentrates on WLAN-based fingerprinting indoor location by analyzing the influence of AP deployment and studying the advantages of coordinate-based clustering compared to traditional RSS-based clustering. A coordinate-based clustering method for indoor fingerprinting location, named Smallest-Enclosing-Circle-based (SEC), is then proposed, aiming at reducing the positioning error caused by the AP deployment and improving robustness to dense cluttered environments. All measurements are conducted in indoor public areas, such as the National Center For the Performing Arts (as Test-bed 1) and the XiDan Joy City (Floors 1 and 2, as Test-bed 2), and results show that the SEC clustering algorithm can improve system positioning accuracy by about 32.7% for Test-bed 1, 71.7% for Test-bed 2 Floor 1 and 73.7% for Test-bed 2 Floor 2 compared with traditional RSS-based clustering algorithms such as K-means.

  9. Coordinate-Based Clustering Method for Indoor Fingerprinting Localization in Dense Cluttered Environments.

    Science.gov (United States)

    Liu, Wen; Fu, Xiao; Deng, Zhongliang

    2016-12-02

    Indoor positioning technologies have boomed recently because of the growing commercial interest in indoor location-based services (ILBS). Due to the absence of satellite signals from the Global Navigation Satellite System (GNSS), various technologies have been proposed for indoor applications. Among them, Wi-Fi fingerprinting has been attracting much interest from researchers because of its pervasive deployment, flexibility and robustness to dense cluttered indoor environments. One challenge, however, is the deployment of Access Points (AP), which has a significant influence on the system positioning accuracy. This paper concentrates on WLAN-based fingerprinting indoor location by analyzing the influence of AP deployment and studying the advantages of coordinate-based clustering compared to traditional RSS-based clustering. A coordinate-based clustering method for indoor fingerprinting location, named Smallest-Enclosing-Circle-based (SEC), is then proposed, aiming at reducing the positioning error caused by the AP deployment and improving robustness to dense cluttered environments. All measurements are conducted in indoor public areas, such as the National Center For the Performing Arts (as Test-bed 1) and the XiDan Joy City (Floors 1 and 2, as Test-bed 2), and results show that the SEC clustering algorithm can improve system positioning accuracy by about 32.7% for Test-bed 1, 71.7% for Test-bed 2 Floor 1 and 73.7% for Test-bed 2 Floor 2 compared with traditional RSS-based clustering algorithms such as K-means.

  10. Comparison of Skin Moisturizer: Consumer-Based Brand Equity (CBBE) Factors in Clusters Based on Consumer Ethnocentrism

    Directory of Open Access Journals (Sweden)

    Yossy Hanna Garlina

    2014-09-01

    Full Text Available This research aims to analyze relevant factors contributing to the four dimensions of consumer-based brand equity in the skin moisturizer industry. It then clusters female consumers of skin moisturizer based on ethnocentrism and compares each cluster’s consumer-based brand equity dimensions towards a domestic skin moisturizer brand, Mustika Ratu. The research used a descriptive survey method. Primary data was obtained through questionnaire distribution to 70 female respondents for factor analysis and 120 female respondents for cluster analysis and one-way analysis of variance (ANOVA). This research employed factor analysis to obtain relevant factors contributing to the five dimensions of consumer-based brand equity in the skin moisturizer industry. Cluster analysis and one-way ANOVA were used to examine the difference in consumer-based brand equity between highly ethnocentric and low ethnocentric consumers towards the same domestic brand, Mustika Ratu skin moisturizer. In all individual dimension analyses, variable means and individual means differ distinctly between the high ethnocentric and the low ethnocentric consumers. The low ethnocentric consumer cluster tends to be lower in mean score of Brand Loyalty, Perceived Quality, Brand Awareness, Brand Association, and Overall Brand Equity than the high ethnocentric consumer cluster. The research concludes that consumer ethnocentrism is positively correlated with preferences towards domestic products and negatively correlated with foreign-made product preference. Highly ethnocentric consumers therefore have a positive perception of domestic products.

  11. FRCA: A Fuzzy Relevance-Based Cluster Head Selection Algorithm for Wireless Mobile Ad-Hoc Sensor Networks

    Directory of Open Access Journals (Sweden)

    Taegwon Jeong

    2011-05-01

    Full Text Available Clustering is an important mechanism that efficiently provides information for mobile nodes and improves the processing capacity of routing, bandwidth allocation, and resource management and sharing. Clustering algorithms can be based on such criteria as the battery power of nodes, mobility, network size, distance, speed and direction. Above all, in order to achieve good clustering performance, overhead should be minimized, allowing mobile nodes to join and leave without perturbing the membership of the cluster while preserving current cluster structure as much as possible. This paper proposes a Fuzzy Relevance-based Cluster head selection Algorithm (FRCA) to solve problems found in existing wireless mobile ad hoc sensor networks, such as the node distribution found in dynamic properties due to mobility and flat structures and disturbance of the cluster formation. The proposed mechanism uses fuzzy relevance to select the cluster head for clustering in wireless mobile ad hoc sensor networks. In the simulation implemented on the NS-2 simulator, the proposed FRCA is compared with algorithms such as the Cluster-based Routing Protocol (CBRP), the Weighted-based Adaptive Clustering Algorithm (WACA), and the Scenario-based Clustering Algorithm for Mobile ad hoc networks (SCAM). The simulation results showed that the proposed FRCA achieves better performance than that of the other existing mechanisms.

  12. FRCA: a fuzzy relevance-based cluster head selection algorithm for wireless mobile ad-hoc sensor networks.

    Science.gov (United States)

    Lee, Chongdeuk; Jeong, Taegwon

    2011-01-01

    Clustering is an important mechanism that efficiently provides information for mobile nodes and improves the processing capacity of routing, bandwidth allocation, and resource management and sharing. Clustering algorithms can be based on such criteria as the battery power of nodes, mobility, network size, distance, speed and direction. Above all, in order to achieve good clustering performance, overhead should be minimized, allowing mobile nodes to join and leave without perturbing the membership of the cluster while preserving current cluster structure as much as possible. This paper proposes a Fuzzy Relevance-based Cluster head selection Algorithm (FRCA) to solve problems found in existing wireless mobile ad hoc sensor networks, such as the node distribution found in dynamic properties due to mobility and flat structures and disturbance of the cluster formation. The proposed mechanism uses fuzzy relevance to select the cluster head for clustering in wireless mobile ad hoc sensor networks. In the simulation implemented on the NS-2 simulator, the proposed FRCA is compared with algorithms such as the Cluster-based Routing Protocol (CBRP), the Weighted-based Adaptive Clustering Algorithm (WACA), and the Scenario-based Clustering Algorithm for Mobile ad hoc networks (SCAM). The simulation results showed that the proposed FRCA achieves better performance than that of the other existing mechanisms.

  13. Cluster Decline and Resilience

    DEFF Research Database (Denmark)

    Østergaard, Christian Richter; Park, Eun Kyung

    Most studies on regional clusters focus on identifying factors and processes that make clusters grow. However, sometimes technologies and market conditions suddenly shift, and clusters decline. This paper analyses the process of decline of the wireless communication cluster in Denmark, 1963-2011. Our longitudinal study reveals that technological lock-in and exit of key firms have contributed to impairment of the cluster’s resilience in adapting to disruptions. Entrepreneurship has a positive effect on cluster resilience, while multinational companies have contradicting effects by bringing in new resources to the cluster but being quick to withdraw in times of crisis.

  14. AES based secure low energy adaptive clustering hierarchy for WSNs

    Science.gov (United States)

    Kishore, K. R.; Sarma, N. V. S. N.

    2013-01-01

    Wireless sensor networks (WSNs) provide a low cost solution in diversified application areas. The wireless sensor nodes are inexpensive tiny devices with limited storage, computational capability and power. They are being deployed in large scale in both military and civilian applications. Security of the data is one of the key concerns where large numbers of nodes are deployed. Here, an energy-efficient secure routing protocol, secure-LEACH (Low Energy Adaptive Clustering Hierarchy), based on the Advanced Encryption Standard (AES), is proposed for WSNs. This cryptosystem is session based, and a new session key is assigned for each new session. The network (WSN) is divided into a number of groups or clusters, and a cluster head (CH) is selected among the member nodes of each cluster. The measured data from the nodes is aggregated by the respective CHs, and then each CH relays this data to another CH towards the gateway node in the WSN, which in turn sends it to the Base station (BS). In order to maintain confidentiality of data while being transmitted, it is necessary to encrypt the data before sending at every hop, from a node to the CH and from the CH to another CH or to the gateway node.

  15. Genome-based comparative analyses of Antarctic and temperate species of Paenibacillus.

    Directory of Open Access Journals (Sweden)

    Melissa Dsouza

    Full Text Available Antarctic soils represent a unique environment characterised by extremes of temperature, salinity, elevated UV radiation, low nutrient and low water content. Despite the harshness of this environment, members of 15 bacterial phyla have been identified in soils of the Ross Sea Region (RSR). However, the survival mechanisms and ecological roles of these phyla are largely unknown. The aim of this study was to investigate whether strains of Paenibacillus darwinianus owe their resilience to substantial genomic changes. For this, genome-based comparative analyses were performed on three P. darwinianus strains, isolated from gamma-irradiated RSR soils, together with nine temperate, soil-dwelling Paenibacillus spp. The genome of each strain was sequenced to over 1,000-fold coverage, then assembled into contigs totalling approximately 3 Mbp per genome. Based on the occurrence of essential, single-copy genes, genome completeness was estimated at approximately 88%. Genome analysis revealed between 3,043-3,091 protein-coding sequences (CDSs), primarily associated with two-component systems, sigma factors, transporters, sporulation and genes induced by cold-shock, oxidative and osmotic stresses. These comparative analyses provide an insight into the metabolic potential of P. darwinianus, revealing potential adaptive mechanisms for survival in Antarctic soils. However, a large proportion of these mechanisms were also identified in temperate Paenibacillus spp., suggesting that these mechanisms are beneficial for growth and survival in a range of soil environments. These analyses have also revealed that the P. darwinianus genomes contain significantly fewer CDSs and have a lower paralogous content. Notwithstanding the incompleteness of the assemblies, the large differences in genome sizes, determined by the number of genes in paralogous clusters and the CDS content, are indicative of genome content scaling. Finally, these sequences are a resource for further

  16. The association between mood state and chronobiological characteristics in bipolar I disorder: a naturalistic, variable cluster analysis-based study.

    Science.gov (United States)

    Gonzalez, Robert; Suppes, Trisha; Zeitzer, Jamie; McClung, Colleen; Tamminga, Carol; Tohen, Mauricio; Forero, Angelica; Dwivedi, Alok; Alvarado, Andres

    2018-02-19

    Multiple types of chronobiological disturbances have been reported in bipolar disorder, including characteristics associated with general activity levels, sleep, and rhythmicity. Previous studies have focused on examining the individual relationships between affective state and chronobiological characteristics. The aim of this study was to conduct a variable cluster analysis in order to ascertain how mood states are associated with chronobiological traits in bipolar I disorder (BDI). We hypothesized that manic symptomatology would be associated with disturbances of rhythm. Variable cluster analysis identified five chronobiological clusters in 105 BDI subjects. Cluster 1, comprising subjective sleep quality was associated with both mania and depression. Cluster 2, which comprised variables describing the degree of rhythmicity, was associated with mania. Significant associations between mood state and cluster analysis-identified chronobiological variables were noted. Disturbances of mood were associated with subjectively assessed sleep disturbances as opposed to objectively determined, actigraphy-based sleep variables. No associations with general activity variables were noted. Relationships between gender and medication classes in use and cluster analysis-identified chronobiological characteristics were noted. Exploratory analyses noted that medication class had a larger impact on these relationships than the number of psychiatric medications in use. In a BDI sample, variable cluster analysis was able to group related chronobiological variables. The results support our primary hypothesis that mood state, particularly mania, is associated with chronobiological disturbances. Further research is required in order to define these relationships and to determine the directionality of the associations between mood state and chronobiological characteristics.

  17. Feature Selection and Kernel Learning for Local Learning-Based Clustering.

    Science.gov (United States)

    Zeng, Hong; Cheung, Yiu-ming

    2011-08-01

    The performance of most clustering algorithms relies heavily on the representation of data in the input space or the Hilbert space of kernel methods. This paper aims to obtain an appropriate data representation through feature selection or kernel learning within the framework of the Local Learning-Based Clustering (LLC) (Wu and Schölkopf 2006) method, which can outperform the global learning-based ones when dealing with high-dimensional data lying on a manifold. Specifically, we associate a weight to each feature or kernel and incorporate it into the built-in regularization of the LLC algorithm to take into account the relevance of each feature or kernel for the clustering. Accordingly, the weights are estimated iteratively in the clustering process. We show that the resulting weighted regularization with an additional constraint on the weights is equivalent to a known sparsity-promoting penalty. Hence, the weights of the irrelevant features or kernels can be shrunk toward zero. Extensive experiments show the efficacy of the proposed methods on the benchmark data sets.

  18. Galaxy clusters in the SDSS Stripe 82 based on photometric redshifts

    International Nuclear Information System (INIS)

    Durret, F.; Adami, C.; Bertin, E.; Hao, J.; Márquez, I.

    2015-01-01

    Based on a recent photometric redshift galaxy catalogue, we have searched for galaxy clusters in the Stripe 82 region of the Sloan Digital Sky Survey by applying the Adami & MAzure Cluster FInder (AMACFI). Extensive tests were made to fine-tune the AMACFI parameters and make the cluster detection as reliable as possible. The same method was applied to the Millennium simulation to estimate our detection efficiency and the approximate masses of the detected clusters. Considering all the cluster galaxies (i.e. within a 1 Mpc radius of the cluster to which they belong and with a photoz differing by less than 0.05 from that of the cluster), we stacked clusters in various redshift bins to derive colour-magnitude diagrams and galaxy luminosity functions (GLFs). For each galaxy with absolute magnitude brighter than -19.0 in the r band, we computed the disk and spheroid components by applying SExtractor, and by stacking clusters we determined how the disk-to-spheroid flux ratio varies with cluster redshift and mass. We also detected 3663 clusters in the redshift range 0.15 < z < 0.70, with estimated mean masses between 10^13 and a few 10^14 solar masses. Furthermore, by stacking the cluster galaxies in various redshift bins, we find a clear red sequence in the (g'-r') versus r' colour-magnitude diagrams, and the GLFs are typical of clusters, though with a possible contamination from field galaxies. The morphological analysis of the cluster galaxies shows that the fraction of late-type to early-type galaxies shows an increase with redshift (particularly in high mass clusters) and a decrease with detection level, i.e. cluster mass. From the properties of the cluster galaxies, the majority of the candidate clusters detected here seem to be real clusters with typical cluster properties.
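
    The membership criterion quoted in the abstract (within 1 Mpc of the cluster and a photoz within 0.05 of the cluster's) can be sketched as a simple filter. This is only an illustration: the field names `r_mpc` and `photoz`, the function name, and the sample values are hypothetical, and the projected distance is assumed to be precomputed.

```python
def select_members(galaxies, cluster_z, max_radius_mpc=1.0, max_dz=0.05):
    """Keep galaxies inside the radius cut whose photoz is close to the cluster's."""
    return [
        g for g in galaxies
        if g["r_mpc"] <= max_radius_mpc and abs(g["photoz"] - cluster_z) < max_dz
    ]

# Hypothetical catalogue rows: projected distance to the cluster centre (Mpc)
# and photometric redshift of each galaxy.
galaxies = [
    {"id": "g1", "r_mpc": 0.3, "photoz": 0.31},
    {"id": "g2", "r_mpc": 1.4, "photoz": 0.30},  # fails the radius cut
    {"id": "g3", "r_mpc": 0.8, "photoz": 0.42},  # fails the redshift cut
    {"id": "g4", "r_mpc": 0.9, "photoz": 0.27},
]
members = [g["id"] for g in select_members(galaxies, cluster_z=0.30)]
```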

  19. Markov Chain Model-Based Optimal Cluster Heads Selection for Wireless Sensor Networks

    Directory of Open Access Journals (Sweden)

    Gulnaz Ahmed

    2017-02-01

    Full Text Available The longer network lifetime of Wireless Sensor Networks (WSNs) is a goal which is directly related to energy consumption. This energy consumption issue becomes more challenging when the energy load is not properly distributed in the sensing area. The hierarchal clustering architecture is the best choice for these kinds of issues. In this paper, we introduce a novel clustering protocol called Markov chain model-based optimal cluster heads (MOCHs) selection for WSNs. In our proposed model, we introduce a simple strategy for the optimal number of cluster heads selection to overcome the problem of uneven energy distribution in the network. The attractiveness of our model is that the BS controls the number of cluster heads while the cluster heads control the cluster members in each cluster in such a restricted manner that a uniform and even load is ensured in each cluster. We perform an extensive range of simulation using five quality measures, namely: the lifetime of the network, stable and unstable region in the lifetime of the network, throughput of the network, the number of cluster heads in the network, and the transmission time of the network to analyze the proposed model. We compare MOCHs against Sleep-awake Energy Efficient Distributed (SEED) clustering, Artificial Bee Colony (ABC), Zone Based Routing (ZBR), and Centralized Energy Efficient Clustering (CEEC) using the above-discussed quality metrics and found that the lifetime of the proposed model is almost 1095, 2630, 3599, and 2045 rounds (time steps) greater than SEED, ABC, ZBR, and CEEC, respectively. The obtained results demonstrate that the MOCHs is better than SEED, ABC, ZBR, and CEEC in terms of energy efficiency and the network throughput.

  20. Cluster-based localization and tracking in ubiquitous computing systems

    CERN Document Server

    Martínez-de Dios, José Ramiro; Torres-González, Arturo; Ollero, Anibal

    2017-01-01

    Localization and tracking are key functionalities in ubiquitous computing systems and techniques. In recent years a very high variety of approaches, sensors and techniques for indoor and GPS-denied environments have been developed. This book briefly summarizes the current state of the art in localization and tracking in ubiquitous computing systems focusing on cluster-based schemes. Additionally, existing techniques for measurement integration, node inclusion/exclusion and cluster head selection are also described in this book.

  1. Support Policies in Clusters: Prioritization of Support Needs by Cluster Members According to Cluster Life Cycle

    Directory of Open Access Journals (Sweden)

    Gulcin Salıngan

    2012-07-01

    Full Text Available Economic development has always been a moving target. Both national and local governments face the challenge of implementing effective and efficient economic policies and programs in order to best utilize their limited resources. One of the recent approaches in this area is called cluster-based economic analysis and strategy development. This study reviews key literature and some of the cluster-based economic policies adopted by different governments. Based on this review, it proposes “the cluster life cycle” as a determining factor to identify the support requirements of clusters. A survey, designed based on a literature review of international cluster support programs, was conducted with 30 participants from 3 clusters at different maturity stages. This paper discusses the results of this study, conducted among the cluster members in the Eskişehir-Bilecik-Kütahya Region in Turkey, on the support required to foster the development of related clusters.

  2. Clustering-based analysis for residential district heating data

    DEFF Research Database (Denmark)

    Gianniou, Panagiota; Liu, Xiufeng; Heller, Alfred

    2018-01-01

    The wide use of smart meters enables collection of a large amount of fine-granular time series, which can be used to improve the understanding of consumption behavior and used for consumption optimization. This paper presents a clustering-based knowledge discovery in databases method to analyze residential heating consumption data and evaluate information included in national building databases. The proposed method uses the K-means algorithm to segment consumption groups based on consumption intensity and representative patterns and ranks the groups according to daily consumption. This paper also... These findings will be valuable for district heating utilities and energy planners to optimize their operations, design demand-side management strategies, and develop targeting energy-efficiency programs or policies.
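
    As a rough illustration of segmenting by consumption intensity and ranking the groups by daily consumption, the sketch below runs plain K-means on one scalar per household. It is an assumption-laden toy, not the authors' pipeline: the sample values, the deterministic initialisation, and the function names are hypothetical, and the pattern-based segmentation mentioned in the abstract is not covered.

```python
from statistics import mean

def kmeans_1d(values, k, iters=50):
    """Plain Lloyd's K-means on scalar values with a deterministic start."""
    ordered = sorted(values)
    # Spread the initial centroids evenly across the sorted data.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assignment step: each value goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Update step: each centroid becomes the mean of its members.
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            updated.append(mean(members) if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids

def rank_groups(centroids):
    """Cluster indices ordered from highest to lowest mean daily consumption."""
    return sorted(range(len(centroids)), key=lambda c: centroids[c], reverse=True)

# Hypothetical mean daily heat consumption per household (kWh/day).
daily_kwh = [12.0, 14.5, 13.2, 41.0, 39.5, 43.1, 85.0, 90.2]
labels, centroids = kmeans_1d(daily_kwh, k=3)
ranking = rank_groups(centroids)
```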

  3. Effective Social Relationship Measurement and Cluster Based Routing in Mobile Opportunistic Networks.

    Science.gov (United States)

    Zeng, Feng; Zhao, Nan; Li, Wenjia

    2017-05-12

    In mobile opportunistic networks, the social relationship among nodes has an important impact on data transmission efficiency. Motivated by the strong share ability of "circles of friends" in communication networks such as Facebook, Twitter, Wechat and so on, we take a real-life example to show that social relationships among nodes consist of explicit and implicit parts. The explicit part comes from direct contact among nodes, and the implicit part can be measured through the "circles of friends". We present the definitions of explicit and implicit social relationships between two nodes, adaptive weights of explicit and implicit parts are given according to the contact feature of nodes, and the distributed mechanism is designed to construct the "circles of friends" of nodes, which is used for the calculation of the implicit part of social relationship between nodes. Based on effective measurement of social relationships, we propose a social-based clustering and routing scheme, in which each node selects the nodes with close social relationships to form a local cluster, and the self-control method is used to keep all cluster members always having close relationships with each other. A cluster-based message forwarding mechanism is designed for opportunistic routing, in which each node only forwards the copy of the message to nodes with the destination node as a member of the local cluster. Simulation results show that the proposed social-based clustering and routing outperforms the other classic routing algorithms.
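
    The forwarding rule described above (hand a copy only to nodes whose local cluster contains the destination) can be sketched as follows. The encounter table, node IDs, and function name are hypothetical; how the local clusters are built from explicit and implicit social relationships is not shown.

```python
def select_relays(encountered, destination):
    """Return IDs of encountered nodes whose local cluster contains the destination."""
    return [
        node_id
        for node_id, local_cluster in encountered.items()
        if destination in local_cluster
    ]

# Hypothetical encounter table: node ID -> members of that node's local cluster.
encountered = {
    "n1": {"n4", "n7"},
    "n2": {"n5", "n9"},
    "n3": {"n7", "n9"},
}
relays = select_relays(encountered, destination="n9")
```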

  4. A semantics-based method for clustering of Chinese web search results

    Science.gov (United States)

    Zhang, Hui; Wang, Deqing; Wang, Li; Bi, Zhuming; Chen, Yong

    2014-01-01

    Information explosion is a critical challenge to the development of modern information systems. In particular, when the application of an information system is over the Internet, the amount of information over the web has been increasing exponentially and rapidly. Search engines, such as Google and Baidu, are essential tools for people to find information on the Internet. Valuable information, however, is still likely submerged in the ocean of search results from those tools. By automatically clustering the results into different groups based on subjects, a search engine with the clustering feature allows users to select the most relevant results quickly. In this paper, we propose an online semantics-based method to cluster Chinese web search results. First, we employ the generalised suffix tree to extract the longest common substrings (LCSs) from search snippets. Second, we use the HowNet to calculate the similarities of the words derived from the LCSs, and extract the most representative features by constructing the vocabulary chain. Third, we construct a vector of text features and calculate snippets' semantic similarities. Finally, we improve the Chameleon algorithm to cluster snippets. Extensive experimental results have shown that the proposed algorithm outperforms the suffix tree clustering method and other traditional clustering methods.
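
    The first step, extracting longest common substrings from snippets, can be illustrated for a single pair of strings. Note the substitution: the abstract uses a generalised suffix tree, whereas this sketch uses the simpler dynamic-programming formulation, which gives the same answer for two strings but scales worse. The function name and sample snippets are hypothetical.

```python
def longest_common_substring(a: str, b: str) -> str:
    """Return one longest substring shared by a and b (dynamic programming)."""
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the common run that ended at the previous characters.
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return a[best_end - best_len:best_end]

# Two hypothetical search snippets sharing a four-character phrase.
shared = longest_common_substring("聚类搜索结果", "搜索结果聚类")
```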

  5. A clustering approach to segmenting users of internet-based risk calculators.

    Science.gov (United States)

    Harle, C A; Downs, J S; Padman, R

    2011-01-01

    Risk calculators are widely available Internet applications that deliver quantitative health risk estimates to consumers. Although these tools are known to have varying effects on risk perceptions, little is known about who will be more likely to accept objective risk estimates. To identify clusters of online health consumers that help explain variation in individual improvement in risk perceptions from web-based quantitative disease risk information. A secondary analysis was performed on data collected in a field experiment that measured people's pre-diabetes risk perceptions before and after visiting a realistic health promotion website that provided quantitative risk information. K-means clustering was performed on numerous candidate variable sets, and the different segmentations were evaluated based on between-cluster variation in risk perception improvement. Variation in responses to risk information was best explained by clustering on pre-intervention absolute pre-diabetes risk perceptions and an objective estimate of personal risk. Members of a high-risk overestimater cluster showed large improvements in their risk perceptions, but clusters of both moderate-risk and high-risk underestimaters were much more muted in improving their optimistically biased perceptions. Cluster analysis provided a unique approach for segmenting health consumers and predicting their acceptance of quantitative disease risk information. These clusters suggest that health consumers were very responsive to good news, but tended not to incorporate bad news into their self-perceptions much. These findings help to quantify variation among online health consumers and may inform the targeted marketing of and improvements to risk communication tools on the Internet.

  6. On Two Mixture-Based Clustering Approaches Used in Modeling an Insurance Portfolio

    Directory of Open Access Journals (Sweden)

    Tatjana Miljkovic

    2018-05-01

    Full Text Available We review two complementary mixture-based clustering approaches for modeling unobserved heterogeneity in an insurance portfolio: the generalized linear mixed cluster-weighted model (CWM) and mixture-based clustering for an ordered stereotype model (OSM). The latter is for modeling of ordinal variables, and the former is for modeling losses as a function of mixed-type of covariates. The article extends the idea of mixture modeling to a multivariate classification for the purpose of testing unobserved heterogeneity in an insurance portfolio. The application of both methods is illustrated on a well-known French automobile portfolio, in which the model fitting is performed using the expectation-maximization (EM) algorithm. Our findings show that these mixture-based clustering methods can be used to further test unobserved heterogeneity in an insurance portfolio and as such may be considered in insurance pricing, underwriting, and risk management.

  7. A novel artificial bee colony based clustering algorithm for categorical data.

    Science.gov (United States)

    Ji, Jinchao; Pang, Wei; Zheng, Yanlin; Wang, Zhe; Ma, Zhiqiang

    2015-01-01

    Data with categorical attributes are ubiquitous in the real world. However, existing partitional clustering algorithms for categorical data are prone to fall into local optima. To address this issue, in this paper we propose a novel clustering algorithm, ABC-K-Modes (Artificial Bee Colony clustering based on K-Modes), based on the traditional k-modes clustering algorithm and the artificial bee colony approach. In our approach, we first introduce a one-step k-modes procedure, and then integrate this procedure with the artificial bee colony approach to deal with categorical data. In the search process performed by scout bees, we adopt the multi-source search inspired by the idea of batch processing to accelerate the convergence of ABC-K-Modes. The performance of ABC-K-Modes is evaluated by a series of experiments in comparison with that of the other popular algorithms for categorical data.
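
    A single k-modes iteration, the building block the abstract refers to, might look like the sketch below: assign each record to the nearest mode under simple matching (Hamming) dissimilarity, then recompute each mode attribute by attribute. This assumes the standard k-modes update; the paper's exact one-step procedure and the bee-colony search around it are not reproduced, and the sample records are hypothetical.

```python
from collections import Counter

def hamming(x, y):
    """Simple matching dissimilarity: number of attributes that differ."""
    return sum(a != b for a, b in zip(x, y))

def one_step_k_modes(records, modes):
    """One assignment pass followed by one mode update."""
    labels = [
        min(range(len(modes)), key=lambda c: hamming(r, modes[c]))
        for r in records
    ]
    new_modes = []
    for c, old_mode in enumerate(modes):
        members = [r for r, lab in zip(records, labels) if lab == c]
        if not members:
            new_modes.append(old_mode)  # keep the old mode for an empty cluster
            continue
        # The new mode takes the most frequent category in every attribute.
        new_modes.append(
            tuple(Counter(column).most_common(1)[0][0] for column in zip(*members))
        )
    return labels, new_modes

# Hypothetical categorical records: (colour, size, shape).
records = [
    ("red", "small", "round"), ("red", "small", "square"), ("red", "large", "round"),
    ("blue", "large", "square"), ("blue", "large", "round"), ("blue", "small", "square"),
]
modes = [("red", "small", "round"), ("blue", "large", "square")]
labels, new_modes = one_step_k_modes(records, modes)
```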

  8. A cluster randomized trial in general practice with referral to a group-based or an Internet-based smoking cessation programme

    DEFF Research Database (Denmark)

    Pisinger, Charlotta; Jørgensen, Michael Milo; Møller, Niels Erik

    2010-01-01

    ... randomized to one of three groups: Group A, referral to group-based SC counselling (national model), n = 10; Group B, referral to internet-based SC programme (newly developed), n = 8; or Group C, no referral ('do as usual'), n = 6. A total of 1518/1914 smokers were included, and 760 returned a questionnaire at 1-year follow-up. RESULTS: The participating GPs reported significantly more SC counselling than GPs who refused participation (P = 0.04). Self-reported point abstinence was 6.7% (40/600), 5.9% (28/476) and 5.7% (25/442) in Groups A, B and C, respectively. Only 40 smokers attended group-based SC counselling, and 75 logged in at the internet-based SC programme. In cluster analyses, we found no significant additional effect of referral to group-based (OR: 1.05; 95% CI: 0.6-1.8) or internet-based SC programmes (OR: 0.91; 95% CI: 0.6-1.4). CONCLUSIONS: We found no additional effect on cessation rates...

  9. Beverages-Food Industry Cluster Development Based on Value Chain in Indonesia

    OpenAIRE

    Lasmono Tri Sunaryanto; Gatot Sasongko; Ira Yumastuti

    2014-01-01

    This study wants to develop the cluster-based food and beverage industry value chain that corresponds to the potential in the regions in Java Economic Corridor. Targeted research: a description of SME development strategies that have been implemented, composed, and can be applied to an SME cluster development strategy of food and beverage, as well as a proven implementation strategy of SME cluster development of food and beverage. To achieve these objectives, implemented descriptive methods, ...

  10. DIMM-SC: a Dirichlet mixture model for clustering droplet-based single cell transcriptomic data.

    Science.gov (United States)

    Sun, Zhe; Wang, Ting; Deng, Ke; Wang, Xiao-Feng; Lafyatis, Robert; Ding, Ying; Hu, Ming; Chen, Wei

    2018-01-01

    Single cell transcriptome sequencing (scRNA-Seq) has become a revolutionary tool to study cellular and molecular processes at single cell resolution. Among existing technologies, the recently developed droplet-based platform enables efficient parallel processing of thousands of single cells with direct counting of transcript copies using Unique Molecular Identifier (UMI). Despite the technology advances, statistical methods and computational tools are still lacking for analyzing droplet-based scRNA-Seq data. Particularly, model-based approaches for clustering large-scale single cell transcriptomic data are still under-explored. We developed DIMM-SC, a Dirichlet Mixture Model for clustering droplet-based Single Cell transcriptomic data. This approach explicitly models UMI count data from scRNA-Seq experiments and characterizes variations across different cell clusters via a Dirichlet mixture prior. We performed comprehensive simulations to evaluate DIMM-SC and compared it with existing clustering methods such as K-means, CellTree and Seurat. In addition, we analyzed public scRNA-Seq datasets with known cluster labels and in-house scRNA-Seq datasets from a study of systemic sclerosis with prior biological knowledge to benchmark and validate DIMM-SC. Both simulation studies and real data applications demonstrated that overall, DIMM-SC achieves substantially improved clustering accuracy and much lower clustering variability compared to other existing clustering methods. More importantly, as a model-based approach, DIMM-SC is able to quantify the clustering uncertainty for each single cell, facilitating rigorous statistical inference and biological interpretations, which are typically unavailable from existing clustering methods. DIMM-SC has been implemented in a user-friendly R package with a detailed tutorial available on www.pitt.edu/∼wec47/singlecell.html.

  11. Event-based cluster synchronization of coupled genetic regulatory networks

    Science.gov (United States)

    Yue, Dandan; Guan, Zhi-Hong; Li, Tao; Liao, Rui-Quan; Liu, Feng; Lai, Qiang

    2017-09-01

    In this paper, the cluster synchronization of coupled genetic regulatory networks with a directed topology is studied by using the event-based strategy and pinning control. An event-triggered condition with a threshold consisting of the neighbors' discrete states at their own event time instants and a state-independent exponential decay function is proposed. The intra-cluster states information and extra-cluster states information are involved in the threshold in different ways. By using the Lyapunov function approach and the theories of matrices and inequalities, we establish the cluster synchronization criterion. It is shown that both the avoidance of continuous transmission of information and the exclusion of the Zeno behavior are ensured under the presented triggering condition. Explicit conditions on the parameters in the threshold are obtained for synchronization. The stability criterion of a single GRN is also given under the reduced triggering condition. Numerical examples are provided to validate the theoretical results.

  12. Developing a Clustering-Based Empirical Bayes Analysis Method for Hotspot Identification

    Directory of Open Access Journals (Sweden)

    Yajie Zou

    2017-01-01

    Hotspot identification (HSID) is a critical part of network-wide safety evaluations. Typical methods for ranking sites are often rooted in using the Empirical Bayes (EB) method to estimate safety from both observed crash records and predicted crash frequency based on similar sites. The performance of the EB method is highly related to the selection of a reference group of sites (i.e., roadway segments or intersections) similar to the target site, from which safety performance functions (SPFs) used to predict crash frequency will be developed. As crash data often contain underlying heterogeneity that, in essence, can make them appear to be generated from distinct subpopulations, methods are needed to select similar sites in a principled manner. To overcome this possible heterogeneity problem, EB-based HSID methods that use common clustering methodologies (e.g., mixture models, K-means, and hierarchical clustering) to select “similar” sites for building SPFs are developed. Performance of the clustering-based EB methods is then compared using real crash data. Here, HSID results, when computed on Texas undivided rural highway crash data, suggest that all three clustering-based EB analysis methods are preferred over the conventional statistical methods. Thus, properly classifying the road segments for heterogeneous crash data can further improve HSID accuracy.
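    As a rough illustration of the EB step described in this abstract, the sketch below combines an observed crash count with an SPF prediction. It assumes the textbook negative binomial form of the EB estimator, with weight w = 1 / (1 + alpha * mu); the abstract does not state which form the authors used, and the function name `eb_estimate` and its parameters are hypothetical.

```python
def eb_estimate(observed, predicted, dispersion):
    """Empirical Bayes expected crash count for a single site.

    observed   -- crash count recorded at the site
    predicted  -- SPF prediction (mu) from the reference group, same period
    dispersion -- negative binomial overdispersion alpha, assuming
                  Var = mu + alpha * mu**2 (an assumption, not from the abstract)
    """
    if predicted < 0 or dispersion < 0:
        raise ValueError("predicted and dispersion must be non-negative")
    # Weight placed on the SPF prediction; it shrinks as overdispersion grows.
    weight = 1.0 / (1.0 + dispersion * predicted)
    return weight * predicted + (1.0 - weight) * observed


# Illustrative numbers only: 10 observed crashes, SPF predicts 4, alpha = 0.5.
estimate = eb_estimate(observed=10, predicted=4.0, dispersion=0.5)
```

    Under this form, a reference group that fits the site well (small dispersion) pulls the estimate toward the SPF prediction, which is one way to see why the choice of "similar" sites matters.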

  13. A Cluster-Based Fuzzy Fusion Algorithm for Event Detection in Heterogeneous Wireless Sensor Networks

    Directory of Open Access Journals (Sweden)

    ZiQi Hao

    2015-01-01

    As limited energy is one of the tough challenges in wireless sensor networks (WSNs), energy saving becomes important in extending the lifetime of the network. Data fusion combines information from several sources to provide a unified scenario, which can significantly save sensor energy and enhance the accuracy of sensing data. In this paper, we propose a cluster-based data fusion algorithm for event detection. We use the k-means algorithm to group the nodes into clusters, which can significantly reduce the energy consumption of intracluster communication. The distances between cluster heads and the event, and the energy of clusters, are fuzzified so that fuzzy logic can be used to select the clusters that will participate in data uploading and fusion. The fuzzy logic method is also used by cluster heads for local decisions, and the local decision results are then sent to the base station. Decision-level fusion for the final decision on the event is performed by the base station according to the uploaded local decisions and the fusion support degree of clusters calculated by the fuzzy logic method. The effectiveness of this algorithm is demonstrated by simulation results.
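    The node-grouping step mentioned above can be sketched with a plain k-means loop. This is a minimal illustration only: the node coordinates, the initial centers, and the function name `kmeans` are hypothetical, and the paper's fuzzy cluster selection and fusion stages are not modeled here.

```python
import math


def kmeans(points, centers, iterations=20):
    """Plain k-means on coordinate tuples, starting from the given centers."""
    centers = [tuple(c) for c in centers]
    for _ in range(iterations):
        # Assignment step: attach each node to its nearest center.
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda i: math.dist(p, centers[i]))
            groups[nearest].append(p)
        # Update step: move each center to the mean of its group
        # (an empty group keeps its previous center).
        new_centers = [
            tuple(sum(axis) / len(g) for axis in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    labels = [min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
              for p in points]
    return centers, labels


# Four hypothetical sensor nodes forming two obvious groups.
nodes = [(0, 0), (0, 2), (10, 0), (10, 2)]
centers, labels = kmeans(nodes, centers=[(0, 0), (10, 0)])
```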

  14. Orbit Clustering Based on Transfer Cost

    Science.gov (United States)

    Gustafson, Eric D.; Arrieta-Camacho, Juan J.; Petropoulos, Anastassios E.

    2013-01-01

    We propose using cluster analysis to perform quick screening for combinatorial global optimization problems. The key missing component currently preventing cluster analysis from use in this context is the lack of a useable metric function that defines the cost to transfer between two orbits. We study several proposed metrics and clustering algorithms, including k-means and the expectation maximization algorithm. We also show that proven heuristic methods such as the Q-law can be modified to work with cluster analysis.

  15. Regional SAR Image Segmentation Based on Fuzzy Clustering with Gamma Mixture Model

    Science.gov (United States)

    Li, X. L.; Zhao, Q. H.; Li, Y.

    2017-09-01

    Most stochastic-based fuzzy clustering algorithms are pixel-based and cannot effectively overcome the inherent speckle noise in SAR images. To deal with this problem, a regional SAR image segmentation algorithm based on fuzzy clustering with a Gamma mixture model is proposed in this paper. First, some generating points are initialized randomly on the image, and the image domain is divided into many sub-regions using the Voronoi tessellation technique. Each sub-region is regarded as a homogeneous area in which the pixels share the same cluster label. Then, the probability of each pixel is assumed to follow a Gamma mixture model with parameters corresponding to the cluster to which the pixel belongs. The negative logarithm of the probability represents the dissimilarity measure between the pixel and the cluster. The regional dissimilarity measure of one sub-region is defined as the sum of the measures of the pixels in the region. Furthermore, the Markov Random Field (MRF) model is extended from the pixel level to Voronoi sub-regions, and the regional objective function is then established under the framework of fuzzy clustering. The optimal segmentation results are obtained by solving for the model parameters and generating points. Finally, the effectiveness of the proposed algorithm is demonstrated by qualitative and quantitative analysis of the segmentation results for simulated and real SAR images.
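    The dissimilarity measure described above (negative log of a Gamma mixture probability, summed over the pixels of a sub-region) can be sketched as follows. The shape/scale parameterization, the function names, and the example values are assumptions for illustration; the Voronoi tessellation, MRF term, and fuzzy memberships of the paper are left out.

```python
import math


def gamma_log_pdf(x, shape, scale):
    """Log density of a Gamma(shape, scale) distribution at x > 0."""
    return ((shape - 1.0) * math.log(x) - x / scale
            - math.lgamma(shape) - shape * math.log(scale))


def pixel_dissimilarity(x, components):
    """Negative log of a Gamma mixture density at pixel intensity x.

    components -- list of (weight, shape, scale) tuples for one cluster;
                  weights are assumed to sum to 1.
    """
    density = sum(w * math.exp(gamma_log_pdf(x, k, s))
                  for w, k, s in components)
    return -math.log(density)


def region_dissimilarity(pixels, components):
    """Regional measure: sum of the pixel measures within one sub-region."""
    return sum(pixel_dissimilarity(x, components) for x in pixels)


# Hypothetical cluster with a single component Gamma(shape=1, scale=1),
# i.e. an exponential density, so each pixel measure equals its intensity.
cluster = [(1.0, 1.0, 1.0)]
regional = region_dissimilarity([1.0, 2.0, 3.0], cluster)
```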

  16. OMERACT-based fibromyalgia symptom subgroups: an exploratory cluster analysis.

    Science.gov (United States)

    Vincent, Ann; Hoskin, Tanya L; Whipple, Mary O; Clauw, Daniel J; Barton, Debra L; Benzo, Roberto P; Williams, David A

    2014-10-16

    The aim of this study was to identify subsets of patients with fibromyalgia with similar symptom profiles using the Outcome Measures in Rheumatology (OMERACT) core symptom domains. Female patients with a diagnosis of fibromyalgia and currently meeting fibromyalgia research survey criteria completed the Brief Pain Inventory, the 30-item Profile of Mood States, the Medical Outcomes Sleep Scale, the Multidimensional Fatigue Inventory, the Multiple Ability Self-Report Questionnaire, the Fibromyalgia Impact Questionnaire-Revised (FIQ-R) and the Short Form-36 between 1 June 2011 and 31 October 2011. Hierarchical agglomerative clustering was used to identify subgroups of patients with similar symptom profiles. To validate the results from this sample, hierarchical agglomerative clustering was repeated in an external sample of female patients with fibromyalgia with similar inclusion criteria. A total of 581 females with a mean age of 55.1 (range, 20.1 to 90.2) years were included. A four-cluster solution best fit the data, and each clustering variable differed significantly (P FIQ-R total scores (P = 0.0004)). In our study, we incorporated core OMERACT symptom domains, which allowed for clustering based on a comprehensive symptom profile. Although our exploratory cluster solution needs confirmation in a longitudinal study, this approach could provide a rationale to support the study of individualized clinical evaluation and intervention.

  17. Scalable Integrated Region-Based Image Retrieval Using IRM and Statistical Clustering.

    Science.gov (United States)

    Wang, James Z.; Du, Yanping

    Statistical clustering is critical in designing scalable image retrieval systems. This paper presents a scalable algorithm for indexing and retrieving images based on region segmentation. The method uses statistical clustering on region features and IRM (Integrated Region Matching), a measure developed to evaluate overall similarity between images…

  18. Cluster Validity Classification Approaches Based on Geometric Probability and Application in the Classification of Remotely Sensed Images

    Directory of Open Access Journals (Sweden)

    LI Jian-Wei

    2014-08-01

    On the basis of the cluster validity function based on geometric probability in the literature [1, 2], we propose a cluster analysis method based on geometric probability to process large amounts of data in a rectangular area. The basic idea is top-down stepwise refinement: first categories, then subcategories. At every clustering level, the cluster validity function based on geometric probability is used first to determine the clusters and the gathering direction, and then the cluster centers and cluster borders are determined. Using TM remote sensing image classification examples, the method is compared with the supervised and unsupervised classification in ERDAS and with the cluster analysis method based on geometric probability in a two-dimensional square proposed in the literature [2]. Results show that the proposed method can significantly improve the classification accuracy.

  19. A Survey on the Taxonomy of Cluster-Based Routing Protocols for Homogeneous Wireless Sensor Networks

    Science.gov (United States)

    Naeimi, Soroush; Ghafghazi, Hamidreza; Chow, Chee-Onn; Ishii, Hiroshi

    2012-01-01

    The past few years have witnessed increased interest among researchers in cluster-based protocols for homogeneous networks because of their better scalability and higher energy efficiency than other routing protocols. Given the limited capabilities of sensor nodes in terms of energy resources, processing and communication range, the cluster-based protocols should be compatible with these constraints in either the setup state or steady data transmission state. With a focus on these constraints, we classify routing protocols according to their objectives and methods for addressing the shortcomings of the clustering process at each stage of cluster head selection, cluster formation, data aggregation and data communication. We summarize the techniques and methods used in these categories, while the weaknesses and strengths of each protocol are pointed out in detail. Furthermore, a taxonomy of the protocols in each phase is given to provide a deeper understanding of current clustering approaches. Ultimately, based on the existing research, a summary of the issues and solutions of the attributes and characteristics of clustering approaches and some open research areas in cluster-based routing protocols that can be further pursued are provided. PMID:22969350

  20. A novel grain cluster-based homogenization scheme

    International Nuclear Information System (INIS)

    Tjahjanto, D D; Eisenlohr, P; Roters, F

    2010-01-01

    An efficient homogenization scheme, termed the relaxed grain cluster (RGC), for elasto-plastic deformations of polycrystals is presented. The scheme is based on a generalization of the grain cluster concept. A volume element consisting of eight (= 2 × 2 × 2) hexahedral grains is considered. The kinematics of the RGC scheme is formulated within a finite deformation framework, where the relaxation of the local deformation gradient of each individual grain is connected to the overall deformation gradient by the, so-called, interface relaxation vectors. The set of relaxation vectors is determined by the minimization of the constitutive energy (or work) density of the overall cluster. An additional energy density associated with the mismatch at the grain boundaries due to relaxations is incorporated as a penalty term into the energy minimization formulation. Effectively, this penalty term represents the kinematical condition of deformation compatibility at the grain boundaries. Simulations have been performed for a dual-phase grain cluster loaded in uniaxial tension. The results of the simulations are presented and discussed in terms of the effective stress–strain response and the overall deformation anisotropy as functions of the penalty energy parameters. In addition, the prediction of the RGC scheme is compared with predictions using other averaging schemes, as well as to the result of direct finite element (FE) simulation. The comparison indicates that the present RGC scheme is able to approximate FE simulation results of relatively fine discretization at about three orders of magnitude lower computational cost

  1. Local bladder cancer clusters in southeastern Michigan accounting for risk factors, covariates and residential mobility.

    Directory of Open Access Journals (Sweden)

    Geoffrey M Jacquez

    In case-control studies, disease risk not explained by the significant risk factors is the unexplained risk. Considering unexplained risk for specific populations, places and times can reveal the signature of unidentified risk factors and risk factors not fully accounted for in the case-control study. This potentially can lead to new hypotheses regarding disease causation. Global, local and focused Q-statistics are applied to data from a population-based case-control study of 11 southeast Michigan counties. Analyses were conducted using both year- and age-based measures of time. The analyses were adjusted for arsenic exposure, education, smoking, family history of bladder cancer, occupational exposure to bladder cancer carcinogens, age, gender, and race. Significant global clustering of cases was not found. Such a finding would indicate large-scale clustering of cases relative to controls through time. However, highly significant local clusters were found in Ingham County near Lansing, in Oakland County, and in the City of Jackson, Michigan. The Jackson City cluster was observed in working-ages and is thus consistent with occupational causes. The Ingham County cluster persists over time, suggesting a broad-based geographically defined exposure. Focused clusters were found for 20 industrial sites engaged in manufacturing activities associated with known or suspected bladder cancer carcinogens. Set-based tests that adjusted for multiple testing were not significant, although local clusters persisted through time and temporal trends in probability of local tests were observed. Q analyses provide a powerful tool for unpacking unexplained disease risk from case-control studies. This is particularly useful when the effect of risk factors varies spatially, through time, or through both space and time. For bladder cancer in Michigan, the next step is to investigate causal hypotheses that may explain the excess bladder cancer risk localized to areas of

  2. A Web-Based Interactive Platform for Co-Clustering Spatio-Temporal Data

    Science.gov (United States)

    Wu, X.; Poorthuis, A.; Zurita-Milla, R.; Kraak, M.-J.

    2017-09-01

    Since current studies on clustering analysis mainly focus on exploring spatial or temporal patterns separately, a co-clustering algorithm is utilized in this study to enable the concurrent analysis of spatio-temporal patterns. To allow users to adopt and adapt the algorithm for their own analysis, it is integrated within the server side of an interactive web-based platform. The client side of the platform, running within any modern browser, is a graphical user interface (GUI) with multiple linked visualizations that facilitates the understanding, exploration and interpretation of the raw dataset and co-clustering results. Users can also upload their own datasets and adjust clustering parameters within the platform. To illustrate the use of this platform, an annual temperature dataset from 28 weather stations over 20 years in the Netherlands is used. After the dataset is loaded, it is visualized in a set of linked visualizations: a geographical map, a timeline and a heatmap. This aids the user in understanding the nature of their dataset and the appropriate selection of co-clustering parameters. Once the dataset is processed by the co-clustering algorithm, the results are visualized in the small multiples, a heatmap and a timeline to provide various views for better understanding and also further interpretation. Since the visualization and analysis are integrated in a seamless platform, the user can explore different sets of co-clustering parameters and instantly view the results in order to do iterative, exploratory data analysis. As such, this interactive web-based platform allows users to analyze spatio-temporal data using the co-clustering method and also helps the understanding of the results using multiple linked visualizations.

  3. Accurate recapture identification for genetic mark–recapture studies with error-tolerant likelihood-based match calling and sample clustering

    Science.gov (United States)

    Sethi, Suresh; Linden, Daniel; Wenburg, John; Lewis, Cara; Lemons, Patrick R.; Fuller, Angela K.; Hare, Matthew P.

    2016-01-01

    Error-tolerant likelihood-based match calling presents a promising technique to accurately identify recapture events in genetic mark–recapture studies by combining probabilities of latent genotypes and probabilities of observed genotypes, which may contain genotyping errors. Combined with clustering algorithms to group samples into sets of recaptures based upon pairwise match calls, these tools can be used to reconstruct accurate capture histories for mark–recapture modelling. Here, we assess the performance of a recently introduced error-tolerant likelihood-based match-calling model and sample clustering algorithm for genetic mark–recapture studies. We assessed both biallelic (i.e. single nucleotide polymorphisms; SNP) and multiallelic (i.e. microsatellite; MSAT) markers using a combination of simulation analyses and case study data on Pacific walrus (Odobenus rosmarus divergens) and fishers (Pekania pennanti). A novel two-stage clustering approach is demonstrated for genetic mark–recapture applications. First, repeat captures within a sampling occasion are identified. Subsequently, recaptures across sampling occasions are identified. The likelihood-based matching protocol performed well in simulation trials, demonstrating utility for use in a wide range of genetic mark–recapture studies. Moderately sized SNP (64+) and MSAT (10–15) panels produced accurate match calls for recaptures and accurate non-match calls for samples from closely related individuals in the face of low to moderate genotyping error. Furthermore, matching performance remained stable or increased as the number of genetic markers increased, genotyping error notwithstanding.

  4. Clustering Scientific Publications Based on Citation Relations: A Systematic Comparison of Different Methods.

    Science.gov (United States)

    Šubelj, Lovro; van Eck, Nees Jan; Waltman, Ludo

    2016-01-01

    Clustering methods are applied regularly in the bibliometric literature to identify research areas or scientific fields. These methods are for instance used to group publications into clusters based on their relations in a citation network. In the network science literature, many clustering methods, often referred to as graph partitioning or community detection techniques, have been developed. Focusing on the problem of clustering the publications in a citation network, we present a systematic comparison of the performance of a large number of these clustering methods. Using a number of different citation networks, some of them relatively small and others very large, we extensively study the statistical properties of the results provided by different methods. In addition, we also carry out an expert-based assessment of the results produced by different methods. The expert-based assessment focuses on publications in the field of scientometrics. Our findings seem to indicate that there is a trade-off between different properties that may be considered desirable for a good clustering of publications. Overall, map equation methods appear to perform best in our analysis, suggesting that these methods deserve more attention from the bibliometric community.

  5. Clustering Scientific Publications Based on Citation Relations: A Systematic Comparison of Different Methods

    Science.gov (United States)

    Šubelj, Lovro; van Eck, Nees Jan; Waltman, Ludo

    2016-01-01

    Clustering methods are applied regularly in the bibliometric literature to identify research areas or scientific fields. These methods are for instance used to group publications into clusters based on their relations in a citation network. In the network science literature, many clustering methods, often referred to as graph partitioning or community detection techniques, have been developed. Focusing on the problem of clustering the publications in a citation network, we present a systematic comparison of the performance of a large number of these clustering methods. Using a number of different citation networks, some of them relatively small and others very large, we extensively study the statistical properties of the results provided by different methods. In addition, we also carry out an expert-based assessment of the results produced by different methods. The expert-based assessment focuses on publications in the field of scientometrics. Our findings seem to indicate that there is a trade-off between different properties that may be considered desirable for a good clustering of publications. Overall, map equation methods appear to perform best in our analysis, suggesting that these methods deserve more attention from the bibliometric community. PMID:27124610

  6. An ant colony based resilience approach to cascading failures in cluster supply network

    Science.gov (United States)

    Wang, Yingcong; Xiao, Renbin

    2016-11-01

    A cluster supply chain network is a typical complex network that easily suffers cascading failures under disruption events, which are caused by the under-load of enterprises. Improving network resilience can increase the ability to recover from cascading failures. Social resilience is found in ant colonies and comes from ants' spatial fidelity zones (SFZ). Starting from under-load failures, this paper proposes a resilience method for cascading failures in cluster supply chain networks that leverages the social resilience of ant colonies. First, the mapping between the ant colony SFZ and the cluster supply chain network SFZ is presented. Second, a new cascading model for cluster supply chain networks is constructed based on under-load failures. Then, the SFZ-based resilience method and index for cascading failures are developed according to the ant colony's social resilience. Finally, a numerical simulation and a case study are used to verify the validity of the cascading model and the resilience method. Experimental results show that the cluster supply chain network becomes resilient to cascading failures under the SFZ-based resilience method, and that its resilience can be enhanced by improving the ability of enterprises to recover and adjust.

  7. Semi-supervised weighted kernel clustering based on gravitational search for fault diagnosis.

    Science.gov (United States)

    Li, Chaoshun; Zhou, Jianzhong

    2014-09-01

    Supervised learning methods, such as the support vector machine (SVM), have been widely applied in diagnosing known faults; however, this kind of method fails to work correctly when a new or unknown fault occurs. Traditional unsupervised kernel clustering can be used for unknown fault diagnosis, but it cannot make use of historical classification information to improve diagnosis accuracy. In this paper, a semi-supervised kernel clustering model is designed to diagnose known and unknown faults. First, a novel semi-supervised weighted kernel clustering algorithm based on gravitational search (SWKC-GS) is proposed for clustering a dataset composed of labeled and unlabeled fault samples. The clustering model of SWKC-GS is defined based on the misclassification rate of labeled samples and a fuzzy clustering index on the whole dataset. The gravitational search algorithm (GSA) is used to solve the clustering model, with the cluster centers, feature weights and kernel function parameter selected as optimization variables. Then, new fault samples are identified and diagnosed by calculating the weighted kernel distance between them and the fault cluster centers. If the fault samples are unknown, they are added to the historical dataset, and SWKC-GS is used to partition the mixed dataset and update the clustering results for diagnosing the new fault. In experiments, the proposed method has been applied to fault diagnosis for rotary bearings, and SWKC-GS has been compared not only with traditional clustering methods but also with SVM and neural networks for known fault diagnosis. In addition, the proposed method has also been applied to unknown fault diagnosis. The results show the effectiveness of the proposed method in achieving the expected diagnosis accuracy for both known and unknown faults of rotary bearings. Copyright © 2014 ISA. Published by Elsevier Ltd. All rights reserved.

  8. Large-Scale Multi-Dimensional Document Clustering on GPU Clusters

    Energy Technology Data Exchange (ETDEWEB)

    Cui, Xiaohui [ORNL; Mueller, Frank [North Carolina State University; Zhang, Yongpeng [ORNL; Potok, Thomas E [ORNL

    2010-01-01

    Document clustering plays an important role in data mining systems. Recently, a flocking-based document clustering algorithm has been proposed to solve the problem through simulation resembling the flocking behavior of birds in nature. This method is superior to other clustering algorithms, including k-means, in the sense that the outcome is not sensitive to the initial state. One limitation of this approach is that the algorithmic complexity is inherently quadratic in the number of documents. As a result, execution time becomes a bottleneck with a large number of documents. In this paper, we assess the benefits of exploiting the computational power of Beowulf-like clusters equipped with contemporary Graphics Processing Units (GPUs) as a means to significantly reduce the runtime of flocking-based document clustering. Our framework scales up to over one million documents processed simultaneously in a sixteen-node GPU cluster. Results are also compared to a four-node cluster with higher-end GPUs. On these clusters, we observe 30X-50X speedups, which demonstrates the potential of GPU clusters to efficiently solve massive data mining problems. Such speedups combined with the scalability potential and accelerator-based parallelization are unique in the domain of document-based data mining, to the best of our knowledge.

  9. A time-series approach for clustering farms based on slaughterhouse health aberration data.

    Science.gov (United States)

    Hulsegge, B; de Greef, K H

    2018-05-01

    A large amount of data is collected routinely in meat inspection in pig slaughterhouses. A time series clustering approach is presented and applied that groups farms based on similar statistical characteristics of meat inspection data over time. A three-step characteristic-based clustering approach was used, based on the idea that the data contain more information than the incidence figures alone. A stratified subset containing 511,645 pigs was derived as a study set from 3.5 years of meat inspection data. The monthly averages of incidence of pleuritis and of pneumonia of 44 Dutch farms (delivering 5149 batches to 2 pig slaughterhouses) were subjected to 1) derivation of farm level data characteristics, 2) factor analysis and 3) clustering into groups of farms. The characteristic-based clustering was able to cluster farms for both lung aberrations. Three groups of data characteristics were informative, describing incidence, time pattern and degree of autocorrelation. The consistency of clustering similar farms was confirmed by repetition of the analysis in a larger dataset. The robustness of the clustering was tested on a substantially extended dataset. This confirmed the earlier results: three data distribution aspects make up the majority of the distinction between groups of farms, and in these groups (clusters) the majority of the farms were allocated comparably to the earlier allocation (75% and 62% for pleuritis and pneumonia, respectively). The difference between pleuritis and pneumonia in their seasonal dependency was confirmed, supporting the biological relevance of the clustering. Comparison of the identified clusters of statistically comparable farms can be used to detect farm level risk factors causing the health aberrations beyond comparison on disease incidence and trend alone. Copyright © 2018 Elsevier B.V. All rights reserved.
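    To make the first step (deriving farm-level data characteristics) concrete, the sketch below computes one simple summary for each of the three informative groups named in the abstract: incidence (mean), time pattern (linear trend slope), and degree of autocorrelation (lag-1). These particular statistics and the function name `series_characteristics` are assumptions chosen for illustration; the abstract does not list the characteristics actually used.

```python
def series_characteristics(values):
    """Summarize a monthly incidence series with three simple statistics."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least three monthly values")
    mean = sum(values) / n

    # Time pattern: ordinary least squares slope against the month index.
    t_mean = (n - 1) / 2
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (v - mean) for t, v in enumerate(values))
    slope = sxy / sxx

    # Autocorrelation at lag 1 (defined as 0.0 for a constant series).
    total = sum((v - mean) ** 2 for v in values)
    if total == 0:
        lag1 = 0.0
    else:
        lag1 = sum((values[i] - mean) * (values[i + 1] - mean)
                   for i in range(n - 1)) / total

    return {"mean": mean, "slope": slope, "lag1_autocorr": lag1}


# Hypothetical five-month incidence series for one farm.
summary = series_characteristics([1.0, 2.0, 3.0, 4.0, 5.0])
```

    A vector of such characteristics per farm could then be passed to a factor analysis and a clustering routine, which correspond to steps 2 and 3 of the abstract.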

  10. Photometric analyses of abundances in dwarf spheroidal galaxies and globular clusters

    International Nuclear Information System (INIS)

    Light, R.M.

    1988-01-01

    This study investigated the abundance characteristics of several dwarf spheroidal galaxies. The chemical properties of stars in these galaxies are tracers of the origin and evolution of their stellar populations, and thus can provide constraints on the theories of their formation. To derive this abundance information, photometric observations of stars in a sample of globular clusters, covering a large range in metallicity, were analyzed. Parameters describing the position of the red giant branch were found to correlate very well with cluster metallicity over a large range in abundance. These measurements, made in the Thuan-Gunn photometry system, provide ranking schemes which are, with accurate photometry, more sensitive to changes in metallicity than similar broadband BV parameters. The relations were used to derive an improved estimate of the metallicity of cluster NGC 5053. These metallicity relations were used to analyze the Thuan-Gunn system photometry produced for the Sculptor, Ursa Minor, and Carina galaxies. The excellent agreement between their metallicities and those from other previous studies indicates that globular cluster red giant branch parameters are very useful in ranking dwarf spheroidal populations by metallicity. Together with other galaxian data, strong correlations can be seen between the mean metallicities and dispersions in metallicity and the luminosities of the dwarf spheroidal galaxies. These trends also seem to apply to members of the dwarf elliptical class of galaxies. The ramifications that these correlations and the existence of a metallicity gradient in Sculptor have on the formation of the dwarf spheroidals are discussed

  11. Seniority-based coupled cluster theory

    International Nuclear Information System (INIS)

    Henderson, Thomas M.; Scuseria, Gustavo E.; Bulik, Ireneusz W.; Stein, Tamar

    2014-01-01

    Doubly occupied configuration interaction (DOCI) with optimized orbitals often accurately describes strong correlations while working in a Hilbert space much smaller than that needed for full configuration interaction. However, the scaling of such calculations remains combinatorial with system size. Pair coupled cluster doubles (pCCD) is very successful in reproducing DOCI energetically, but can do so with low polynomial scaling (N^3, disregarding the two-electron integral transformation from atomic to molecular orbitals). We show here several examples illustrating the success of pCCD in reproducing both the DOCI energy and wave function and show how this success frequently comes about. What DOCI and pCCD lack are an effective treatment of dynamic correlations, which we here add by including higher-seniority cluster amplitudes which are excluded from pCCD. This frozen pair coupled cluster approach is comparable in cost to traditional closed-shell coupled cluster methods with results that are competitive for weakly correlated systems and often superior for the description of strongly correlated systems.

  12. Effects of the X:IT smoking intervention: a school-based cluster randomized trial.

    Science.gov (United States)

    Andersen, Anette; Krølner, Rikker; Bast, Lotus Sofie; Thygesen, Lau Caspar; Due, Pernille

    2015-12-01

    Uptake of smoking in adolescence is still of major public health concern. Evaluations of school-based programmes for smoking prevention show mixed results. The aim of this study was to examine the effect of X:IT, a multi-component school-based programme to prevent adolescent smoking. Data from a Danish cluster randomized trial included 4041 year-7 students (mean age: 12.5) from 51 intervention and 43 control schools. Outcome measure 'current smoking' was dichotomized into smoking daily, weekly, monthly or more seldom vs do not smoke. Analyses were adjusted for baseline covariates: sex, family socioeconomic position (SEP), best friend's smoking and parental smoking. We performed multilevel, logistic regression analyses of available cases and intention-to-treat (ITT) analyses, replacing missing outcome values by multiple imputation. At baseline, 4.7% and 6.8% of the students at the intervention and the control schools smoked, respectively. After 1 year of the intervention, the prevalence was 7.9% and 10.7%, respectively. At follow-up, 553 students (13.7%) did not answer the question on smoking. Available case analyses: crude odds ratios (OR) for smoking at intervention schools compared with control schools: 0.65 (0.48-0.88) and adjusted: 0.70 (0.47-1.04). ITT analyses: crude OR for smoking at intervention schools compared with control schools: 0.67 (0.50-0.89) and adjusted: 0.61 (0.45-0.82). Students at intervention schools had a lower risk of smoking after a year of intervention in year 7. This multi-component intervention involving educational, parental and context-related intervention components seems to be efficient in lowering or postponing smoking uptake in Danish adolescents. © The Author 2015; all rights reserved. Published by Oxford University Press on behalf of the International Epidemiological Association.

  13. A multi-criteria evaluation system for marine litter pollution based on statistical analyses of OSPAR beach litter monitoring time series.

    Science.gov (United States)

    Schulz, Marcus; Neumann, Daniel; Fleet, David M; Matthies, Michael

    2013-12-01

    During the last decades, marine pollution with anthropogenic litter has become a worldwide major environmental concern. Standardized monitoring of litter since 2001 on 78 beaches selected within the framework of the Convention for the Protection of the Marine Environment of the North-East Atlantic (OSPAR) has been used to identify temporal trends of marine litter. Based on statistical analyses of this dataset a two-part multi-criteria evaluation system for beach litter pollution of the North-East Atlantic and the North Sea is proposed. Canonical correlation analyses, linear regression analyses, and non-parametric analyses of variance were used to identify different temporal trends. A classification of beaches was derived from cluster analyses and served to define different states of beach quality according to abundances of 17 input variables. The evaluation system is easily applicable and relies on the above-mentioned classification and on significant temporal trends implied by significant rank correlations. Copyright © 2013 Elsevier Ltd. All rights reserved.

  14. Clustering consumers based on trust, confidence and giving behaviour: data-driven model building for charitable involvement in the Australian not-for-profit sector.

    Science.gov (United States)

    de Vries, Natalie Jane; Reis, Rodrigo; Moscato, Pablo

    2015-01-01

    Organisations in the Not-for-Profit and charity sector face increasing competition to win time, money and efforts from a common donor base. Consequently, these organisations need to be more proactive than ever. The increased level of communications between individuals and organisations today, heightens the need for investigating the drivers of charitable giving and understanding the various consumer groups, or donor segments, within a population. It is contended that `trust' is the cornerstone of the not-for-profit sector's survival, making it an inevitable topic for research in this context. It has become imperative for charities and not-for-profit organisations to adopt for-profit's research, marketing and targeting strategies. This study provides the not-for-profit sector with an easily-interpretable segmentation method based on a novel unsupervised clustering technique (MST-kNN) followed by a feature saliency method (the CM1 score). A sample of 1,562 respondents from a survey conducted by the Australian Charities and Not-for-profits Commission is analysed to reveal donor segments. Each cluster's most salient features are identified using the CM1 score. Furthermore, symbolic regression modelling is employed to find cluster-specific models to predict `low' or `high' involvement in clusters. The MST-kNN method found seven clusters. Based on their salient features they were labelled as: the `non-institutionalist charities supporters', the `resource allocation critics', the `information-seeking financial sceptics', the `non-questioning charity supporters', the `non-trusting sceptics', the `charity management believers' and the `institutionalist charity believers'. Each cluster exhibits their own characteristics as well as different drivers of `involvement'. The method in this study provides the not-for-profit sector with a guideline for clustering, segmenting, understanding and potentially targeting their donor base better. If charities and not

  15. Clustering consumers based on trust, confidence and giving behaviour: data-driven model building for charitable involvement in the Australian not-for-profit sector.

    Directory of Open Access Journals (Sweden)

    Natalie Jane de Vries

    Organisations in the Not-for-Profit and charity sector face increasing competition to win time, money and efforts from a common donor base. Consequently, these organisations need to be more proactive than ever. The increased level of communications between individuals and organisations today, heightens the need for investigating the drivers of charitable giving and understanding the various consumer groups, or donor segments, within a population. It is contended that `trust' is the cornerstone of the not-for-profit sector's survival, making it an inevitable topic for research in this context. It has become imperative for charities and not-for-profit organisations to adopt for-profit's research, marketing and targeting strategies. This study provides the not-for-profit sector with an easily-interpretable segmentation method based on a novel unsupervised clustering technique (MST-kNN) followed by a feature saliency method (the CM1 score). A sample of 1,562 respondents from a survey conducted by the Australian Charities and Not-for-profits Commission is analysed to reveal donor segments. Each cluster's most salient features are identified using the CM1 score. Furthermore, symbolic regression modelling is employed to find cluster-specific models to predict `low' or `high' involvement in clusters. The MST-kNN method found seven clusters. Based on their salient features they were labelled as: the `non-institutionalist charities supporters', the `resource allocation critics', the `information-seeking financial sceptics', the `non-questioning charity supporters', the `non-trusting sceptics', the `charity management believers' and the `institutionalist charity believers'. Each cluster exhibits their own characteristics as well as different drivers of `involvement'. The method in this study provides the not-for-profit sector with a guideline for clustering, segmenting, understanding and potentially targeting their donor base better. If charities and not

  16. Multiple co-clustering based on nonparametric mixture models with heterogeneous marginal distributions.

    Science.gov (United States)

    Tokuda, Tomoki; Yoshimoto, Junichiro; Shimizu, Yu; Okada, Go; Takamura, Masahiro; Okamoto, Yasumasa; Yamawaki, Shigeto; Doya, Kenji

    2017-01-01

    We propose a novel method for multiple clustering, which is useful for analysis of high-dimensional data containing heterogeneous types of features. Our method is based on nonparametric Bayesian mixture models in which features are automatically partitioned (into views) for each clustering solution. This feature partition works as feature selection for a particular clustering solution, which screens out irrelevant features. To make our method applicable to high-dimensional data, a co-clustering structure is newly introduced for each view. Further, the outstanding novelty of our method is that we simultaneously model different distribution families, such as Gaussian, Poisson, and multinomial distributions in each cluster block, which widens areas of application to real data. We apply the proposed method to synthetic and real data, and show that our method outperforms other multiple clustering methods both in recovering true cluster structures and in computation time. Finally, we apply our method to a depression dataset with no true cluster structure available, from which useful inferences are drawn about possible clustering structures of the data.

  17. Multiple co-clustering based on nonparametric mixture models with heterogeneous marginal distributions.

    Directory of Open Access Journals (Sweden)

    Tomoki Tokuda

    We propose a novel method for multiple clustering, which is useful for analysis of high-dimensional data containing heterogeneous types of features. Our method is based on nonparametric Bayesian mixture models in which features are automatically partitioned (into views) for each clustering solution. This feature partition works as feature selection for a particular clustering solution, which screens out irrelevant features. To make our method applicable to high-dimensional data, a co-clustering structure is newly introduced for each view. Further, the outstanding novelty of our method is that we simultaneously model different distribution families, such as Gaussian, Poisson, and multinomial distributions in each cluster block, which widens areas of application to real data. We apply the proposed method to synthetic and real data, and show that our method outperforms other multiple clustering methods both in recovering true cluster structures and in computation time. Finally, we apply our method to a depression dataset with no true cluster structure available, from which useful inferences are drawn about possible clustering structures of the data.

  18. Multiple co-clustering based on nonparametric mixture models with heterogeneous marginal distributions

    Science.gov (United States)

    Yoshimoto, Junichiro; Shimizu, Yu; Okada, Go; Takamura, Masahiro; Okamoto, Yasumasa; Yamawaki, Shigeto; Doya, Kenji

    2017-01-01

    We propose a novel method for multiple clustering, which is useful for analysis of high-dimensional data containing heterogeneous types of features. Our method is based on nonparametric Bayesian mixture models in which features are automatically partitioned (into views) for each clustering solution. This feature partition works as feature selection for a particular clustering solution, which screens out irrelevant features. To make our method applicable to high-dimensional data, a co-clustering structure is newly introduced for each view. Further, the outstanding novelty of our method is that we simultaneously model different distribution families, such as Gaussian, Poisson, and multinomial distributions in each cluster block, which widens areas of application to real data. We apply the proposed method to synthetic and real data, and show that our method outperforms other multiple clustering methods both in recovering true cluster structures and in computation time. Finally, we apply our method to a depression dataset with no true cluster structure available, from which useful inferences are drawn about possible clustering structures of the data. PMID:29049392

  19. An effective trust-based recommendation method using a novel graph clustering algorithm

    Science.gov (United States)

    Moradi, Parham; Ahmadian, Sajad; Akhlaghian, Fardin

    2015-10-01

    Recommender systems are programs that aim to provide personalized recommendations to users for specific items (e.g. music, books) in online sharing communities or on e-commerce sites. Collaborative filtering methods are important and widely accepted types of recommender systems that generate recommendations based on the ratings of like-minded users. On the other hand, these systems confront several inherent issues such as data sparsity and cold start problems, caused by fewer ratings against the unknowns that need to be predicted. Incorporating trust information into the collaborative filtering systems is an attractive approach to resolve these problems. In this paper, we present a model-based collaborative filtering method by applying a novel graph clustering algorithm and also considering trust statements. In the proposed method first of all, the problem space is represented as a graph and then a sparsest subgraph finding algorithm is applied on the graph to find the initial cluster centers. Then, the proposed graph clustering algorithm is performed to obtain the appropriate users/items clusters. Finally, the identified clusters are used as a set of neighbors to recommend unseen items to the current active user. Experimental results based on three real-world datasets demonstrate that the proposed method outperforms several state-of-the-art recommender system methods.

  20. The ergot alkaloid gene cluster: Functional analyses and evolutionary aspects

    Czech Academy of Sciences Publication Activity Database

    Lorenz, N.; Haarmann, T.; Pažoutová, Sylvie; Jung, M.; Tudzynski, P.

    2009-01-01

    Vol. 70, 15-16 (2009), pp. 1822-1832 ISSN 0031-9422 Institutional research plan: CEZ:AV0Z50200510 Keywords: Claviceps purpurea * Ergot fungus * Ergot alkaloid gene cluster Subject RIV: EE - Microbiology, Virology Impact factor: 3.104, year: 2009

  1. Cluster chain based energy efficient routing protocol for mobile WSN

    Directory of Open Access Journals (Sweden)

    WU Ziyu

    2016-04-01

    With ubiquitous smart devices acting as mobile sensor nodes in wireless sensor networks (WSNs) to sense and transmit physical information, routing protocols should be designed to accommodate mobility issues, in addition to conventional considerations of energy efficiency. However, due to frequent topology changes, traditional routing schemes cannot perform well. Moreover, the existence of mobile nodes poses new challenges for energy dissipation and packet loss. In this paper, a novel routing scheme called the cluster chain based routing protocol (CCBRP) is proposed, which employs a combination of cluster and chain structures to accomplish data collection and transmission and thereafter selects qualified cluster heads as chain leaders to transmit data to the sink. Furthermore, node mobility is handled through periodic membership updates of mobile nodes. Simulation results demonstrate that CCBRP performs well in terms of network lifetime and packet delivery, and also strikes a better balance between successful packet reception and energy consumption.

  2. Social and Symbolic Capital in Firm Clusters

    DEFF Research Database (Denmark)

    Gretzinger, Susanne; Royer, Susanne

    Based on a relational perspective this paper analyses the case of the “Mechatronics Cluster” in Southern Jutland, Denmark. We found that cluster managers are not aware of the importance of social and symbolic capital. Cluster managers could have access to both but they are not aware...... of this resource and they don't have any knowledge of how to manage social and symbolic capital. Simply integrating social-capital-supporting initiatives into the day-to-day business would help to develop and to foster social and symbolic capital at low cost. And in our example just to integrate successful sub...

  3. Persistent Spatial Clusters of Prescribed Antimicrobials among Danish Pig Farms – A Register-Based Study

    Science.gov (United States)

    Fertner, Mette; Sanchez, Javier; Boklund, Anette; Stryhn, Henrik; Dupont, Nana; Toft, Nils

    2015-01-01

    The emergence of pathogens resistant to antimicrobials has prompted political initiatives targeting a reduction in the use of veterinary antimicrobials in Denmark, especially for pigs. This study elucidates the tendency of pig farms with a significantly higher antimicrobial use to remain in clusters in certain geographical regions of Denmark. Animal Daily Doses/100 pigs/day were calculated for all three age groups of pigs (weaners, finishers and sows) for each quarter during 2012–13 in 6,143 commercial indoor pig producing farms. The data were split into four time periods of six months. Repeated spatial cluster analyses were performed to identify persistent clusters, i.e. areas included in a significant cluster throughout all four time periods. Antimicrobials prescribed for weaners did not result in any persistent clusters. In contrast, antimicrobial use in finishers clustered persistently in two areas (157 farms), while those issued for sows clustered in one area (51 farms). A multivariate analysis including data on antimicrobial use for weaners, finishers and sows as three separate outcomes resulted in three persistent clusters (551 farms). Compared to farms outside the clusters during this period, weaners, finishers and sows on farms within these clusters had 19%, 104% and 4% higher use of antimicrobials, respectively. Production type, farm type and farm size seemed to have some bearing on the clustering effect. Adding these factors as categorical covariates one at a time in the multivariate analysis reduced the persistent clusters by 24.3%, 30.5% and 34.1%, respectively. PMID:26317206

  4. An adaptive clustering algorithm for image matching based on corner feature

    Science.gov (United States)

    Wang, Zhe; Dong, Min; Mu, Xiaomin; Wang, Song

    2018-04-01

    Traditional image matching algorithms often fail to balance real-time performance and accuracy. To address this problem, an adaptive clustering algorithm for image matching based on corner features is proposed in this paper. The method is based on the similarity of the vectors formed by matching point pairs, and adaptive clustering is performed on those pairs. Harris corner detection is carried out first: the feature points of the reference image and the perceived image are extracted, and the feature points of the two images are initially matched using the Normalized Cross Correlation (NCC) function. Then, using the improved algorithm proposed in this paper, the matching results are clustered to reduce ineffective operations and improve matching speed and robustness. Finally, the Random Sample Consensus (RANSAC) algorithm is applied to the matching points that remain after clustering. The experimental results show that the proposed algorithm effectively eliminates most wrong matching points while retaining the correct ones, improves the accuracy of RANSAC matching, and at the same time reduces the computational load of the whole matching process.
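
    The initial NCC matching step mentioned in this abstract can be illustrated with a minimal sketch. It assumes small grayscale patches stored as lists of rows; the function names `ncc` and `best_match` and the 0.8 threshold are hypothetical choices for illustration and are not taken from the paper, which does not describe its implementation here.

```python
import math


def ncc(patch_a, patch_b):
    """Normalized cross-correlation of two equal-sized grayscale patches.

    Each patch is a list of rows of numbers. The result lies in [-1, 1];
    values near 1 mean the patches differ only by brightness/contrast.
    """
    a = [v for row in patch_a for v in row]
    b = [v for row in patch_b for v in row]
    if len(a) != len(b) or not a:
        raise ValueError("patches must be non-empty and the same size")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0.0:
        return 0.0  # a flat patch has no defined correlation; treat as no match
    return sum(x * y for x, y in zip(da, db)) / denom


def best_match(patch, candidates, threshold=0.8):
    """Return (index, score) of the best-scoring candidate, or None.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    if not candidates:
        return None
    scores = [ncc(patch, c) for c in candidates]
    idx = max(range(len(scores)), key=scores.__getitem__)
    return (idx, scores[idx]) if scores[idx] >= threshold else None
```

    In a pipeline like the one described, a score of this kind would be computed between the neighbourhoods of corner points in the two images, and the surviving pairs passed on to the clustering and RANSAC stages.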

  5. Generating clustered scale-free networks using Poisson based localization of edges

    Science.gov (United States)

    Türker, İlker

    2018-05-01

    We introduce a variety of network models using a Poisson-based edge localization strategy, which result in clustered scale-free topologies. We first verify the success of our localization strategy by realizing a variant of the well-known Watts-Strogatz model with an inverse approach, implying a small-world regime of rewiring from a random network through a regular one. We then apply the rewiring strategy to a pure Barabasi-Albert model and successfully achieve a small-world regime, with a limited capacity of scale-free property. To imitate the high clustering property of scale-free networks with higher accuracy, we adapted the Poisson-based wiring strategy to a growing network with the ingredients of both preferential attachment and local connectivity. To achieve the collocation of these properties, we used a routine of flattening the edges array, sorting it, and applying a mixing procedure to assemble both global connections with preferential attachment and local clusters. As a result, we achieved clustered scale-free networks with a computational fashion, diverging from the recent studies by following a simple but efficient approach.

  6. Effective Social Relationship Measurement and Cluster Based Routing in Mobile Opportunistic Networks

    Science.gov (United States)

    Zeng, Feng; Zhao, Nan; Li, Wenjia

    2017-01-01

    In mobile opportunistic networks, the social relationship among nodes has an important impact on data transmission efficiency. Motivated by the strong share ability of “circles of friends” in communication networks such as Facebook, Twitter, Wechat and so on, we take a real-life example to show that social relationships among nodes consist of explicit and implicit parts. The explicit part comes from direct contact among nodes, and the implicit part can be measured through the “circles of friends”. We present the definitions of explicit and implicit social relationships between two nodes, adaptive weights of explicit and implicit parts are given according to the contact feature of nodes, and the distributed mechanism is designed to construct the “circles of friends” of nodes, which is used for the calculation of the implicit part of social relationship between nodes. Based on effective measurement of social relationships, we propose a social-based clustering and routing scheme, in which each node selects the nodes with close social relationships to form a local cluster, and the self-control method is used to keep all cluster members always having close relationships with each other. A cluster-based message forwarding mechanism is designed for opportunistic routing, in which each node only forwards the copy of the message to nodes with the destination node as a member of the local cluster. Simulation results show that the proposed social-based clustering and routing outperforms the other classic routing algorithms. PMID:28498309

  7. Unsupervised Performance Evaluation Strategy for Bridge Superstructure Based on Fuzzy Clustering and Field Data

    Directory of Open Access Journals (Sweden)

    Yubo Jiao

    2013-01-01

    Performance evaluation of a bridge is critical for determining the optimal maintenance strategy. An unsupervised bridge superstructure state assessment method is proposed in this paper based on fuzzy clustering and bridge field measured data. Firstly, the evaluation index system of the bridge is constructed. Secondly, a certain number of bridge health monitoring records are selected as clustering samples to obtain the fuzzy similarity matrix and fuzzy equivalent matrix. Finally, different thresholds are selected to form dynamic clustering maps and determine the best classification based on statistical analysis. The clustering result is regarded as a sample base, and the bridge state can be evaluated by calculating the fuzzy nearness between the unknown bridge state data and the sample base. Nanping Bridge in Jilin Province is selected as the engineering project to verify the effectiveness of the proposed method.
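
    The step from a fuzzy similarity matrix to a fuzzy equivalent matrix, followed by thresholding, is commonly done with a max-min transitive closure and lambda-cuts. The sketch below shows that textbook procedure under the assumption that the similarity matrix is reflexive and symmetric; the abstract does not say which construction the authors used, and all function names here are hypothetical.

```python
def max_min_compose(r, s):
    """Max-min composition of two square fuzzy relations."""
    n = len(r)
    return [
        [max(min(r[i][k], s[k][j]) for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def transitive_closure(similarity):
    """Square the relation repeatedly until it no longer changes.

    For a reflexive, symmetric similarity matrix the fixed point is a
    fuzzy equivalence matrix.
    """
    current = similarity
    while True:
        squared = max_min_compose(current, current)
        if squared == current:
            return current
        current = squared


def lambda_cut_clusters(equivalent, lam):
    """Group samples whose equivalence degree is at least the threshold lam."""
    n = len(equivalent)
    assigned = [False] * n
    clusters = []
    for i in range(n):
        if assigned[i]:
            continue
        members = [j for j in range(n) if equivalent[i][j] >= lam]
        for j in members:
            assigned[j] = True
        clusters.append(members)
    return clusters
```

    Sweeping `lam` from 1 down to 0 merges clusters step by step, which is one way to obtain the kind of dynamic clustering map the abstract refers to.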

  8. Recognition of genetically modified product based on affinity propagation clustering and terahertz spectroscopy

    Science.gov (United States)

    Liu, Jianjun; Kan, Jianquan

    2018-04-01

    In this paper, a new method for identifying genetically modified material from terahertz spectra is proposed, using a support vector machine (SVM) combined with affinity propagation clustering. The algorithm uses affinity propagation clustering to cluster and label unlabeled training samples, and the existing SVM training data are continuously updated during the iterative process. Because the identification model is established without manually labeling the training samples, errors introduced by human labeling are reduced and the identification accuracy of the model is greatly improved.

  9. Biological consequences of potential repair intermediates of clustered base damage site in Escherichia coli

    Energy Technology Data Exchange (ETDEWEB)

    Shikazono, Naoya, E-mail: shikazono.naoya@jaea.go.jp [Japan Atomic Energy Agency, Advanced Research Science Center, 2-4 Shirakata-Shirane, Tokai-mura, Naka-gun, Ibaraki 319-1195 (Japan); O'Neill, Peter [Gray Institute for Radiation Oncology and Biology, University of Oxford, Roosevelt Drive, Oxford OX3 7DQ (United Kingdom)

    2009-10-02

    Clustered DNA damage induced by a single radiation track is a unique feature of ionizing radiation. Using a plasmid-based assay in Escherichia coli, we previously found significantly higher mutation frequencies for bistranded clusters containing 7,8-dihydro-8-oxoguanine (8-oxoG) and 5,6-dihydrothymine (DHT) than for either a single 8-oxoG or a single DHT in wild type and in glycosylase-deficient strains of E. coli. This indicates that the removal of an 8-oxoG from a clustered damage site is most likely retarded compared to the removal of a single 8-oxoG. To gain further insights into the processing of bistranded base lesions, several potential repair intermediates following 8-oxoG removal were assessed. Clusters, such as DHT + apurinic/apyrimidinic (AP) and DHT + GAP have relatively low mutation frequencies, whereas clusters, such as AP + AP or GAP + AP, significantly reduce the number of transformed colonies, most probably through formation of a lethal double strand break (DSB). Bistranded AP sites placed 3' to each other with various interlesion distances also blocked replication. These results suggest that bistranded base lesions, i.e., single base lesions on each strand, but not clusters containing only AP sites and strand breaks, are repaired in a coordinated manner so that the formation of DSBs is avoided. We propose that, when either base lesion is initially excised from a bistranded base damage site, the remaining base lesion will only rarely be converted into an AP site or a single strand break in vivo.

  10. Biological consequences of potential repair intermediates of clustered base damage site in Escherichia coli

    International Nuclear Information System (INIS)

    Shikazono, Naoya; O'Neill, Peter

    2009-01-01

    Clustered DNA damage induced by a single radiation track is a unique feature of ionizing radiation. Using a plasmid-based assay in Escherichia coli, we previously found significantly higher mutation frequencies for bistranded clusters containing 7,8-dihydro-8-oxoguanine (8-oxoG) and 5,6-dihydrothymine (DHT) than for either a single 8-oxoG or a single DHT in wild type and in glycosylase-deficient strains of E. coli. This indicates that the removal of an 8-oxoG from a clustered damage site is most likely retarded compared to the removal of a single 8-oxoG. To gain further insights into the processing of bistranded base lesions, several potential repair intermediates following 8-oxoG removal were assessed. Clusters, such as DHT + apurinic/apyrimidinic (AP) and DHT + GAP have relatively low mutation frequencies, whereas clusters, such as AP + AP or GAP + AP, significantly reduce the number of transformed colonies, most probably through formation of a lethal double strand break (DSB). Bistranded AP sites placed 3' to each other with various interlesion distances also blocked replication. These results suggest that bistranded base lesions, i.e., single base lesions on each strand, but not clusters containing only AP sites and strand breaks, are repaired in a coordinated manner so that the formation of DSBs is avoided. We propose that, when either base lesion is initially excised from a bistranded base damage site, the remaining base lesion will only rarely be converted into an AP site or a single strand break in vivo.

  11. Modulated modularity clustering as an exploratory tool for functional genomic inference.

    Directory of Open Access Journals (Sweden)

    Eric A Stone

    2009-05-01

    In recent years, the advent of high-throughput assays, coupled with their diminishing cost, has facilitated a systems approach to biology. As a consequence, massive amounts of data are currently being generated, requiring efficient methodology aimed at the reduction of scale. Whole-genome transcriptional profiling is a standard component of systems-level analyses, and to reduce scale and improve inference clustering genes is common. Since clustering is often the first step toward generating hypotheses, cluster quality is critical. Conversely, because the validation of cluster-driven hypotheses is indirect, it is critical that quality clusters not be obtained by subjective means. In this paper, we present a new objective-based clustering method and demonstrate that it yields high-quality results. Our method, modulated modularity clustering (MMC), seeks community structure in graphical data. MMC modulates the connection strengths of edges in a weighted graph to maximize an objective function (called modularity) that quantifies community structure. The result of this maximization is a clustering through which tightly-connected groups of vertices emerge. Our application is to systems genetics, and we quantitatively compare MMC both to the hierarchical clustering method most commonly employed and to three popular spectral clustering approaches. We further validate MMC through analyses of human and Drosophila melanogaster expression data, demonstrating that the clusters we obtain are biologically meaningful. We show MMC to be effective and suitable to applications of large scale. In light of these features, we advocate MMC as a standard tool for exploration and hypothesis generation.
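
    To make the modularity objective concrete, here is a minimal sketch of the standard Newman modularity score for a weighted, undirected graph and a given partition. It is only the generic objective: the modulation of edge weights that distinguishes MMC is not reproduced, and the function name and matrix representation are assumptions made for illustration.

```python
def modularity(weights, labels):
    """Newman modularity Q of a partition of a weighted undirected graph.

    weights: symmetric square matrix (list of lists) of edge weights.
    labels:  labels[i] is the community of vertex i.
    Q = (1 / 2m) * sum_ij [w_ij - s_i * s_j / 2m] * [labels_i == labels_j],
    where s_i is the total weight at vertex i and 2m is the sum of all s_i.
    """
    n = len(weights)
    strength = [sum(row) for row in weights]
    two_m = sum(strength)
    if two_m == 0:
        raise ValueError("graph has no edges")
    q = 0.0
    for i in range(n):
        for j in range(n):
            if labels[i] == labels[j]:
                q += weights[i][j] - strength[i] * strength[j] / two_m
    return q / two_m
```

    For two triangles joined by a single edge, placing each triangle in its own community scores 5/14, while putting all six vertices in one community scores 0, so the higher value picks out the tightly connected groups.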

  12. Particle Swarm Optimization and harmony search based clustering and routing in Wireless Sensor Networks

    Directory of Open Access Journals (Sweden)

    Veena Anand

    2017-01-01

    Wireless Sensor Networks (WSNs) have the disadvantage of a limited and non-rechargeable energy resource, which creates a challenge and has led to the development of various clustering and routing algorithms. The paper proposes an approach for improving network lifetime by using Particle Swarm Optimization based clustering and Harmony Search based routing in WSNs. In this paper, globally optimal cluster heads (CHs) are selected and gateway nodes are introduced to decrease the energy consumption of the CHs while sending aggregated data to the base station (BS). Next, a harmony search algorithm based local search strategy finds the best routing path from the gateway nodes to the base station. Finally, the proposed algorithm is presented.

  13. Environment-based selection effects of Planck clusters

    Energy Technology Data Exchange (ETDEWEB)

    Kosyra, R.; Gruen, D.; Seitz, S.; Mana, A.; Rozo, E.; Rykoff, E.; Sanchez, A.; Bender, R.

    2015-07-24

    We investigate whether the large-scale structure environment of galaxy clusters imprints a selection bias on Sunyaev–Zel'dovich (SZ) catalogues. Such a selection effect might be caused by line of sight (LoS) structures that add to the SZ signal or contain point sources that disturb the signal extraction in the SZ survey. We use the Planck PSZ1 union catalogue in the Sloan Digital Sky Survey (SDSS) region as our sample of SZ-selected clusters. We calculate the angular two-point correlation function (2pcf) for physically correlated, foreground and background structure in the RedMaPPer SDSS DR8 catalogue with respect to each cluster. We compare our results with an optically selected comparison cluster sample and with theoretical predictions. In contrast to the hypothesis of no environment-based selection, we find a mean 2pcf for background structures of -0.049 on scales of ≲40 arcmin, significantly non-zero at ~4σ, which means that Planck clusters are more likely to be detected in regions of low background density. We hypothesize this effect arises either from background estimation in the SZ survey or from radio sources in the background. We estimate the defect in SZ signal caused by this effect to be negligibly small, of the order of ~10^-4 of the signal of a typical Planck detection. Analogously, there are no implications on X-ray mass measurements. However, the environmental dependence has important consequences for weak lensing follow up of Planck galaxy clusters: we predict that projection effects account for half of the mass contained within a 15 arcmin radius of Planck galaxy clusters. We did not detect a background underdensity of CMASS LRGs, which also leaves a spatially varying redshift dependence of the Planck SZ selection function as a possible cause for our findings.

  14. Investigating role stress in frontline bank employees: A cluster based approach

    Directory of Open Access Journals (Sweden)

    Arti Devi

    2013-09-01

    An effective role stress management programme would benefit from a segmentation of employees based on their experience of role stressors. This study explores role-stressor-based segments of frontline bank employees, with the aim of providing a framework for designing such a programme. Cluster analysis on a random sample of 501 frontline employees of commercial banks in Jammu and Kashmir (India) revealed three distinct segments: “overloaded employees”, “unclear employees”, and “underutilised employees”, based on their experience of role stressors. The findings suggest a customised approach to role stress management, with the programme designed to address cluster-specific needs.

  15. Structure and Sequence Analyses of Clustered Protocadherins Reveal Antiparallel Interactions that Mediate Homophilic Specificity.

    Science.gov (United States)

    Nicoludis, John M; Lau, Sze-Yi; Schärfe, Charlotta P I; Marks, Debora S; Weihofen, Wilhelm A; Gaudet, Rachelle

    2015-11-03

    Clustered protocadherin (Pcdh) proteins mediate dendritic self-avoidance in neurons via specific homophilic interactions in their extracellular cadherin (EC) domains. We determined crystal structures of EC1-EC3, containing the homophilic specificity-determining region, of two mouse clustered Pcdh isoforms (PcdhγA1 and PcdhγC3) to investigate the nature of the homophilic interaction. Within the crystal lattices, we observe antiparallel interfaces consistent with a role in trans cell-cell contact. Antiparallel dimerization is supported by evolutionary correlations. Two interfaces, located primarily on EC2-EC3, involve distinctive clustered Pcdh structure and sequence motifs, lack predicted glycosylation sites, and contain residues highly conserved in orthologs but not paralogs, pointing toward their biological significance as homophilic interaction interfaces. These two interfaces are similar yet distinct, reflecting a possible difference in interaction architecture between clustered Pcdh subfamilies. These structures initiate a molecular understanding of clustered Pcdh assemblies that are required to produce functional neuronal networks. Copyright © 2015 Elsevier Ltd. All rights reserved.

  16. Distribution-based fuzzy clustering of electrical resistivity tomography images for interface detection

    Science.gov (United States)

    Ward, W. O. C.; Wilkinson, P. B.; Chambers, J. E.; Oxby, L. S.; Bai, L.

    2014-04-01

    A novel method for the effective identification of bedrock subsurface elevation from electrical resistivity tomography images is described. Identifying subsurface boundaries in the topographic data can be difficult due to smoothness constraints used in inversion, so a statistical population-based approach is used that extends previous work in calculating isoresistivity surfaces. The analysis framework involves a procedure for guiding a clustering approach based on the fuzzy c-means algorithm. An approximation of resistivity distributions, found using kernel density estimation, was utilized as a means of guiding the cluster centroids used to classify data. A fuzzy method was chosen over hard clustering due to uncertainty in hard edges in the topography data, and a measure of clustering uncertainty was identified based on the reciprocal of cluster membership. The algorithm was validated using a direct comparison of known observed bedrock depths at two 3-D survey sites, using real-time GPS information of exposed bedrock by quarrying on one site, and borehole logs at the other. Results show similarly accurate detection as a leading isosurface estimation method, and the proposed algorithm requires significantly less user input and prior site knowledge. Furthermore, the method is effectively dimension-independent and will scale to data of increased spatial dimensions without a significant effect on the runtime. A discussion on the results by automated versus supervised analysis is also presented.

  17. Internet2-based 3D PET image reconstruction using a PC cluster

    International Nuclear Information System (INIS)

    Shattuck, D.W.; Rapela, J.; Asma, E.; Leahy, R.M.; Chatzioannou, A.; Qi, J.

    2002-01-01

    We describe an approach to fast iterative reconstruction from fully three-dimensional (3D) PET data using a network of PentiumIII PCs configured as a Beowulf cluster. To facilitate the use of this system, we have developed a browser-based interface using Java. The system compresses PET data on the user's machine, sends these data over a network, and instructs the PC cluster to reconstruct the image. The cluster implements a parallelized version of our preconditioned conjugate gradient method for fully 3D MAP image reconstruction. We report on the speed-up factors using the Beowulf approach and the impacts of communication latencies in the local cluster network and the network connection between the user's machine and our PC cluster. (author)

  18. Clustering Batik Images using Fuzzy C-Means Algorithm Based on Log-Average Luminance

    Directory of Open Access Journals (Sweden)

    Ahmad Sanmorino

    2012-06-01

    Batik is a fabric or garment made with a special staining technique called wax-resist dyeing, and it is a cultural heritage with high artistic value. To improve efficiency and give better semantics to images, some researchers apply clustering algorithms to organize images before they are retrieved. Image clustering is the process of grouping images based on their similarity. In this paper we attempt to provide an alternative method of grouping batik images using the fuzzy c-means (FCM) algorithm based on the log-average luminance of the batik. FCM is a clustering algorithm based on fuzzy models, in which every data point belongs to every cluster with a degree of membership between 0 and 1. Log-average luminance (LAL) is the average value of the lighting in an image. Using LAL, we can compare the lighting of one image with that of another. From the experiments carried out, it can be concluded that the fuzzy c-means algorithm can be used for batik image clustering based on the log-average luminance of each image.
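
    The two ingredients named in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the common definition of log-average luminance, exp(mean(log(delta + L))), and the standard fuzzy c-means update rules applied to one scalar feature per image. The function names, the `delta` offset, the fuzzifier `m = 2` and the toy luminance values are all assumptions made for the example.

```python
import math

def log_average_luminance(pixels, delta=1e-4):
    """exp of the mean of log(delta + L) over all pixel luminances L."""
    return math.exp(sum(math.log(delta + p) for p in pixels) / len(pixels))

def fuzzy_c_means_1d(values, centers, m=2.0, iters=100):
    """Standard fuzzy c-means on scalar features.

    Returns (centers, memberships); memberships[i][j] is the degree to which
    value i belongs to cluster j, and each row sums to 1.
    """
    centers = list(centers)
    memberships = []
    for _ in range(iters):
        memberships = []
        for x in values:
            dists = [abs(x - c) for c in centers]
            if 0.0 in dists:
                # The value sits exactly on a center: share full membership
                # among the coinciding centers.
                row = [1.0 if d == 0.0 else 0.0 for d in dists]
                total = sum(row)
            else:
                row = [d ** (-2.0 / (m - 1.0)) for d in dists]
                total = sum(row)
            memberships.append([r / total for r in row])
        # Each center becomes the membership-weighted mean of the values.
        centers = [
            sum((u[j] ** m) * x for u, x in zip(memberships, values))
            / sum(u[j] ** m for u in memberships)
            for j in range(len(centers))
        ]
    return centers, memberships

# Toy "images": flat lists of luminance values in [0, 1] (hypothetical data).
images = [
    [0.10, 0.10, 0.20],
    [0.15, 0.10, 0.10],
    [0.80, 0.90, 0.90],
    [0.90, 0.85, 0.95],
]
lal = [log_average_luminance(img) for img in images]
centers, memberships = fuzzy_c_means_1d(lal, centers=[min(lal), max(lal)])
```

    With this toy data the two dark images receive their highest membership in the first cluster and the two bright images in the second; real batik images would first need conversion to luminance values.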

  19. Exploitation of Clustering Techniques in Transactional Healthcare Data

    Directory of Open Access Journals (Sweden)

    Naeem Ahmad Mahoto

    2014-03-01

    Healthcare service centres equipped with electronic health systems have improved their resources as well as their treatment processes. The dynamic nature of each individual's healthcare data makes it complex and difficult for physicians to handle manually; therefore, automatic techniques are essential to manage the quality and standardization of treatment procedures. Exploratory data analysis, pattern analysis and grouping of data are handled using clustering techniques, which work as an unsupervised classification. A number of healthcare applications have been developed that use several data mining techniques for classification, clustering and extracting useful information from healthcare data. The challenging issue in this domain is to select an adequate data mining algorithm for optimal results. This paper exploits three different clustering algorithms, DBSCAN (Density-Based Clustering), agglomerative hierarchical and k-means, on real transactional healthcare data of diabetic patients (taken as a case study) to analyse their performance on large and dispersed healthcare data. The best set of clusters among the exploited algorithms is evaluated using clustering quality indexes and is selected to identify the possible subgroups of patients having similar treatment patterns.
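
    The abstract does not say which clustering quality indexes were used. As one hedged illustration of how competing clusterings of the same data can be ranked, the sketch below computes the mean silhouette coefficient, a widely used index; the function name and the toy points are assumptions for this example only.

```python
import math

def silhouette(points, labels):
    """Mean silhouette coefficient of a labelled point set.

    For each point, a = mean distance to its own cluster and b = smallest mean
    distance to any other cluster; the score is (b - a) / max(a, b).
    Points in singleton clusters score 0 by convention.
    """
    scores = []
    for i, p in enumerate(points):
        by_cluster = {}
        for j, q in enumerate(points):
            if i != j:
                by_cluster.setdefault(labels[j], []).append(math.dist(p, q))
        own = by_cluster.pop(labels[i], None)
        if not own or not by_cluster:
            scores.append(0.0)
            continue
        a = sum(own) / len(own)
        b = min(sum(d) / len(d) for d in by_cluster.values())
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)

# Two candidate labelings of the same four points (hypothetical data).
points = [(0, 0), (0, 1), (10, 10), (10, 11)]
good_score = silhouette(points, [0, 0, 1, 1])
bad_score = silhouette(points, [0, 1, 0, 1])
```

    The labeling with the higher score would be preferred, which is the kind of comparison the abstract describes across DBSCAN, hierarchical and k-means outputs.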

  20. A user credit assessment model based on clustering ensemble for broadband network new media service supervision

    Science.gov (United States)

    Liu, Fang; Cao, San-xing; Lu, Rui

    2012-04-01

    This paper proposes a user credit assessment model based on a clustering ensemble, aiming to address the problem of users illegally spreading pirated and pornographic media content on self-service-oriented broadband network new media platforms. The idea is to assess new media user credit by establishing an index system based on user credit behaviors, so that illegal users can be identified from the credit assessment results and the transmission of harmful video and audio on the network can be curbed. The proposed model combines two strengths: swarm intelligence clustering is well suited to user credit behavior analysis, and K-means clustering can eliminate the scattered users left in the result of swarm intelligence clustering, so that all users' credit is classified automatically. Verification experiments are carried out on the standard credit application dataset in the UCI machine learning repository. The statistical results of a comparative experiment against a single swarm intelligence clustering model indicate that the clustering ensemble model has a stronger ability to distinguish creditworthiness, especially in finding the user clusters with the best and worst credit, which helps operators to apply incentive or punitive measures accurately. In addition, compared with the experimental results of a logistic regression based model under the same conditions, the clustering ensemble model is more robust and has better prediction accuracy.

  1. Constraints on Ωm and σ8 from the potential-based cluster temperature function

    Science.gov (United States)

    Angrick, Christian; Pace, Francesco; Bartelmann, Matthias; Roncarelli, Mauro

    2015-12-01

    The abundance of galaxy clusters is in principle a powerful tool to constrain cosmological parameters, especially Ωm and σ8, due to the exponential dependence in the high-mass regime. While the best observables are the X-ray temperature and luminosity, the abundance of galaxy clusters, however, is conventionally predicted as a function of mass. Hence, the intrinsic scatter and the uncertainties in the scaling relations between mass and either temperature or luminosity lower the reliability of galaxy clusters to constrain cosmological parameters. In this article, we further refine the X-ray temperature function for galaxy clusters by Angrick et al., which is based on the statistics of perturbations in the cosmic gravitational potential and proposed to replace the classical mass-based temperature function, by including a refined analytic merger model and compare the theoretical prediction to results from a cosmological hydrodynamical simulation. Although we find already a good agreement if we compare with a cluster temperature function based on the mass-weighted temperature, including a redshift-dependent scaling between mass-based and spectroscopic temperature yields even better agreement between theoretical model and numerical results. As a proof of concept, incorporating this additional scaling in our model, we constrain the cosmological parameters Ωm and σ8 from an X-ray sample of galaxy clusters and tentatively find agreement with the recent cosmic microwave background based results from the Planck mission at 1σ-level.

  2. Collaborative filtering recommendation model based on fuzzy clustering algorithm

    Science.gov (United States)

    Yang, Ye; Zhang, Yunhua

    2018-05-01

    As one of the most widely used algorithms in recommender systems, the collaborative filtering algorithm faces two serious problems: data sparsity and poor recommendation quality in big data environments. In traditional clustering analysis, each object is strictly assigned to one of several classes, and the boundaries of this division are sharp. However, most objects in real life have no strict definition of their forms or class attributes. To address these problems, this paper proposes to improve the traditional collaborative filtering model through the hybrid optimization of an implicit semantic algorithm and a fuzzy clustering algorithm, used together with the collaborative filtering algorithm. The fuzzy clustering algorithm is applied to project attribute information, so that each project belongs to different project categories with different membership degrees. This increases the density of the data, effectively reduces its sparsity, and addresses the low accuracy that results from inaccurate similarity calculation. Finally, the paper carries out an empirical analysis on the MovieLens dataset and compares the model with the traditional user-based collaborative filtering algorithm. The proposed algorithm greatly improves the recommendation accuracy.

  3. Neural network based cluster creation in the ATLAS silicon pixel detector

    CERN Document Server

    Selbach, K E; The ATLAS collaboration

    2012-01-01

    The read-out from individual pixels on planar semi-conductor sensors are grouped into clusters to reconstruct the location where a charged particle passed through the sensor. The resolution given by individual pixel sizes is significantly improved by using the information from the charge sharing between pixels. Such analog cluster creation techniques have been used by the ATLAS experiment for many years to obtain an excellent performance. However, in dense environments, such as those inside high-energy jets, clusters have an increased probability of merging the charge deposited by multiple particles. Recently, a neural network based algorithm which estimates both the cluster position and whether a cluster should be split has been developed for the ATLAS pixel detector. The algorithm significantly reduces ambiguities in the assignment of pixel detector measurement to tracks within jets and improves the position accuracy with respect to standard interpolation techniques by taking into account the 2-dimensional ...

  4. Neural network based cluster creation in the ATLAS silicon Pixel Detector

    CERN Document Server

    Andreazza, A; The ATLAS collaboration

    2013-01-01

    The read-out from individual pixels on planar semi-conductor sensors are grouped into clusters to reconstruct the location where a charged particle passed through the sensor. The resolution given by individual pixel sizes is significantly improved by using the information from the charge sharing between pixels. Such analog cluster creation techniques have been used by the ATLAS experiment for many years to obtain an excellent performance. However, in dense environments, such as those inside high-energy jets, clusters have an increased probability of merging the charge deposited by multiple particles. Recently, a neural network based algorithm which estimates both the cluster position and whether a cluster should be split has been developed for the ATLAS Pixel Detector. The algorithm significantly reduces ambiguities in the assignment of pixel detector measurement to tracks within jets and improves the position accuracy with respect to standard interpolation techniques by taking into account the 2-dimensional ...

  5. Reconstruction of a digital core containing clay minerals based on a clustering algorithm.

    Science.gov (United States)

    He, Yanlong; Pu, Chunsheng; Jing, Cheng; Gu, Xiaoyu; Chen, Qingdong; Liu, Hongzhi; Khan, Nasir; Dong, Qiaoling

    2017-10-01

    It is difficult to obtain a core sample and information for digital core reconstruction of mature sandstone reservoirs around the world, especially for an unconsolidated sandstone reservoir. Meanwhile, reconstruction and division of clay minerals play a vital role in the reconstruction of the digital cores, although the two-dimensional data-based reconstruction methods are specifically applicable as the microstructure reservoir simulation methods for the sandstone reservoir. However, reconstruction of clay minerals is still challenging from a research viewpoint for the better reconstruction of various clay minerals in the digital cores. In the present work, the content of clay minerals was considered on the basis of two-dimensional information about the reservoir. After application of the hybrid method, and compared with the model reconstructed by the process-based method, the digital core containing clay clusters without the labels of the clusters' number, size, and texture were the output. The statistics and geometry of the reconstruction model were similar to the reference model. In addition, the Hoshen-Kopelman algorithm was used to label various connected unclassified clay clusters in the initial model and then the number and size of clay clusters were recorded. At the same time, the K-means clustering algorithm was applied to divide the labeled, large connecting clusters into smaller clusters on the basis of difference in the clusters' characteristics. According to the clay minerals' characteristics, such as types, textures, and distributions, the digital core containing clay minerals was reconstructed by means of the clustering algorithm and the clay clusters' structure judgment. The distributions and textures of the clay minerals of the digital core were reasonable. The clustering algorithm improved the digital core reconstruction and provided an alternative method for the simulation of different clay minerals in the digital cores.
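
    The labelling step mentioned in this abstract can be illustrated with a minimal Hoshen-Kopelman style pass over a 2-D binary grid. This sketch assumes 4-connectivity and a tiny hand-made grid; the abstract does not state the connectivity or dimensionality the authors used, and the function name is hypothetical.

```python
def hoshen_kopelman(grid):
    """Label 4-connected clusters of truthy cells in a 2-D grid.

    Returns (labels, sizes): labels has the same shape as grid with 0 for
    background, and sizes maps each final cluster label to its cell count.
    """
    rows, cols = len(grid), len(grid[0])
    parent = [0]  # parent[k] is the union-find parent of label k; 0 = background

    def find(k):
        while parent[k] != k:
            parent[k] = parent[parent[k]]  # path halving
            k = parent[k]
        return k

    labels = [[0] * cols for _ in range(rows)]
    # First pass: raster scan, looking only at the upper and left neighbours.
    for r in range(rows):
        for c in range(cols):
            if not grid[r][c]:
                continue
            up = labels[r - 1][c] if r > 0 else 0
            left = labels[r][c - 1] if c > 0 else 0
            if up == 0 and left == 0:
                parent.append(len(parent))       # start a new cluster
                labels[r][c] = len(parent) - 1
            elif up and left:
                root_up, root_left = find(up), find(left)
                keep = min(root_up, root_left)
                parent[max(root_up, root_left)] = keep  # merge the two clusters
                labels[r][c] = keep
            else:
                labels[r][c] = find(up or left)
    # Second pass: replace provisional labels by their roots and count sizes.
    sizes = {}
    for r in range(rows):
        for c in range(cols):
            if labels[r][c]:
                root = find(labels[r][c])
                labels[r][c] = root
                sizes[root] = sizes.get(root, 0) + 1
    return labels, sizes

# 1 marks a clay voxel, 0 marks anything else (hypothetical toy grid).
grid = [
    [1, 0, 1],
    [1, 1, 1],
    [0, 0, 0],
    [0, 1, 0],
]
labels, sizes = hoshen_kopelman(grid)
```

    The recorded sizes are the kind of per-cluster statistic the abstract refers to before large clusters are subdivided with K-means.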

  7. Research on Bridge Sensor Validation Based on Correlation in Cluster

    Directory of Open Access Journals (Sweden)

    Huang Xiaowei

    2016-01-01

    To avoid false alarms and missed alarms caused by sensor malfunction or failure, it is critical to diagnose faults and analyze failures of the sensor measuring system in major infrastructure. Based on the real-time monitoring of bridges and a study of the correlation probability distribution between the multiple sensors used in the fault diagnosis system, a clustering algorithm based on k-medoid is proposed, dividing sensors of the same type into k clusters. The value of k is optimized by a specially designed evaluation function. Building on a further study of the correlation of sensors within the same cluster, this paper presents the definition of the sensor's validation and the corresponding calculation algorithm. The algorithm is applied to the analysis of sensor data from an actual health monitoring system. The result reveals that the algorithm can not only accurately measure the degree of failure and locate the malfunction in the time domain but also quantitatively evaluate the performance of sensors and eliminate diagnosis errors caused by the failure of the reference sensor.
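
    A rough sketch of correlation-driven k-medoid grouping of sensors is given below. It is not the paper's algorithm: the dissimilarity 1 - |Pearson correlation|, the farthest-first initialisation, and the toy readings are assumptions, and the paper's evaluation function for choosing k is not described in the abstract, so k is fixed by hand here.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient of two equally long series."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    norm_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    return cov / (norm_a * norm_b)

def k_medoids(dist, k, iters=100):
    """Alternating k-medoids on a precomputed distance matrix.

    Returns (medoids, assign), where assign[i] is the medoid index of item i.
    Assumes the items are distinct enough that no cluster becomes empty.
    """
    n = len(dist)
    medoids = [0]
    while len(medoids) < k:  # farthest-first initialisation (deterministic)
        medoids.append(max(range(n), key=lambda i: min(dist[i][m] for m in medoids)))
    for _ in range(iters):
        assign = [min(medoids, key=lambda m: dist[i][m]) for i in range(n)]
        updated = []
        for m in medoids:
            members = [i for i in range(n) if assign[i] == m]
            # New medoid: the member with the smallest total distance to its cluster.
            updated.append(min(members, key=lambda c: sum(dist[c][j] for j in members)))
        if updated == medoids:
            break
        medoids = updated
    assign = [min(medoids, key=lambda m: dist[i][m]) for i in range(n)]
    return medoids, assign

# Hypothetical readings from four sensors of the same type.
series = [
    [1, 2, 3, 4, 5],
    [2, 4, 6, 8, 10.5],
    [5, 1, 4, 2, 3],
    [10, 2, 8, 4, 6.5],
]
dist = [[1 - abs(pearson(a, b)) for b in series] for a in series]
medoids, assign = k_medoids(dist, k=2)
```

    In this toy case sensors 0 and 1 end up together, as do sensors 2 and 3, because each pair is strongly correlated while the pairs are only weakly correlated with each other.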

  8. Cosmological constraints with clustering-based redshifts

    Science.gov (United States)

    Kovetz, Ely D.; Raccanelli, Alvise; Rahman, Mubdi

    2017-07-01

    We demonstrate that observations lacking reliable redshift information, such as photometric and radio continuum surveys, can produce robust measurements of cosmological parameters when empowered by clustering-based redshift estimation. This method infers the redshift distribution based on the spatial clustering of sources, using cross-correlation with a reference data set with known redshifts. Applying this method to the existing Sloan Digital Sky Survey (SDSS) photometric galaxies, and projecting to future radio continuum surveys, we show that sources can be efficiently divided into several redshift bins, increasing their ability to constrain cosmological parameters. We forecast constraints on the dark-energy equation of state and on local non-Gaussianity parameters. We explore several pertinent issues, including the trade-off between including more sources and minimizing the overlap between bins, the shot-noise limitations on binning and the predicted performance of the method at high redshifts, and most importantly pay special attention to possible degeneracies with the galaxy bias. Remarkably, we find that once this technique is implemented, constraints on dynamical dark energy from the SDSS imaging catalogue can be competitive with, or better than, those from the spectroscopic BOSS survey and even future planned experiments. Further, constraints on primordial non-Gaussianity from future large-sky radio-continuum surveys can outperform those from the Planck cosmic microwave background experiment and rival those from future spectroscopic galaxy surveys. The application of this method thus holds tremendous promise for cosmology.

  9. Possible world based consistency learning model for clustering and classifying uncertain data.

    Science.gov (United States)

    Liu, Han; Zhang, Xianchao; Zhang, Xiaotong

    2018-06-01

    The possible world model has been shown to be effective for handling various types of data uncertainty in uncertain data management. However, few uncertain data clustering and classification algorithms based on possible worlds have been proposed. Moreover, existing possible world based algorithms suffer from the following issues: (1) they deal with each possible world independently and ignore the consistency principle across different possible worlds; (2) they require an extra post-processing procedure to obtain the final result, so their effectiveness relies heavily on the post-processing method and their efficiency is also poor. In this paper, we propose a novel possible world based consistency learning model for uncertain data, which can be extended to both clustering and classifying uncertain data. This model utilizes the consistency principle to learn a consensus affinity matrix for uncertain data, which makes full use of the information across different possible worlds and thereby improves clustering and classification performance. In addition, the model imposes a new rank constraint on the Laplacian matrix of the consensus affinity matrix, thereby ensuring that the number of connected components in the consensus affinity matrix is exactly equal to the number of classes. This also means that the clustering and classification results can be obtained directly without any post-processing procedure. Furthermore, we derive efficient optimization methods to solve the proposed model for the clustering and classification tasks respectively. Experimental results on real benchmark datasets and real world uncertain datasets show that the proposed model outperforms the state-of-the-art uncertain data clustering and classification algorithms in effectiveness and performs competitively in efficiency. Copyright © 2018 Elsevier Ltd. All rights reserved.
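
    The rank constraint relies on a standard fact from spectral graph theory: for a symmetric, non-negative affinity matrix on n points with c connected components, the graph Laplacian L = D - S has rank n - c. The sketch below checks this on a small hand-made matrix using exact rational arithmetic. It only illustrates the property; the optimisation procedure in the paper is not reproduced, and the matrix and function names are assumptions.

```python
from fractions import Fraction

def laplacian(S):
    """Unnormalised graph Laplacian L = D - S of a symmetric affinity matrix."""
    n = len(S)
    return [
        [sum(S[i]) - S[i][i] if i == j else -S[i][j] for j in range(n)]
        for i in range(n)
    ]

def matrix_rank(M):
    """Exact rank by Gaussian elimination over the rationals."""
    A = [[Fraction(x) for x in row] for row in M]
    rank = 0
    for col in range(len(A[0])):
        pivot = next((r for r in range(rank, len(A)) if A[r][col] != 0), None)
        if pivot is None:
            continue
        A[rank], A[pivot] = A[pivot], A[rank]
        for r in range(rank + 1, len(A)):
            factor = A[r][col] / A[rank][col]
            A[r] = [a - factor * b for a, b in zip(A[r], A[rank])]
        rank += 1
    return rank

def count_components(S):
    """Number of connected components of the graph with edges where S[i][j] != 0."""
    seen, count = set(), 0
    for start in range(len(S)):
        if start in seen:
            continue
        count += 1
        stack = [start]
        seen.add(start)
        while stack:
            i = stack.pop()
            for j in range(len(S)):
                if S[i][j] != 0 and j not in seen:
                    seen.add(j)
                    stack.append(j)
    return count

# Hypothetical consensus affinity matrix with two groups: {0, 1, 2} and {3, 4}.
S = [
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 0, 2],
    [0, 0, 0, 2, 0],
]
n_components = count_components(S)
laplacian_rank = matrix_rank(laplacian(S))
```

    Here the component labels themselves give the clustering, which is why a matrix satisfying the constraint needs no separate post-processing step.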

  10. The Evolution of cluster concept in Catalonia : the case of Cork Cluster – AECORK

    OpenAIRE

    Serarols Tarrés, Joyce

    2015-01-01

    The aim of this final degree project is two-fold: first, to introduce the evolutionary concept of the cluster in Catalonia from a strategic perspective and, second, to analyse the case of the Catalan Cork cluster located in the province of Girona, that is, the northeast of Spain

  11. Hierarchical video summarization based on context clustering

    Science.gov (United States)

    Tseng, Belle L.; Smith, John R.

    2003-11-01

    A personalized video summary is dynamically generated in our video personalization and summarization system based on user preference and usage environment. The three-tier personalization system adopts the server-middleware-client architecture in order to maintain, select, adapt, and deliver rich media content to the user. The server stores the content sources along with their corresponding MPEG-7 metadata descriptions. In this paper, the metadata includes visual semantic annotations and automatic speech transcriptions. Our personalization and summarization engine in the middleware selects the optimal set of desired video segments by matching shot annotations and sentence transcripts with user preferences. Besides finding the desired contents, the objective is to present a coherent summary. There are diverse methods for creating summaries, and we focus on the challenges of generating a hierarchical video summary based on context information. In our summarization algorithm, three inputs are used to generate the hierarchical video summary output. These inputs are (1) MPEG-7 metadata descriptions of the contents in the server, (2) user preference and usage environment declarations from the user client, and (3) context information including MPEG-7 controlled term list and classification scheme. In a video sequence, descriptions and relevance scores are assigned to each shot. Based on these shot descriptions, context clustering is performed to collect consecutively similar shots to correspond to hierarchical scene representations. The context clustering is based on the available context information, and may be derived from domain knowledge or rules engines. Finally, the selection of structured video segments to generate the hierarchical summary efficiently balances between scene representation and shot selection.

  12. Integrative cluster analysis in bioinformatics

    CERN Document Server

    Abu-Jamous, Basel; Nandi, Asoke K

    2015-01-01

    Clustering techniques are increasingly being put to use in the analysis of high-throughput biological datasets. Novel computational techniques to analyse high throughput data in the form of sequences, gene and protein expressions, pathways, and images are becoming vital for understanding diseases and future drug discovery. This book details the complete pathway of cluster analysis, from the basics of molecular biology to the generation of biological knowledge. The book also presents the latest clustering methods and clustering validation, thereby offering the reader a comprehensive review o

  13. K-Nearest Neighbor Intervals Based AP Clustering Algorithm for Large Incomplete Data

    Directory of Open Access Journals (Sweden)

    Cheng Lu

    2015-01-01

    The Affinity Propagation (AP) algorithm is an effective algorithm for clustering analysis, but it cannot be applied directly to incomplete data. In view of the prevalence of missing data and the uncertainty of missing attributes, we put forward a modified AP clustering algorithm based on K-nearest neighbor intervals (KNNI) for incomplete data. Based on an Improved Partial Data Strategy, the proposed algorithm estimates the KNNI representation of missing attributes using the attribute distribution information of the available data. The similarity function is modified to handle interval data, so that the improved AP algorithm can be applied to incomplete data. Experiments on several UCI datasets show that the proposed algorithm achieves impressive clustering results.
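
    One plausible reading of a K-nearest-neighbor interval is sketched below: a missing attribute is represented by the [min, max] range of that attribute among the K nearest records, with nearness measured by a partial distance over the attributes both records observe. The details of the paper's Improved Partial Data Strategy are not given in the abstract, so the distance rescaling, the function names and the toy data are assumptions.

```python
import math

def partial_distance(a, b):
    """Euclidean distance over attributes observed in both records.

    The squared sum is rescaled by (total attributes / shared attributes) so
    that records with few shared attributes are not favoured. Missing values
    are represented by None.
    """
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return math.inf
    squared = sum((x - y) ** 2 for x, y in shared)
    return math.sqrt(squared * len(a) / len(shared))

def knn_interval(data, index, attr, k):
    """[min, max] of attribute `attr` among the k nearest records that observe it."""
    candidates = [
        (partial_distance(data[index], row), row[attr])
        for j, row in enumerate(data)
        if j != index and row[attr] is not None
    ]
    nearest_values = [value for _, value in sorted(candidates)[:k]]
    return min(nearest_values), max(nearest_values)

# Record 0 is missing its second attribute (hypothetical data).
data = [
    [1.0, None],
    [1.1, 5.0],
    [0.9, 6.0],
    [9.0, 50.0],
]
interval = knn_interval(data, index=0, attr=1, k=2)
```

    The resulting interval replaces the missing value, and a similarity function defined on intervals can then be fed to the clustering step.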

  14. Cluster management.

    Science.gov (United States)

    Katz, R

    1992-11-01

    Cluster management is a management model that fosters decentralization of management, develops leadership potential of staff, and creates ownership of unit-based goals. Unlike shared governance models, there is no formal structure created by committees and it is less threatening for managers. There are two parts to the cluster management model. One is the formation of cluster groups, consisting of all staff and facilitated by a cluster leader. The cluster groups function for communication and problem-solving. The second part of the cluster management model is the creation of task forces. These task forces are designed to work on short-term goals, usually in response to solving one of the unit's goals. Sometimes the task forces are used for quality improvement or system problems. Clusters are groups of not more than five or six staff members, facilitated by a cluster leader. A cluster is made up of individuals who work the same shift. For example, people with job titles who work days would be in a cluster. There would be registered nurses, licensed practical nurses, nursing assistants, and unit clerks in the cluster. The cluster leader is chosen by the manager based on certain criteria and is trained for this specialized role. The concept of cluster management, criteria for choosing leaders, training for leaders, using cluster groups to solve quality improvement issues, and the learning process necessary for manager support are described.

  15. Energy-Efficient Cluster Based Routing Protocol in Mobile Ad Hoc Networks Using Network Coding

    Directory of Open Access Journals (Sweden)

    Srinivas Kanakala

    2014-01-01

    In mobile ad hoc networks (MANETs), all nodes are energy constrained, so it is important to reduce energy consumption. In this paper, we consider the issue of energy-efficient communication in MANETs using network coding. Network coding is an effective method to improve the performance of wireless networks. The COPE protocol implements the network coding concept to reduce the number of transmissions by mixing packets at intermediate nodes. We incorporate COPE into a cluster based routing protocol to further reduce energy consumption. The proposed energy-efficient coding-aware cluster based routing protocol (ECCRP) scheme applies network coding at cluster heads to reduce the number of transmissions. We also modify the queue management procedure of the COPE protocol to further improve coding opportunities, and we use an energy-efficient scheme when selecting the cluster head, which helps to increase the lifetime of the network. We evaluate the performance of the proposed protocol using simulation. Simulation results show that the proposed ECCRP algorithm reduces energy consumption and increases the lifetime of the network.
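
    The packet-mixing idea behind COPE can be illustrated with the classic two-node exchange through a relay: instead of forwarding two packets separately (four transmissions in total), the relay broadcasts their XOR once (three transmissions), and each endpoint recovers the other's packet using the packet it already sent. The sketch below shows only this XOR step; packet headers, queue management and the cluster-head logic of ECCRP are outside its scope, and the payloads are made up.

```python
def xor_bytes(a, b):
    """XOR two packets byte by byte, zero-padding the shorter one."""
    length = max(len(a), len(b))
    a, b = a.ljust(length, b"\0"), b.ljust(length, b"\0")
    return bytes(x ^ y for x, y in zip(a, b))

# Node A and node B each send one packet to the relay (2 transmissions).
packet_from_a = b"hello B"
packet_from_b = b"hi A"

# The relay broadcasts a single coded packet (1 transmission instead of 2).
coded = xor_bytes(packet_from_a, packet_from_b)

# Each endpoint XORs the broadcast with its own packet to get the other one.
# Stripping the zero padding is a simplification for this toy example; a real
# protocol would carry the payload length in a header.
recovered_at_a = xor_bytes(coded, packet_from_a).rstrip(b"\0")
recovered_at_b = xor_bytes(coded, packet_from_b).rstrip(b"\0")
```

    Saving one transmission per exchange is the source of the energy reduction; applying the same step at cluster heads is the extension the abstract describes.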

  16. MODEL-BASED CLUSTERING FOR CLASSIFICATION OF AQUATIC SYSTEMS AND DIAGNOSIS OF ECOLOGICAL STRESS

    Science.gov (United States)

    Clustering approaches were developed using the classification likelihood, the mixture likelihood, and also using a randomization approach with a model index. Using a clustering approach based on the mixture and classification likelihoods, we have developed an algorithm that...

  17. Stigmergy based behavioural coordination for satellite clusters

    Science.gov (United States)

    Tripp, Howard; Palmer, Phil

    2010-04-01

    Multi-platform swarm/cluster missions are an attractive prospect for improved science return as they provide a natural capability for temporal, spatial and signal separation with further engineering and economic advantages. As spacecraft numbers increase and/or the round-trip communications delay from Earth lengthens, the traditional "remote-control" approach begins to break down. It is therefore essential to push control into space; to make spacecraft more autonomous. An autonomous group of spacecraft requires coordination, but standard terrestrial paradigms such as negotiation require high levels of inter-spacecraft communication, which is nontrivial in space. This article therefore introduces the principles of stigmergy as a novel method for coordinating a cluster. Stigmergy is an agent-based, behavioural approach that allows for infrequent communication with decisions based on local information. Behaviours are selected dynamically using a genetic algorithm onboard. Supervisors/ground stations occasionally adjust parameters and disseminate a "common environment" that is used for local decisions. After outlining the system, an analysis of some crucial parameters such as communications overhead and number of spacecraft is presented to demonstrate scalability. Further scenarios are considered to demonstrate the natural ability to deal with dynamic situations such as the failure of spacecraft, changing mission objectives and responding to sudden bursts of high priority tasks.

  18. Subtypes of autism by cluster analysis based on structural MRI data.

    Science.gov (United States)

    Hrdlicka, Michal; Dudova, Iva; Beranova, Irena; Lisy, Jiri; Belsan, Tomas; Neuwirth, Jiri; Komarek, Vladimir; Faladova, Ludvika; Havlovicova, Marketa; Sedlacek, Zdenek; Blatny, Marek; Urbanek, Tomas

    2005-05-01

    The aim of our study was to subcategorize Autistic Spectrum Disorders (ASD) using a multidisciplinary approach. Sixty-four autistic patients (mean age 9.4 ± 5.6 years) were entered into a cluster analysis. The clustering analysis was based on MRI data. The clusters obtained did not differ significantly in the overall severity of autistic symptomatology as measured by the total score on the Childhood Autism Rating Scale (CARS). The clusters could be characterized as showing significant differences. Cluster 1 showed the largest sizes of the genu and splenium of the corpus callosum (CC), the lowest pregnancy order and the lowest frequency of facial dysmorphic features. Cluster 2 showed the largest sizes of the amygdala and hippocampus (HPC), the least abnormal visual response on the CARS, the lowest frequency of epilepsy and the least frequent abnormal psychomotor development during the first year of life. Cluster 3 showed the largest sizes of the caput of the nucleus caudatus (NC) and the smallest sizes of the HPC, and facial dysmorphic features were always present. Cluster 4 showed the smallest sizes of the genu and splenium of the CC, as well as of the amygdala and caput of the NC, the most abnormal visual response on the CARS, the highest frequency of epilepsy and the highest pregnancy order; abnormal psychomotor development during the first year of life and facial dysmorphic features were always present. This multidisciplinary approach seems to be a promising method for subtyping autism.

  19. Improved regional-scale Brazilian cropping systems' mapping based on a semi-automatic object-based clustering approach

    Science.gov (United States)

    Bellón, Beatriz; Bégué, Agnès; Lo Seen, Danny; Lebourgeois, Valentine; Evangelista, Balbino Antônio; Simões, Margareth; Demonte Ferraz, Rodrigo Peçanha

    2018-06-01

    Cropping systems' maps at fine scale over large areas provide key information for further agricultural production and environmental impact assessments, and thus represent a valuable tool for effective land-use planning. There is, therefore, a growing interest in mapping cropping systems in an operational manner over large areas, and remote sensing approaches based on vegetation index time series analysis have proven to be an efficient tool. However, supervised pixel-based approaches are commonly adopted, requiring resource-consuming field campaigns to gather training data. In this paper, we present a new object-based unsupervised classification approach tested on an annual MODIS 16-day composite Normalized Difference Vegetation Index time series and a Landsat 8 mosaic of the State of Tocantins, Brazil, for the 2014-2015 growing season. Two variants of the approach are compared: a hyperclustering approach, and a landscape-clustering approach involving a previous stratification of the study area into landscape units on which the clustering is then performed. The main cropping systems of Tocantins, characterized by the crop types and cropping patterns, were efficiently mapped with the landscape-clustering approach. Results show that stratification prior to clustering significantly improves the classification accuracies for underrepresented and sparsely distributed cropping systems. This study illustrates the potential of unsupervised classification for large area cropping systems' mapping and contributes to the development of generic tools for supporting large-scale agricultural monitoring across regions.

  20. The Effect of Cluster-Based Instruction on Mathematic Achievement in Inclusive Schools

    Science.gov (United States)

    Gunarhadi, Sunardi; Anwar, Mohammad; Andayani, Tri Rejeki; Shaari, Abdull Sukor

    2016-01-01

    The research aimed to investigate the effect of Cluster-Based Instruction (CBI) on the academic achievement of Mathematics in inclusive schools. The sample was 68 students in two intact classes, including those with learning disabilities, selected using a cluster random technique among 17 inclusive schools in the regency of Surakarta. The two…

  1. ARE SMALL-FIRM CLUSTERS EMERGENT PHENOMENA? EVIDENCE FROM ZIMBABWE’S SMALL FURNITURE-MANUFACTURING FIRMS

    Directory of Open Access Journals (Sweden)

    Godfrey MUPONDA

    2014-07-01

    The purpose of this study was to explore the reasons behind the rapid growth and apparent dynamism of Zimbabwe’s small-firm industrial clusters. The hypothesis behind the study was that these small-firm clusters are emergent phenomena. The study analysed the capital utilisation techniques of small firms located in a large industrial cluster in order to determine the factors that lead to the collective efficiency of such firms. The study found that, in comparison with large, stock exchange-listed firms, the cluster environment enables the small firm to operate from a relatively small capital base and also to use its capital more efficiently in creating revenues and profits. The individual firm does not have to invest its capital in a large assets base as this is done by a specialised group of firms within the cluster. Thus, the cluster has the characteristics of an emergent phenomenon.

  2. Core Business Selection Based on Ant Colony Clustering Algorithm

    Directory of Open Access Journals (Sweden)

    Yu Lan

    2014-01-01

    Core business is the most important business to the enterprise in diversified business. In this paper, we first introduce the definition and characteristics of the core business and then describe the ant colony clustering algorithm. In order to test the effectiveness of the proposed method, Tianjin Port Logistics Development Co., Ltd. is selected as the research object. Based on the current situation of the development of the company, the core business of the company can be acquired by the ant colony clustering algorithm. Thus, the results indicate that the proposed method is an effective way to determine the core business for a company.

  3. GENERALISED MODEL BASED CONFIDENCE INTERVALS IN TWO STAGE CLUSTER SAMPLING

    Directory of Open Access Journals (Sweden)

    Christopher Ouma Onyango

    2010-09-01

    Chambers and Dorfman (2002) constructed bootstrap confidence intervals in model-based estimation for finite population totals assuming that auxiliary values are available throughout a target population and that the auxiliary values are independent. They also assumed that the cluster sizes are known throughout the target population. We now extend to two-stage sampling in which the cluster sizes are known only for the sampled clusters, and we therefore predict the unobserved part of the population total. Jan and Elinor (2008) have done similar work, but unlike them, we use a general model, in which the auxiliary values are not necessarily independent. We demonstrate that the asymptotic properties of our proposed estimator and its coverage rates are better than those constructed under the model-assisted local polynomial regression model.

  4. Multi-documents summarization based on clustering of learning object using hierarchical clustering

    Science.gov (United States)

    Mustamiin, M.; Budi, I.; Santoso, H. B.

    2018-03-01

    The Open Educational Resources (OER) is a portal of teaching, learning and research resources that is available in the public domain and freely accessible. Learning contents or Learning Objects (LO) are granular and can be reused for constructing new learning materials. LO ontology-based searching techniques can be used to search for LO in the Indonesia OER. In this research, LO from search results are used as an ingredient to create new learning materials according to the topic searched by users. Summarization based on grouping LO with Hierarchical Agglomerative Clustering (HAC), using the dependency context of the user’s query, has an average F-Measure of 0.487, while summarization with K-Means has an average F-Measure of only 0.336.

  5. Progressive Amalgamation of Building Clusters for Map Generalization Based on Scaling Subgroups

    Directory of Open Access Journals (Sweden)

    Xianjin He

    2018-03-01

    Map generalization utilizes transformation operations to derive smaller-scale maps from larger-scale maps, and is a key procedure for the modelling and understanding of geographic space. Studies to date have largely applied a fixed tolerance to aggregate clustered buildings into a single object, resulting in the loss of details that meet cartographic constraints and may be of importance for users. This study aims to develop a method that amalgamates clustered buildings gradually without significant modification of geometry, while preserving the map details as much as possible under cartographic constraints. The amalgamation process consists of three key steps. First, individual buildings are grouped into distinct clusters by using the graph-based spatial clustering application with random forest (GSCARF) method. Second, building clusters are decomposed into scaling subgroups according to homogeneity with regard to the mean distance of subgroups. Thus, hierarchies of building clusters can be derived based on scaling subgroups. Finally, an amalgamation operation is progressively performed from the bottom-level subgroups to the top-level subgroups using the maximum distance of each subgroup as the amalgamating tolerance instead of using a fixed tolerance. As a consequence of this step, generalized intermediate scaling results are available, which can form the multi-scale representation of buildings. The experimental results show that the proposed method can generate amalgams with correct details, statistical area balance and orthogonal shape while satisfying cartographic constraints (e.g., minimum distance and minimum area).

  6. A model-based clustering method to detect infectious disease transmission outbreaks from sequence variation.

    Directory of Open Access Journals (Sweden)

    Rosemary M McCloskey

    2017-11-01

    Clustering infections by genetic similarity is a popular technique for identifying potential outbreaks of infectious disease, in part because sequences are now routinely collected for clinical management of many infections. A diverse number of nonparametric clustering methods have been developed for this purpose. These methods are generally intuitive, rapid to compute, and readily scale with large data sets. However, we have found that nonparametric clustering methods can be biased towards identifying clusters of diagnosis (where individuals are sampled sooner post-infection) rather than the clusters of rapid transmission that are meant to be potential foci for public health efforts. We develop a fundamentally new approach to genetic clustering based on fitting a Markov-modulated Poisson process (MMPP), which represents the evolution of transmission rates along the tree relating different infections. We evaluated this model-based method alongside five nonparametric clustering methods using both simulated and actual HIV sequence data sets. For simulated clusters of rapid transmission, the MMPP clustering method obtained higher mean sensitivity (85%) and specificity (91%) than the nonparametric methods. When we applied these clustering methods to published sequences from a study of HIV-1 genetic clusters in Seattle, USA, we found that the MMPP method categorized about half (46%) as many individuals to clusters compared to the other methods. Furthermore, the mean internal branch lengths that approximate transmission rates were significantly shorter in clusters extracted using MMPP, but not by other methods. We determined that the computing time for the MMPP method scaled linearly with the size of trees, requiring about 30 seconds for a tree of 1,000 tips and about 20 minutes for 50,000 tips on a single computer. This new approach to genetic clustering has significant implications for the application of pathogen sequence analysis to public health, where...

  7. Cognitive Clusters in Specific Learning Disorder.

    Science.gov (United States)

    Poletti, Michele; Carretta, Elisa; Bonvicini, Laura; Giorgi-Rossi, Paolo

    The heterogeneity among children with learning disabilities still represents a barrier and a challenge in their conceptualization. Although a dimensional approach has been gaining support, the categorical approach is still the most adopted, as in the recent fifth edition of the Diagnostic and Statistical Manual of Mental Disorders. The introduction of the single overarching diagnostic category of specific learning disorder (SLD) could underemphasize interindividual clinical differences regarding intracategory cognitive functioning and learning proficiency, according to current models of multiple cognitive deficits at the basis of neurodevelopmental disorders. The characterization of specific cognitive profiles associated with an already manifest SLD could help identify possible early cognitive markers of SLD risk and distinct trajectories of atypical cognitive development leading to SLD. In this perspective, we applied a cluster analysis to identify groups of children with a Diagnostic and Statistical Manual-based diagnosis of SLD with similar cognitive profiles and to describe the association between clusters and SLD subtypes. A sample of 205 children with a diagnosis of SLD were enrolled. Cluster analyses (agglomerative hierarchical and nonhierarchical iterative clustering technique) were used successively on 10 core subtests of the Wechsler Intelligence Scale for Children-Fourth Edition. The 4-cluster solution was adopted, and external validation found differences in terms of SLD subtype frequencies and learning proficiency among clusters. Clinical implications of these findings are discussed, tracing directions for further studies.

  8. Improved Density Based Spatial Clustering of Applications of Noise Clustering Algorithm for Knowledge Discovery in Spatial Data

    Directory of Open Access Journals (Sweden)

    Arvind Sharma

    2016-01-01

    There are many techniques available in the field of data mining and its subfield, spatial data mining, to understand relationships between data objects. Collections of data objects related to spatial features are called spatial databases. These relationships can be used for prediction and trend detection between spatial and nonspatial objects for social and scientific reasons. A huge data set may be collected from different sources such as satellite images, X-rays, medical images, traffic cameras, and GIS systems. The primary purpose of this paper is to handle this large amount of data and to establish relationships within it in a definite manner with definite results. The paper describes a complete process for understanding how spatial data differs from other kinds of data sets and how it is refined to obtain useful results and to identify trends for prediction in geographic information systems and spatial data mining. A new improved clustering algorithm is designed, because clustering plays an indispensable role in the spatial data mining process. Clustering methods are useful in various fields of human life such as GIS (Geographic Information System), GPS (Global Positioning System), weather forecasting, air traffic control, water treatment, area selection, cost estimation, planning of rural and urban areas, remote sensing, and VLSI design. This paper presents a study of various clustering methods and algorithms and an improved version of DBSCAN called IDBSCAN (Improved Density Based Spatial Clustering of Application of Noise). The algorithm is designed by adding some important attributes that are responsible for generating better clusters from existing data sets in comparison with other methods.
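
For orientation, the baseline that the abstract above sets out to improve can be sketched as follows. This is a minimal sketch of standard DBSCAN only, assuming Euclidean distance and a brute-force neighbour search; it does not include the extra attributes of IDBSCAN, which the abstract does not specify. The names `dbscan`, `eps` and `min_pts` are illustrative choices.

```python
from math import dist

def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbors(i):
        # Indices of all points within eps of point i (point i included).
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE          # may later be claimed as a border point
            continue
        cluster += 1                   # point i is a core point: start a cluster
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster    # noise reachable from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            nb = neighbors(j)
            if len(nb) >= min_pts:     # only core points expand the cluster
                queue.extend(nb)
    return labels

pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
print(dbscan(pts, eps=1.5, min_pts=3))
```

The two tight groups come out as separate clusters and the isolated point is labelled as noise. The brute-force neighbour search is quadratic in the number of points; a spatial index would be the usual replacement for large data sets.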

  9. Energy Aware Cluster Based Routing Scheme For Wireless Sensor Network

    Directory of Open Access Journals (Sweden)

    Roy Sohini

    2015-09-01

    Wireless Sensor Network (WSN) has emerged as an important supplement to modern wireless communication systems due to its wide range of applications. Recent research is addressing the various challenges of sensor networks more gracefully. However, energy efficiency still remains a matter of concern for researchers. Meeting the countless security needs, timely data delivery and quick action, efficient route selection, multi-path routing, etc. can only be achieved at the cost of energy. Hierarchical routing is more useful in this regard. The proposed algorithm, Energy Aware Cluster Based Routing Scheme (EACBRS), aims at conserving energy with the help of hierarchical routing by calculating the optimum number of cluster heads for the network, selecting an energy-efficient route to the sink and offering congestion control. Simulation results show that EACBRS performs better than existing hierarchical routing algorithms such as the Distributed Energy-Efficient Clustering (DEEC) algorithm for heterogeneous wireless sensor networks and the Energy Efficient Heterogeneous Clustered scheme for Wireless Sensor Network (EEHC).

  10. Hessian regularization based non-negative matrix factorization for gene expression data clustering.

    Science.gov (United States)

    Liu, Xiao; Shi, Jun; Wang, Congzhi

    2015-01-01

    Since a key step in the analysis of gene expression data is to detect groups of genes that have similar expression patterns, clustering techniques are commonly used to analyze gene expression data. Data representation plays an important role in clustering analysis. The non-negative matrix factorization (NMF) is a widely used data representation method with great success in machine learning. Although the traditional manifold regularization method, Laplacian regularization (LR), can improve the performance of NMF, LR still suffers from the problem of its weak extrapolating power. Hessian regularization (HR) is a newly developed manifold regularization method, whose natural properties make it more extrapolating, especially for small sample data. In this work, we propose the HR-based NMF (HR-NMF) algorithm, and then apply it to represent gene expression data for the subsequent clustering task. The clustering experiments are conducted on five commonly used gene datasets, and the results indicate that the proposed HR-NMF outperforms LR-based NMF and original NMF, which suggests the potential application of HR-NMF for gene expression data.
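
As background for the record above, the unregularized NMF baseline can be sketched with the classic multiplicative update rules for the squared Frobenius error. This is a minimal pure-Python sketch under stated assumptions: it omits the Hessian (and Laplacian) regularization terms that the abstract is actually about, and the helper names (`matmul`, `transpose`, `frob_error`, `nmf`) and the small `eps` guard are illustrative choices.

```python
import random

def matmul(A, B):
    # Plain nested-list matrix product.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def frob_error(V, W, H):
    # Squared Frobenius norm of V - W H.
    WH = matmul(W, H)
    return sum((v - x) ** 2 for rv, rx in zip(V, WH) for v, x in zip(rv, rx))

def nmf(V, rank, iters=200, seed=0, eps=1e-9):
    """Factor a non-negative matrix V (n x m) as W (n x rank) times H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[h * a / (b + eps) for h, a, b in zip(rh, ra, rb)]
             for rh, ra, rb in zip(H, num, den)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[w * a / (b + eps) for w, a, b in zip(rw, ra, rb)]
             for rw, ra, rb in zip(W, num, den)]
    return W, H

V = [[1, 1, 0, 0], [2, 2, 0, 0], [0, 0, 3, 3], [0, 0, 1, 1]]
W, H = nmf(V, rank=2)
print(frob_error(V, W, H))
```

For clustering, each column of `V` (a sample) is typically assigned to the row of `H` with the largest coefficient; a regularized variant would add a penalty term to the objective and to the update rules.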

  11. A new collaborative recommendation approach based on users clustering using artificial bee colony algorithm.

    Science.gov (United States)

    Ju, Chunhua; Xu, Chonghuan

    2013-01-01

    Although there are many good collaborative recommendation methods, it is still a challenge to increase the accuracy and diversity of these methods to fulfill users' preferences. In this paper, we propose a novel collaborative filtering recommendation approach based on the K-means clustering algorithm. In the process of clustering, we use the artificial bee colony (ABC) algorithm to overcome the local optimum problem caused by K-means. After that we adopt the modified cosine similarity to compute the similarity between users in the same clusters. Finally, we generate recommendation results for the corresponding target users. Detailed numerical analysis on the benchmark dataset MovieLens and a real-world dataset indicates that our new collaborative filtering approach based on a users clustering algorithm outperforms many other recommendation methods.
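
The similarity step mentioned in the abstract above can be illustrated with a mean-centered cosine similarity between two users. This is a sketch under an explicit assumption: the abstract does not define its "modified cosine similarity", so the common mean-centered variant is used here as a stand-in, and the function name `centered_cosine` and the dict-based rating format are hypothetical.

```python
from math import sqrt

def centered_cosine(ratings_u, ratings_v):
    """Mean-centered cosine similarity over the items both users rated.

    ratings_u and ratings_v map item id -> rating.
    """
    common = ratings_u.keys() & ratings_v.keys()
    if not common:
        return 0.0
    # Center each user's ratings on that user's own mean rating.
    mean_u = sum(ratings_u.values()) / len(ratings_u)
    mean_v = sum(ratings_v.values()) / len(ratings_v)
    du = [ratings_u[i] - mean_u for i in common]
    dv = [ratings_v[i] - mean_v for i in common]
    num = sum(a * b for a, b in zip(du, dv))
    den = sqrt(sum(a * a for a in du)) * sqrt(sum(b * b for b in dv))
    return num / den if den else 0.0

alice = {"m1": 5, "m2": 3, "m3": 1}
bob = {"m1": 4, "m2": 2, "m3": 0}
carol = {"m1": 1, "m2": 3, "m3": 5}
print(centered_cosine(alice, bob), centered_cosine(alice, carol))
```

Centering removes the effect of users who rate uniformly high or low, so Alice and Bob count as fully similar although their raw scores differ by one point, while Alice and Carol come out as opposites.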

  12. A New Collaborative Recommendation Approach Based on Users Clustering Using Artificial Bee Colony Algorithm

    Directory of Open Access Journals (Sweden)

    Chunhua Ju

    2013-01-01

    Although there are many good collaborative recommendation methods, it is still a challenge to increase the accuracy and diversity of these methods to fulfill users’ preferences. In this paper, we propose a novel collaborative filtering recommendation approach based on the K-means clustering algorithm. In the process of clustering, we use the artificial bee colony (ABC) algorithm to overcome the local optimum problem caused by K-means. After that we adopt the modified cosine similarity to compute the similarity between users in the same clusters. Finally, we generate recommendation results for the corresponding target users. Detailed numerical analysis on the benchmark dataset MovieLens and a real-world dataset indicates that our new collaborative filtering approach based on a users clustering algorithm outperforms many other recommendation methods.

  13. Heterologous reconstitution of the intact geodin gene cluster in Aspergillus nidulans through a simple and versatile PCR based approach.

    Directory of Open Access Journals (Sweden)

    Morten Thrane Nielsen

    Fungal natural products are a rich resource for bioactive molecules. To fully exploit this potential it is necessary to link genes to metabolites. Genetic information for numerous putative biosynthetic pathways has become available in recent years through genome sequencing. However, the lack of solid methodology for genetic manipulation of most species severely hampers pathway characterization. Here we present a simple PCR-based approach for heterologous reconstitution of intact gene clusters. Specifically, the putative gene cluster responsible for geodin production from Aspergillus terreus was transferred in a two-step procedure to an expression platform in A. nidulans. The individual cluster fragments were generated by PCR and assembled via efficient USER fusion prior to transformation and integration via re-iterative gene targeting. A total of 13 open reading frames contained in 25 kb of DNA were successfully transferred between the two species, enabling geodin synthesis in A. nidulans. Subsequently, functions of three genes in the cluster were validated by genetic and chemical analyses. Specifically, ATEG_08451 (gedC) encodes a polyketide synthase, ATEG_08453 (gedR) encodes a transcription factor responsible for activation of the geodin gene cluster and ATEG_08460 (gedL) encodes a halogenase that catalyzes conversion of sulochrin to dihydrogeodin. We expect that our approach for transferring intact biosynthetic pathways to a fungus with a well-developed genetic toolbox will be instrumental in characterizing the many exciting pathways for secondary metabolite production that are currently being uncovered by the fungal genome sequencing projects.

  14. Applications of Cluster Analysis to the Creation of Perfectionism Profiles: A Comparison of two Clustering Approaches

    Directory of Open Access Journals (Sweden)

    Jocelyn H Bolin

    2014-04-01

    Although traditional clustering methods (e.g., K-means) have been shown to be useful in the social sciences it is often difficult for such methods to handle situations where clusters in the population overlap or are ambiguous. Fuzzy clustering, a method already recognized in many disciplines, provides a more flexible alternative to these traditional clustering methods. Fuzzy clustering differs from other traditional clustering methods in that it allows for a case to belong to multiple clusters simultaneously. Unfortunately, fuzzy clustering techniques remain relatively unused in the social and behavioral sciences. The purpose of this paper is to introduce fuzzy clustering to these audiences who are currently relatively unfamiliar with the technique. In order to demonstrate the advantages associated with this method, cluster solutions of a common perfectionism measure were created using both fuzzy clustering and K-means clustering, and the results compared. Results of these analyses reveal that different cluster solutions are found by the two methods, and the similarity between the different clustering solutions depends on the amount of cluster overlap allowed for in fuzzy clustering.

  15. Applications of cluster analysis to the creation of perfectionism profiles: a comparison of two clustering approaches.

    Science.gov (United States)

    Bolin, Jocelyn H; Edwards, Julianne M; Finch, W Holmes; Cassady, Jerrell C

    2014-01-01

    Although traditional clustering methods (e.g., K-means) have been shown to be useful in the social sciences it is often difficult for such methods to handle situations where clusters in the population overlap or are ambiguous. Fuzzy clustering, a method already recognized in many disciplines, provides a more flexible alternative to these traditional clustering methods. Fuzzy clustering differs from other traditional clustering methods in that it allows for a case to belong to multiple clusters simultaneously. Unfortunately, fuzzy clustering techniques remain relatively unused in the social and behavioral sciences. The purpose of this paper is to introduce fuzzy clustering to these audiences who are currently relatively unfamiliar with the technique. In order to demonstrate the advantages associated with this method, cluster solutions of a common perfectionism measure were created using both fuzzy clustering and K-means clustering, and the results compared. Results of these analyses reveal that different cluster solutions are found by the two methods, and the similarity between the different clustering solutions depends on the amount of cluster overlap allowed for in fuzzy clustering.
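
The contrast drawn above between hard and fuzzy assignment can be made concrete with the membership formula of standard fuzzy c-means. This is a minimal sketch, assuming Euclidean distance and fixed cluster centers; the abstract does not say which fuzzy algorithm or settings the authors used, and the function name `fuzzy_memberships` and the fuzzifier argument `m` are illustrative.

```python
from math import dist

def fuzzy_memberships(point, centers, m=2.0):
    """Degrees of membership of one point in each cluster (they sum to 1).

    m > 1 is the fuzzifier: values near 1 approach a hard K-means style
    assignment, larger values allow more overlap between clusters.
    """
    d = [dist(point, c) for c in centers]
    if 0.0 in d:
        # The point sits exactly on a center: full membership there.
        hits = [1.0 if x == 0.0 else 0.0 for x in d]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((di / dk) ** power for dk in d) for di in d]

centers = [(0.0, 0.0), (4.0, 0.0)]
print(fuzzy_memberships((1.0, 0.0), centers))         # mostly the first cluster
print(fuzzy_memberships((1.0, 0.0), centers, m=5.0))  # more overlap allowed
```

A hard method would simply assign the point to the nearest center; the fuzzy memberships keep the information that the point also partly resembles the second cluster, and raising `m` moves the memberships closer to equal.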

  16. Insight into acid-base nucleation experiments by comparison of the chemical composition of positive, negative, and neutral clusters.

    Science.gov (United States)

    Bianchi, Federico; Praplan, Arnaud P; Sarnela, Nina; Dommen, Josef; Kürten, Andreas; Ortega, Ismael K; Schobesberger, Siegfried; Junninen, Heikki; Simon, Mario; Tröstl, Jasmin; Jokinen, Tuija; Sipilä, Mikko; Adamov, Alexey; Amorim, Antonio; Almeida, Joao; Breitenlechner, Martin; Duplissy, Jonathan; Ehrhart, Sebastian; Flagan, Richard C; Franchin, Alessandro; Hakala, Jani; Hansel, Armin; Heinritzi, Martin; Kangasluoma, Juha; Keskinen, Helmi; Kim, Jaeseok; Kirkby, Jasper; Laaksonen, Ari; Lawler, Michael J; Lehtipalo, Katrianne; Leiminger, Markus; Makhmutov, Vladimir; Mathot, Serge; Onnela, Antti; Petäjä, Tuukka; Riccobono, Francesco; Rissanen, Matti P; Rondo, Linda; Tomé, António; Virtanen, Annele; Viisanen, Yrjö; Williamson, Christina; Wimmer, Daniela; Winkler, Paul M; Ye, Penglin; Curtius, Joachim; Kulmala, Markku; Worsnop, Douglas R; Donahue, Neil M; Baltensperger, Urs

    2014-12-02

    We investigated the nucleation of sulfuric acid together with two bases (ammonia and dimethylamine), at the CLOUD chamber at CERN. The chemical composition of positive, negative, and neutral clusters was studied using three Atmospheric Pressure interface-Time Of Flight (APi-TOF) mass spectrometers: two were operated in positive and negative mode to detect the chamber ions, while the third was equipped with a nitrate ion chemical ionization source allowing detection of neutral clusters. Taking into account the possible fragmentation that can happen during the charging of the ions or within the first stage of the mass spectrometer, the cluster formation proceeded via essentially one-to-one acid-base addition for all of the clusters, independent of the type of the base. For the positive clusters, the charge is carried by one excess protonated base, while for the negative clusters it is carried by a deprotonated acid; the same is true for the neutral clusters after these have been ionized. During the experiments involving sulfuric acid and dimethylamine, it was possible to study the appearance time for all the clusters (positive, negative, and neutral). It appeared that, after the formation of the clusters containing three molecules of sulfuric acid, the clusters grow at a similar speed, independent of their charge. The growth rate is then probably limited by the arrival rate of sulfuric acid or cluster-cluster collision.

  17. Identification among morphologically similar Argyreia (Convolvulaceae) based on leaf anatomy and phenetic analyses.

    Science.gov (United States)

    Traiperm, Paweena; Chow, Janene; Nopun, Possathorn; Staples, G; Swangpol, Sasivimon C

    2017-12-01

    The genus Argyreia Lour. is one of the species-rich Asian genera in the family Convolvulaceae. Several species complexes were recognized in which taxon delimitation was imprecise, especially when examining herbarium materials without fully developed open flowers. The main goal of this study is to investigate and describe leaf anatomy for some morphologically similar Argyreia using epidermal peeling, leaf and petiole transverse sections, and scanning electron microscopy. Phenetic analyses including cluster analysis and principal component analysis were used to investigate the similarity of these morpho-types. Anatomical differences observed between the morpho-types include epidermal cell walls and the trichome types on the leaf epidermis. Additional differences in the leaf and petiole transverse sections include the epidermal cell shape of the adaxial leaf blade, the leaf margins, and the petiole transverse sectional outline. The phenogram from cluster analysis using the UPGMA method represented four groups with an R value of 0.87. Moreover, the important quantitative and qualitative leaf anatomical traits of the four groups were confirmed by the principal component analysis of the first two components. The results from phenetic analyses confirmed the anatomical differentiation between the morpho-types. Leaf anatomical features regarded as particularly informative for morpho-type differentiation can be used to supplement macro morphological identification.

  18. Cluster fusion algorithm: application to Lennard-Jones clusters

    DEFF Research Database (Denmark)

    Solov'yov, Ilia; Solov'yov, Andrey V.; Greiner, Walter

    2006-01-01

    We present a new general theoretical framework for modelling the cluster structure and apply it to description of the Lennard-Jones clusters. Starting from the initial tetrahedral cluster configuration, adding new atoms to the system and absorbing its energy at each step, we find cluster growing paths up to the cluster size of 150 atoms. We demonstrate that in this way all known global minima structures of the Lennard-Jones clusters can be found. Our method provides an efficient tool for the calculation and analysis of atomic cluster structure. With its use we justify the magic number sequence for the clusters of noble gas atoms and compare it with experimental observations. We report the striking correspondence of the peaks in the dependence of the second derivative of the binding energy per atom on cluster size calculated for the chain of the Lennard-Jones clusters based on the icosahedral symmetry...

  19. Cluster fusion algorithm: application to Lennard-Jones clusters

    DEFF Research Database (Denmark)

    Solov'yov, Ilia; Solov'yov, Andrey V.; Greiner, Walter

    2008-01-01

    We present a new general theoretical framework for modelling the cluster structure and apply it to description of the Lennard-Jones clusters. Starting from the initial tetrahedral cluster configuration, adding new atoms to the system and absorbing its energy at each step, we find cluster growing paths up to the cluster size of 150 atoms. We demonstrate that in this way all known global minima structures of the Lennard-Jones clusters can be found. Our method provides an efficient tool for the calculation and analysis of atomic cluster structure. With its use we justify the magic number sequence for the clusters of noble gas atoms and compare it with experimental observations. We report the striking correspondence of the peaks in the dependence of the second derivative of the binding energy per atom on cluster size calculated for the chain of the Lennard-Jones clusters based on the icosahedral symmetry...

  20. Multiobjective optimization of the inspection intervals of a nuclear safety system: A clustering-based framework for reducing the Pareto Front

    International Nuclear Information System (INIS)

    Zio, E.; Bazzo, R.

    2010-01-01

    In this paper, a framework is developed for identifying a limited number of representative solutions of a multiobjective optimization problem concerning the inspection intervals of the components of a safety system of a nuclear power plant. Pareto Front solutions are first clustered into 'families', which are then synthetically represented by a 'head of the family' solution. Three clustering methods are analyzed. Level Diagrams are then used to represent, analyse and interpret the Pareto Fronts reduced to their head-of-the-family solutions. Two decision situations are considered: without or with decision maker preferences, the latter implying the introduction of a scoring system to rank the solutions with respect to the different objectives: a fuzzy preference assignment is then employed to this purpose. The results of the application of the framework of analysis to the problem of optimizing the inspection intervals of a nuclear power plant safety system show that the clustering-based reduction maintains the Pareto Front shape and relevant characteristics, while making it easier for the decision maker to select the final solution.

  1. Design and implementation of streaming media server cluster based on FFMpeg.

    Science.gov (United States)

    Zhao, Hong; Zhou, Chun-long; Jin, Bao-zhao

    2015-01-01

    Poor performance and network congestion are commonly observed in the streaming media single server system. This paper proposes a scheme to construct a streaming media server cluster system based on FFMpeg. In this scheme, different users are distributed to different servers according to their locations and the balance among servers is maintained by the dynamic load-balancing algorithm based on active feedback. Furthermore, a service redirection algorithm is proposed to improve the transmission efficiency of streaming media data. The experiment results show that the server cluster system has significantly alleviated the network congestion and improved the performance in comparison with the single server system.

  2. Design and Implementation of Streaming Media Server Cluster Based on FFMpeg

    Science.gov (United States)

    Zhao, Hong; Zhou, Chun-long; Jin, Bao-zhao

    2015-01-01

    Poor performance and network congestion are commonly observed in the streaming media single server system. This paper proposes a scheme to construct a streaming media server cluster system based on FFMpeg. In this scheme, different users are distributed to different servers according to their locations and the balance among servers is maintained by the dynamic load-balancing algorithm based on active feedback. Furthermore, a service redirection algorithm is proposed to improve the transmission efficiency of streaming media data. The experiment results show that the server cluster system has significantly alleviated the network congestion and improved the performance in comparison with the single server system. PMID:25734187

  3. DIRECTIONAL OPPORTUNISTIC MECHANISM IN CLUSTER MESSAGE CRITICALITY LEVEL BASED ZIGBEE ROUTING

    OpenAIRE

    B. Rajeshkanna; Dr. M. Anitha

    2018-01-01

    The cluster message criticality level based ZigBee routing (CMCLZOR) has been proposed for routing cluster messages in wireless smart energy home area networks. It employs ZigBee opportunistic shortcut tree routing (ZOSTR) for normal messages and AODV for highly critical messages. ZOSTR lets the receiving nodes compete to forward a packet, prioritized by the number of left-over hops, rather than naming a single next-hop node as unicast protocols do. Since it h...

  4. Grey Wolf Optimizer Based on Powell Local Optimization Method for Clustering Analysis

    Directory of Open Access Journals (Sweden)

    Sen Zhang

    2015-01-01

    One heuristic evolutionary algorithm recently proposed is the grey wolf optimizer (GWO), inspired by the leadership hierarchy and hunting mechanism of grey wolves in nature. This paper presents an extended GWO algorithm based on Powell local optimization method, and we call it PGWO. PGWO algorithm significantly improves the original GWO in solving complex optimization problems. Clustering is a popular data analysis and data mining technique. Hence, the PGWO could be applied in solving clustering problems. In this study, first the PGWO algorithm is tested on seven benchmark functions. Second, the PGWO algorithm is used for data clustering on nine data sets. Compared to other state-of-the-art evolutionary algorithms, the results of benchmark and data clustering demonstrate the superior performance of PGWO algorithm.

  5. A hybrid clustering approach to recognition of protein families in 114 microbial genomes

    Directory of Open Access Journals (Sweden)

    Gogarten J Peter

    2004-04-01

    Background: Grouping proteins into sequence-based clusters is a fundamental step in many bioinformatic analyses (e.g., homology-based prediction of structure or function). Standard clustering methods such as single-linkage clustering capture a history of cluster topologies as a function of threshold, but in practice their usefulness is limited because unrelated sequences join clusters before biologically meaningful families are fully constituted, e.g. as the result of matches to so-called promiscuous domains. Use of the Markov Cluster algorithm avoids this non-specificity, but does not preserve topological or threshold information about protein families. Results: We describe a hybrid approach to sequence-based clustering of proteins that combines the advantages of standard and Markov clustering. We have implemented this hybrid approach over a relational database environment, and describe its application to clustering a large subset of PDB, and to 328577 proteins from 114 fully sequenced microbial genomes. To demonstrate utility with difficult problems, we show that hybrid clustering allows us to constitute the paralogous family of ATP synthase F1 rotary motor subunits into a single, biologically interpretable hierarchical grouping that was not accessible using either single-linkage or Markov clustering alone. We describe validation of this method by hybrid clustering of PDB and mapping SCOP families and domains onto the resulting clusters. Conclusion: Hybrid (Markov followed by single-linkage) clustering combines the advantages of the Markov Cluster algorithm (avoidance of non-specific clusters resulting from matches to promiscuous domains) and single-linkage clustering (preservation of topological information as a function of threshold). Within the individual Markov clusters, single-linkage clustering is a more-precise instrument, discerning sub-clusters of biological relevance. Our hybrid approach thus provides a computationally efficient
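
    The single-linkage half of the hybrid scheme can be illustrated with a short sketch. This is not the authors' implementation: the protein names, similarity scores, and the function name `single_linkage` are hypothetical, and the preceding Markov clustering step is omitted. It only shows how the grouping changes as the similarity threshold is raised, which is the threshold-dependent topology the abstract refers to.

```python
def single_linkage(items, scores, threshold):
    """Group items whose pairwise similarity is >= threshold, transitively."""
    parent = {x: x for x in items}

    def find(x):
        # Walk up to the cluster representative, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), s in scores.items():
        if s >= threshold:
            parent[find(a)] = find(b)  # merge the two clusters

    clusters = {}
    for x in items:
        clusters.setdefault(find(x), set()).add(x)
    return sorted(sorted(c) for c in clusters.values())


# Hypothetical similarity scores between four made-up proteins.
items = ["p1", "p2", "p3", "p4"]
scores = {("p1", "p2"): 0.9, ("p2", "p3"): 0.6, ("p3", "p4"): 0.2}

loose = single_linkage(items, scores, 0.5)   # p1, p2, p3 chain together
strict = single_linkage(items, scores, 0.8)  # only p1 and p2 stay joined
```

    In the hybrid approach described above, a routine of this kind would be run inside each Markov cluster rather than on the whole data set.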

  6. Weighted similarity-based clustering of chemical structures and bioactivity data in early drug discovery.

    Science.gov (United States)

    Perualila-Tan, Nolen Joy; Shkedy, Ziv; Talloen, Willem; Göhlmann, Hinrich W H; Moerbeke, Marijke Van; Kasim, Adetayo

    2016-08-01

    The modern process of discovering candidate molecules in the early drug discovery phase includes a wide range of approaches to extract vital information from the intersection of biology and chemistry. A typical strategy in compound selection involves compound clustering based on chemical similarity to obtain representative chemically diverse compounds (not incorporating potency information). In this paper, we propose an integrative clustering approach that makes use of both biological (compound efficacy) and chemical (structural features) data sources for the purpose of discovering a subset of compounds with aligned structural and biological properties. The datasets are integrated at the similarity level by assigning complementary weights to produce a weighted similarity matrix, serving as a generic input in any clustering algorithm. This new analysis workflow is a semi-supervised method since, after the determination of clusters, a secondary analysis is performed that finds differentially expressed genes associated with the derived integrated cluster(s) to further explain the compound-induced biological effects inside the cell. In this paper, datasets from two drug development oncology projects are used to illustrate the usefulness of the weighted similarity-based clustering approach to integrate multi-source high-dimensional information to aid drug discovery. Compounds that are structurally and biologically similar to the reference compounds are discovered using this proposed integrative approach.
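
    The integration step, combining two similarity matrices with complementary weights, can be sketched as follows. This is a minimal illustration, assuming a convex combination with weights w and 1 - w; the function name `weighted_similarity` and the three-compound matrices are hypothetical and do not come from the paper.

```python
def weighted_similarity(chem, bio, w):
    """Combine two similarity matrices using complementary weights w and 1 - w."""
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1]")
    n = len(chem)
    return [
        [w * chem[i][j] + (1.0 - w) * bio[i][j] for j in range(n)]
        for i in range(n)
    ]


# Hypothetical similarity matrices for three compounds.
chem = [[1.0, 0.8, 0.2], [0.8, 1.0, 0.4], [0.2, 0.4, 1.0]]  # structural
bio = [[1.0, 0.2, 0.6], [0.2, 1.0, 0.0], [0.6, 0.0, 1.0]]   # bioactivity

# Equal weighting; the result can be passed to any similarity-based clusterer.
combined = weighted_similarity(chem, bio, 0.5)
```

    Setting w to 1 recovers purely chemical clustering and w to 0 purely biological clustering, so the weight controls how much each data source shapes the clusters.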

  7. Conveyor Performance based on Motor DC 12 Volt Eg-530ad-2f using K-Means Clustering

    Science.gov (United States)

    Arifin, Zaenal; Artini, Sri DP; Much Ibnu Subroto, Imam

    2017-04-01

    Producing goods in industry requires controlled tools that improve production. Separation is part of the production process and is carried out according to certain criteria to obtain an optimum result. Knowing the performance characteristics of a controlled tool used in the separation process also makes it possible to obtain optimum results. Clustering analysis is a popular method for dividing data into smaller segments: it divides a group of objects into k groups whose members are homogeneous or similar, with similarity defined by certain criteria. The work in this paper uses the K-Means method to cluster loading data on the performance of a conveyor driven by a 12 volt DC motor (eg-530-2f). This technique gives complete clustering data for a prototype conveyor, driven by the DC motor, that separates goods by height. The parameters involved are voltage, current, and travelling time. These parameters give two clusters: an optimal cluster with centre 10.50 volt, 0.3 ampere, 10.58 second, and a non-optimal cluster with centre 10.88 volt, 0.28 ampere, 40.43 second.
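
    The assignment step of K-Means can be illustrated with the two cluster centres reported above. This is only a sketch: the readings passed to `assign` are hypothetical, and using unscaled Euclidean distance is an assumption (with these units the travelling time dominates the distance).

```python
import math

# Cluster centres reported in the abstract: (volts, amperes, seconds).
CENTRES = {
    "optimal": (10.50, 0.30, 10.58),
    "non-optimal": (10.88, 0.28, 40.43),
}


def assign(reading, centres=CENTRES):
    """Return the label of the centre nearest to the reading (Euclidean distance)."""
    return min(centres, key=lambda label: math.dist(reading, centres[label]))


# Hypothetical readings, not taken from the paper.
fast_run = assign((10.6, 0.29, 12.0))
slow_run = assign((10.9, 0.28, 38.0))
```

    A full K-Means run would alternate this assignment with recomputing each centre as the mean of its assigned readings until the labels stop changing.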

  8. Properties of an ionised-cluster beam from a vaporised-cluster ion source

    International Nuclear Information System (INIS)

    Takagi, T.; Yamada, I.; Sasaki, A.

    1978-01-01

    A new type of ion source, the vaporised-metal cluster ion source, has been developed for deposition and epitaxy. A cluster consisting of $10^{2}$ to $10^{3}$ atoms coupled loosely together is formed by adiabatic expansion ejecting the vapour of materials into a high-vacuum region through the nozzle of a heated crucible. The clusters are ionised by electron bombardment and accelerated with neutral clusters toward a substrate. In this paper, mechanisms of cluster formation, experimental results of the cluster size (atoms/cluster) and its distribution, and characteristics of the cluster ion beams are reported. The size is calculated from the kinetic equation $E = \tfrac{1}{2} m N V_{ej}^{2}$, where $E$ is the cluster beam energy, $V_{ej}$ is the ejection velocity, $m$ is the mass of atom and $N$ is the cluster size. The energy and the velocity of the cluster are measured by an electrostatic 127° energy analyser and a rotating disc system, respectively. The cluster size obtained for Ag is about $5 \times 10^{2}$ to $2 \times 10^{3}$ atoms. The retarding potential method is used to confirm the results for Ag. The same dependence on cluster size for metals such as Ag, Cu and Pb has been obtained in previous experiments. In the cluster state the cluster ion beam is easily produced by electron bombardment. About 50% of ionised clusters are obtained under typical operation conditions, because of the large ionisation cross sections of the clusters. To obtain a uniform spatial distribution, the ionising electrode system is also discussed. The new techniques are termed ionised-cluster beam deposition (ICBD) and epitaxy (ICBE). (author)
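
    The kinetic relation quoted above can be rearranged to give the cluster size, N = 2E / (m V_ej^2). The sketch below applies it with hypothetical inputs: the beam energy of 560 eV and ejection velocity of 1000 m/s are illustrative values chosen here, not measurements from the paper, and the function name `cluster_size` is likewise an assumption.

```python
AMU_KG = 1.66053906660e-27  # kilograms per atomic mass unit
EV_J = 1.602176634e-19      # joules per electronvolt


def cluster_size(energy_ev, atomic_mass_u, velocity_m_s):
    """Solve E = (1/2) m N V^2 for N, the number of atoms per cluster."""
    energy_j = energy_ev * EV_J
    mass_kg = atomic_mass_u * AMU_KG
    return 2.0 * energy_j / (mass_kg * velocity_m_s ** 2)


# Hypothetical inputs for silver (atomic mass about 107.87 u).
n_atoms = cluster_size(energy_ev=560.0, atomic_mass_u=107.87, velocity_m_s=1000.0)
```

    With these illustrative inputs the result is on the order of a thousand atoms, which falls inside the range the abstract reports for Ag.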

  9. Time-dependent risks of cancer clustering among couples: a nationwide population-based cohort study in Taiwan.

    Science.gov (United States)

    Wang, Jong-Yi; Liang, Yia-Wen; Yeh, Chun-Chen; Liu, Chiu-Shong; Wang, Chen-Yu

    2018-02-21

    Spousal clustering of cancer warrants attention. Whether the common environment or high-age vulnerability determines cancer clustering is unclear. The risk of clustering in couples versus non-couples is undetermined. The time to cancer clustering after the first cancer diagnosis is yet to be reported. This study investigated cancer clustering over time among couples by using nationwide data. A cohort of 5643 married couples in the 2002-2013 Taiwan National Health Insurance Research Database was identified and randomly matched with 5643 non-couple pairs through dual propensity score matching. Factors associated with clustering (both spouses with tumours) were analysed by using the Cox proportional hazard model. Propensity-matched analysis revealed that the risk of clustering of all tumours among couples (13.70%) was significantly higher than that among non-couples (11.84%) (OR=1.182, 95% CI 1.058 to 1.321, P=0.0031). The median time to clustering of all tumours and of malignant tumours was 2.92 and 2.32 years, respectively. Risk characteristics associated with clustering included high age and comorbidity. Shared environmental factors among spouses might be linked to a high incidence of cancer clustering. Cancer incidence in one spouse may signal cancer vulnerability in the other spouse. Promoting family-oriented cancer care in vulnerable families and preventing shared lifestyle risk factors for cancer are suggested.

  10. How do childhood diagnoses of type 1 diabetes cluster in time?

    Directory of Open Access Journals (Sweden)

    Colin R Muirhead

    BACKGROUND: Previous studies have indicated that type 1 diabetes may have an infectious origin. The presence of temporal clustering (an irregular temporal distribution of cases) would provide additional evidence that occurrence may be linked with an agent that displays epidemicity. We tested for the presence and form of temporal clustering using population-based data from northeast England. MATERIALS AND METHODS: The study analysed data on children aged 0-14 years diagnosed with type 1 diabetes during the period 1990-2007 and resident in a defined geographical region of northeast England (Northumberland, Newcastle upon Tyne, and North Tyneside). Tests for temporal clustering by time of diagnosis were applied using a modified version of the Potthoff-Whittinghill method. RESULTS: The study analysed 468 cases of children diagnosed with type 1 diabetes. There was highly statistically significant evidence of temporal clustering over periods of a few months and over longer time intervals (p<0.001). The clustering within years did not show a consistent seasonal pattern. CONCLUSIONS: The study adds to the growing body of literature that supports the involvement of infectious agents in the aetiology of type 1 diabetes in children. Specifically it suggests that the precipitating agent or agents involved might be an infection that occurs in "mini-epidemics".

  11. Properties of ammonium ion-water clusters: analyses of structure evolution, noncovalent interactions, and temperature and humidity effects.

    Science.gov (United States)

    Pei, Shi-Tu; Jiang, Shuai; Liu, Yi-Rong; Huang, Teng; Xu, Kang-Ming; Wen, Hui; Zhu, Yu-Peng; Huang, Wei

    2015-03-26

    Although ammonium ion-water clusters are abundant in the biosphere, some information regarding these clusters, such as their growth route, the influence of temperature and humidity, and the concentrations of various hydrated clusters, is lacking. In this study, theoretical calculations are performed on ammonium ion-water clusters. These theoretical calculations are focused on determining the following characteristics: (1) the pattern of cluster growth; (2) the percentages of clusters of the same size at different temperatures and humidities; (3) the distributions of different isomers for the same size clusters at different temperatures; (4) the relative strengths of the noncovalent interactions for clusters of different sizes. The results suggest that the dipole moment may be very significant for the ammonium ion-water system, and some new stable isomers were found. The nucleation of ammonium ions and water molecules is favorable at low temperatures; thus, the clusters observed at high altitudes might not be present at low altitudes. High humidity can contribute to the formation of large ammonium ion-water clusters, whereas the formation of small clusters may be favorable under low-humidity conditions. The potential energy surfaces (PES) of these different sized clusters are complicated and differ according to the distribution of isomers at different temperatures. Some similar structures are observed between NH4(+)(H2O)n and M(H2O)n (where M represents an alkali metal ion or water molecule); when n = 8, the clusters begin to form the closed-cage geometry. As the cluster size increases, these interactions become progressively weaker. The successive binding energy at the DF-MP2-F12/VDZ-F12 level is better than that at the PW91PW91/6-311++G(3df, 3pd) level and is consistent with the experimentally determined values.
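
    How isomer distributions shift with temperature can be illustrated with Boltzmann weighting. This is a hedged sketch, not the procedure of the paper: it assumes the relative energies of the isomers do not themselves change with temperature, and the three energies used (0, 2 and 5 kJ/mol) are hypothetical.

```python
import math

R_KJ = 8.314462618e-3  # gas constant in kJ/(mol K)


def populations(rel_energies_kj_mol, temperature_k):
    """Boltzmann populations of isomers from their relative energies."""
    weights = [math.exp(-e / (R_KJ * temperature_k)) for e in rel_energies_kj_mol]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical relative energies of three isomers of the same cluster size.
energies = [0.0, 2.0, 5.0]
cold = populations(energies, 100.0)  # lowest-energy isomer dominates
warm = populations(energies, 300.0)  # higher-energy isomers gain population
```

    The lowest-energy isomer holds a larger share at 100 K than at 300 K, which is the qualitative behaviour behind temperature-dependent isomer distributions.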

  12. A nonparametric Bayesian approach for clustering bisulfate-based DNA methylation profiles.

    Science.gov (United States)

    Zhang, Lin; Meng, Jia; Liu, Hui; Huang, Yufei

    2012-01-01

    DNA methylation occurs in the context of a CpG dinucleotide. It is an important epigenetic modification, which can be inherited through cell division. The two major types of methylation include hypomethylation and hypermethylation. Unique methylation patterns have been shown to exist in diseases including various types of cancer. DNA methylation analysis promises to become a powerful tool in cancer diagnosis, treatment and prognostication. Large-scale methylation arrays are now available for studying methylation genome-wide. The Illumina methylation platform simultaneously measures cytosine methylation at more than 1500 CpG sites associated with over 800 cancer-related genes. Cluster analysis is often used to identify DNA methylation subgroups for prognosis and diagnosis. However, due to the unique non-Gaussian characteristics, traditional clustering methods may not be appropriate for DNA and methylation data, and the determination of optimal cluster number is still problematic. A Dirichlet process beta mixture model (DPBMM) is proposed that models the DNA methylation expressions as an infinite number of beta mixture distribution. The model allows automatic learning of the relevant parameters such as the cluster mixing proportion, the parameters of beta distribution for each cluster, and especially the number of potential clusters. Since the model is high dimensional and analytically intractable, we proposed a Gibbs sampling "no-gaps" solution for computing the posterior distributions, hence the estimates of the parameters. The proposed algorithm was tested on simulated data as well as methylation data from 55 Glioblastoma multiform (GBM) brain tissue samples. To reduce the computational burden due to the high data dimensionality, a dimension reduction method is adopted. The two GBM clusters yielded by DPBMM are based on data of different number of loci (P-value < 0.1), while hierarchical clustering cannot yield statistically significant clusters.

  13. Time clustered sampling can inflate the inferred substitution rate in foot-and-mouth disease virus analyses

    DEFF Research Database (Denmark)

    Pedersen, Casper-Emil Tingskov; Frandsen, Peter; Wekesa, Sabenzia N.

    2015-01-01

    ...abundance of sequence data sampled under widely different schemes, an effort to keep results consistent and comparable is needed. This study emphasizes commonly disregarded problems in the inference of evolutionary rates in viral sequence data when sampling is unevenly distributed on a temporal scale through a study of the foot-and-mouth disease (FMD) virus serotypes SAT 1 and SAT 2. Our study shows that clustered temporal sampling in phylogenetic analyses of FMD viruses will strongly bias the inferences of substitution rates and tMRCA because the inferred rates in such data sets reflect a rate closer to the mutation rate rather than the substitution rate. Estimating evolutionary parameters from viral sequences should be performed with due consideration of the differences in short-term and longer-term evolutionary processes occurring within sets of temporally sampled viruses, and studies should carefully...

  14. Cluster Matters

    DEFF Research Database (Denmark)

    Gulati, Mukesh; Lund-Thomsen, Peter; Suresh, Sangeetha

    2018-01-01

    sell their products successfully in international markets, but there is also an increasingly large consumer base within India. Indeed, Indian industrial clusters have contributed to a substantial part of this growth process, and there are several hundred registered clusters within the country...... of this handbook, which focuses on the role of CSR in MSMEs. Hence we contribute to the literature on CSR in industrial clusters and specifically CSR in Indian industrial clusters by investigating the drivers of CSR in India’s industrial clusters....

  15. 3.5D dynamic PET image reconstruction incorporating kinetics-based clusters

    International Nuclear Information System (INIS)

    Lu Lijun; Chen Wufan; Karakatsanis, Nicolas A; Rahmim, Arman; Tang Jing

    2012-01-01

    Standard 3D dynamic positron emission tomographic (PET) imaging consists of independent image reconstructions of individual frames followed by application of appropriate kinetic model to the time activity curves at the voxel or region-of-interest (ROI). The emerging field of 4D PET reconstruction, by contrast, seeks to move beyond this scheme and incorporate information from multiple frames within the image reconstruction task. Here we propose a novel reconstruction framework aiming to enhance quantitative accuracy of parametric images via introduction of priors based on voxel kinetics, as generated via clustering of preliminary reconstructed dynamic images to define clustered neighborhoods of voxels with similar kinetics. This is then followed by straightforward maximum a posteriori (MAP) 3D PET reconstruction as applied to individual frames; and as such the method is labeled ‘3.5D’ image reconstruction. The use of cluster-based priors has the advantage of further enhancing quantitative performance in dynamic PET imaging, because: (a) there are typically more voxels in clusters than in conventional local neighborhoods, and (b) neighboring voxels with distinct kinetics are less likely to be clustered together. Using realistic simulated 11C-raclopride dynamic PET data, the quantitative performance of the proposed method was investigated. Parametric distribution-volume (DV) and DV ratio (DVR) images were estimated from dynamic image reconstructions using (a) maximum-likelihood expectation maximization (MLEM), and MAP reconstructions using (b) the quadratic prior (QP-MAP), (c) the Green prior (GP-MAP) and (d, e) two proposed cluster-based priors (CP-U-MAP and CP-W-MAP), followed by graphical modeling, and were qualitatively and quantitatively compared for 11 ROIs. Overall, the proposed dynamic PET reconstruction methodology resulted in substantial visual as well as quantitative accuracy improvements (in terms of noise versus bias performance) for parametric DV

  16. Analytical network process based optimum cluster head selection in wireless sensor network.

    Science.gov (United States)

    Farman, Haleem; Javed, Huma; Jan, Bilal; Ahmad, Jamil; Ali, Shaukat; Khalil, Falak Naz; Khan, Murad

    2017-01-01

    Wireless Sensor Networks (WSNs) are becoming ubiquitous in everyday life due to their applications in weather forecasting, surveillance, implantable sensors for health monitoring and a plethora of other applications. A WSN is equipped with hundreds or thousands of small sensor nodes. As the size of a sensor node decreases, critical issues such as limited energy, computation time and limited memory become even more pronounced. In such a case, network lifetime mainly depends on efficient use of available resources. Organizing nearby nodes into clusters makes it convenient to efficiently manage each cluster as well as the overall network. In this paper, we extend our previous work on a grid-based hybrid network deployment approach, in which a merge and split technique was proposed to construct the network topology. Having constructed the topology through our proposed technique, in this paper we use the analytical network process (ANP) model for cluster head selection in WSN. Five distinct parameters are considered for CH selection: distance from nodes (DistNode), residual energy level (REL), distance from centroid (DistCent), number of times the node has been selected as cluster head (TCH) and merged node (MN). The problem of CH selection based on these parameters is tackled as a multi-criteria decision system, for which the ANP method is used for optimum cluster head selection. The main contribution of this work is to check the applicability of the ANP model for cluster head selection in WSN. In addition, sensitivity analysis is carried out to check the stability of alternatives (available candidate nodes) and their ranking for different scenarios. The simulation results show that the proposed method outperforms existing energy efficient clustering protocols in terms of optimum CH selection and minimizing the CH reselection process, which results in extending overall network lifetime. This paper analyzes that ANP method used for CH selection with better understanding of the dependencies of

  17. Evidence-based case selection: An innovative knowledge management method to cluster public technical and vocational education and training colleges in South Africa

    Directory of Open Access Journals (Sweden)

    Margaretha M. Visser

    2017-03-01

    Background: Case studies are core constructs used in information management research. A persistent challenge for business, information management and social science researchers is how to select a representative sample of cases among a population with diverse characteristics when convenient or purposive sampling is not considered rigorous enough. The context of the study is post-school education, and it involves an investigation of quantitative methods of clustering the population of public technical and vocational education and training (TVET) colleges in South Africa into groups with a similar level of maturity in terms of their information systems. Objectives: The aim of the study was to propose an evidence-based quantitative method for the selection of cases for case study research and to demonstrate the use and usefulness thereof by clustering public TVET colleges. Method: The clustering method was based on the use of a representative characteristic of the context, as a proxy. In this context of management information systems (MISs), website maturity was used as a proxy and website maturity model theory was used in the development of an evaluation questionnaire. The questionnaire was used for capturing data on website characteristics, which was used to determine website maturity. The websites of the 50 public TVET colleges were evaluated by nine evaluators. Multiple statistical techniques were applied to establish inter-rater reliability and to produce clusters of colleges. Results: The analyses revealed three clusters of public TVET colleges based on their website maturity levels. The first cluster includes three colleges with no websites or websites at a low maturity level. The second cluster consists of 30 colleges with websites at an average maturity level. The third cluster contains 17 colleges with websites at a high maturity level. Conclusion: The main contribution to the knowledge domain is an innovative quantitative method employing a

  18. STRATEGIES FOR DEVELOPING SUSTAINABLE AND COMPETITIVE CLUSTER FOR SHRIMP INDUSTRY

    Directory of Open Access Journals (Sweden)

    Anas M. Fauzi

    2012-09-01

    Kampung Vannamei has been developed as a shrimp cluster since 2004 by PT CP Prima, Tbk Surabaya by introducing Shrimp Culture Health Management transformation technology to several traditional farmers in the Gresik, Lamongan, Tuban, and Madura areas. The research objectives are to identify and map stakeholders, to analyze the interaction of stakeholders, to formulate strategy from internal and external environmental factors, and to set priorities among strategies for developing a sustainable and competitive shrimp cluster in Kampung Vannamei. Primary data were collected through stakeholders’ discussion forums, questionnaires, and interviews with relevant actors. Observations of the business units were also performed to determine production and business conditions, particularly to capture information about threats and challenges. Secondary data came from policy documents, national and local area statistics, and relevant literature. Analyses were performed using the SRI International cluster pyramid, Porter’s diamond analysis, SWOT and TOWS matrix analysis, and the analytical hierarchy process, and the results are discussed qualitatively and descriptively. Seven strategies could be implemented to develop a sustainable and competitive shrimp cluster. However, it is recommended to implement the strategies in order of priority; the first priority is to improve linkages between businesses in the upstream and downstream industries through a multi-stakeholder platform in the shrimp industry. Keywords: Shrimp, Cluster, Competitiveness, Diamond Porter, SWOT Analysis, AHP

  19. Energy Threshold-based Cluster Head Rotation for Routing Protocol in Wireless Sensor Networks

    Directory of Open Access Journals (Sweden)

    Hadi Raheem Ali

    2018-05-01

    Full Text Available Energy efficiency represents a fundamental issue in WSNs, since the network lifetime period entirely depends on the energy of sensor nodes, which are usually battery-operated. In this article, an unequal clustering-based routing protocol has been suggested, where parameters of energy, distance, and density are involved in the cluster head election. Besides, the sizes of clusters are unequal according to distance, energy, and density. Furthermore, the cluster heads are not changed every round unless the residual energy reaches a specific threshold of energy. The outcomes of the conducted simulation confirmed that the performance of the suggested protocol achieves improvement in energy efficiency.
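
    The threshold rule described in this abstract can be illustrated with a minimal sketch. The names `Node`, `election_score`, and `next_cluster_head`, the weights, and the scoring formula are all hypothetical and chosen only for illustration; the abstract does not give the authors' actual election function.

```python
from dataclasses import dataclass


@dataclass
class Node:
    node_id: int
    residual_energy: float  # remaining energy (arbitrary units, assumed)
    dist_to_sink: float     # distance to the base station (assumed)
    neighbours: int         # neighbour count, used as a density proxy (assumed)


def election_score(node, w_energy=0.5, w_dist=0.3, w_density=0.2):
    # Hypothetical score: more energy, shorter distance, and higher
    # density all make a node a better cluster-head candidate.
    return (w_energy * node.residual_energy
            + w_dist / (1.0 + node.dist_to_sink)
            + w_density * node.neighbours)


def next_cluster_head(current_head, members, threshold):
    # Keep the current head while its residual energy stays above the
    # threshold; only then is a new election held among the members.
    if current_head is not None and current_head.residual_energy > threshold:
        return current_head
    return max(members, key=election_score)
```

    Under these assumptions the head is re-elected only when its energy falls to the threshold, which avoids the control overhead of an election in every round.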

  20. Diversity among galaxy clusters

    International Nuclear Information System (INIS)

    Struble, M.F.; Rood, H.J.

    1988-01-01

    The classification of galaxy clusters is discussed. Consideration is given to the classification schemes of Abell (1950s), Zwicky (1950s), Morgan, Matthews, and Schmidt (1964), and Morgan-Bautz (1970). Galaxies can be classified based on morphology, chemical composition, spatial distribution, and motion. The correlation between a galaxy's environment and morphology is examined. The classification scheme of Rood-Sastry (1971), which is based on cluster morphology and galaxy population, is described. The six types of clusters they define are: (1) a cD-cluster dominated by a single large galaxy, (2) a cluster dominated by a binary, (3) a core-halo cluster, (4) a cluster dominated by several bright galaxies, (5) a cluster appearing flattened, and (6) an irregularly shaped cluster. Attention is also given to the evolution of cluster structures, which is related to initial density and cluster motion.

  1. The cluster analysis based on non-teacher artificial neural network for the danger prediction of coal spontaneous fire

    Energy Technology Data Exchange (ETDEWEB)

    Wang, D.; Wang, J. [China University of Mining and Technology (China)

    1999-04-01

    This paper focuses on the problem of predicting the danger level of spontaneous fire in coal mines. Firstly, the inadequacy of the present artificial neural network prediction model is analysed. Then a new cluster model based on a non-teacher (unsupervised) neural network is constructed according to the danger judgement standards given by experts. On this basis, by adopting the error square sum criterion and its algorithm, the corresponding prediction software is developed and applied in two working faces of Chaili Coal Mine. The forecasting results are of considerable significance for the prevention of spontaneous fire. 4 refs., 1 fig., 1 tab.

  2. Risk Assessment for Bridges Safety Management during Operation Based on Fuzzy Clustering Algorithm

    Directory of Open Access Journals (Sweden)

    Xia Hanyu

    2016-01-01

    Full Text Available In recent years, many large-span and large sea-crossing bridges have been built, and bridge accidents caused by improper operational management occur frequently. To explore better methods for assessing the risk of bridge operation departments, a method based on a fuzzy clustering algorithm is selected. The implementation steps of the fuzzy clustering algorithm are described, the risk evaluation system is built, and, taking Taizhou Bridge as an example, the quantification of risk factors is described. The clustering algorithm based on fuzzy equivalence is then computed in MATLAB 2010a. Finally, the operation management departments of Taizhou Bridge are classified and ranked according to their degree of risk, and the safety situation of the operation departments is analyzed.
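
    Clustering based on fuzzy equivalence is commonly done by taking the max-min transitive closure of a fuzzy similarity matrix and then cutting it at a level lambda. The sketch below shows that generic textbook procedure in Python rather than MATLAB; the function names and the example matrix are assumptions, and the abstract does not specify the authors' exact steps or data.

```python
def max_min_compose(a, b):
    # Max-min composition of two square fuzzy relations.
    n = len(a)
    return [[max(min(a[i][k], b[k][j]) for k in range(n)) for j in range(n)]
            for i in range(n)]


def transitive_closure(r):
    # Repeated squaring until the relation stops changing. For a reflexive,
    # symmetric similarity matrix this yields a fuzzy equivalence relation.
    while True:
        r2 = max_min_compose(r, r)
        if r2 == r:
            return r
        r = r2


def lambda_cut_clusters(r, lam):
    # Items i and j share a cluster when closure[i][j] >= lam.
    closure = transitive_closure(r)
    clusters, assigned = [], set()
    for i in range(len(closure)):
        if i in assigned:
            continue
        group = [j for j in range(len(closure)) if closure[i][j] >= lam]
        assigned.update(group)
        clusters.append(group)
    return clusters


# Hypothetical similarity matrix for three departments.
similarity = [[1.0, 0.8, 0.3],
              [0.8, 1.0, 0.4],
              [0.3, 0.4, 1.0]]
```

    Lowering lambda merges clusters, so sweeping lambda from 1 towards 0 produces a hierarchy from which a risk grouping can be chosen.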

  3. Integrating Data Clustering and Visualization for the Analysis of 3D Gene Expression Data

    Energy Technology Data Exchange (ETDEWEB)

    Data Analysis and Visualization (IDAV) and the Department of Computer Science, University of California, Davis, One Shields Avenue, Davis CA 95616, USA; International Research Training Group "Visualization of Large and Unstructured Data Sets," University of Kaiserslautern, Germany; Computational Research Division, Lawrence Berkeley National Laboratory, One Cyclotron Road, Berkeley, CA 94720, USA; Genomics Division, Lawrence Berkeley National Laboratory, One Cyclotron Road, Berkeley CA 94720, USA; Life Sciences Division, Lawrence Berkeley National Laboratory, One Cyclotron Road, Berkeley CA 94720, USA; Computer Science Division, University of California, Berkeley, CA, USA; Computer Science Department, University of California, Irvine, CA, USA; All authors are with the Berkeley Drosophila Transcription Network Project, Lawrence Berkeley National Laboratory; Rubel, Oliver; Weber, Gunther H.; Huang, Min-Yu; Bethel, E. Wes; Biggin, Mark D.; Fowlkes, Charless C.; Hendriks, Cris L. Luengo; Keranen, Soile V. E.; Eisen, Michael B.; Knowles, David W.; Malik, Jitendra; Hagen, Hans; Hamann, Bernd

    2008-05-12

    The recent development of methods for extracting precise measurements of spatial gene expression patterns from three-dimensional (3D) image data opens the way for new analyses of the complex gene regulatory networks controlling animal development. We present an integrated visualization and analysis framework that supports user-guided data clustering to aid exploration of these new complex datasets. The interplay of data visualization and clustering-based data classification leads to improved visualization and enables a more detailed analysis than previously possible. We discuss (i) integration of data clustering and visualization into one framework; (ii) application of data clustering to 3D gene expression data; (iii) evaluation of the number of clusters k in the context of 3D gene expression clustering; and (iv) improvement of overall analysis quality via dedicated post-processing of clustering results based on visualization. We discuss the use of this framework to objectively define spatial pattern boundaries and temporal profiles of genes and to analyze how mRNA patterns are controlled by their regulatory transcription factors.

  4. Preconditions for Emergence of Lithuanian Clusters: from Informal Cooperation to Its Legitimation

    Directory of Open Access Journals (Sweden)

    Grumadaitė Kristina

    2017-06-01

    Full Text Available This paper reveals preconditions for the emergence of clusters as self-organisation-based industrial systems in a context in which cooperation traditions are insufficiently developed. These preconditions reflect the principles of the emergence of self-organising complex adaptive systems that are analysed in complexity theory. Those principles are based on the initiation of non-equilibrium and its purposeful direction into the creation of a new order. This paper highlights the main external and internal tensions that influence informal or formal clustering of enterprises, while various change agents perform different roles in enabling self-organising processes to occur.

  5. Lattice and Valence Electronic Structures of Crystalline Octahedral Molybdenum Halide Clusters-Based Compounds, Cs2[Mo6X14] (X = Cl, Br, I), Studied by Density Functional Theory Calculations.

    Science.gov (United States)

    Saito, Norio; Cordier, Stéphane; Lemoine, Pierric; Ohsawa, Takeo; Wada, Yoshiki; Grasset, Fabien; Cross, Jeffrey S; Ohashi, Naoki

    2017-06-05

    The electronic and crystal structures of Cs2[Mo6X14] (X = Cl, Br, I) cluster-based compounds were investigated by density functional theory (DFT) simulations and experimental methods such as powder X-ray diffraction, ultraviolet-visible spectroscopy, and X-ray photoemission spectroscopy (XPS). The experimentally determined lattice parameters were in good agreement with theoretically optimized ones, indicating the usefulness of DFT calculations for the structural investigation of these clusters. The calculated band gaps of these compounds reproduced those experimentally determined by UV-vis reflectance within an error of a few tenths of an eV. Core-level XPS and effective charge analyses indicated bonding states of the halogens changed according to their sites. The XPS valence spectra were fairly well reproduced by simulations based on the projected electron density of states weighted with cross sections of Al Kα, suggesting that DFT calculations can predict the electronic properties of metal-cluster-based crystals with good accuracy.

  6. A comparison of heuristic and model-based clustering methods for dietary pattern analysis.

    Science.gov (United States)

    Greve, Benjamin; Pigeot, Iris; Huybrechts, Inge; Pala, Valeria; Börnhorst, Claudia

    2016-02-01

    Cluster analysis is widely applied to identify dietary patterns. A new method based on Gaussian mixture models (GMM) seems to be more flexible compared with the commonly applied k-means and Ward's method. In the present paper, these clustering approaches are compared to find the most appropriate one for clustering dietary data. The clustering methods were applied to simulated data sets with different cluster structures to compare their performance knowing the true cluster membership of observations. Furthermore, the three methods were applied to FFQ data assessed in 1791 children participating in the IDEFICS (Identification and Prevention of Dietary- and Lifestyle-Induced Health Effects in Children and Infants) Study to explore their performance in practice. The GMM outperformed the other methods in the simulation study in 72 % up to 100 % of cases, depending on the simulated cluster structure. Comparing the computationally less complex k-means and Ward's methods, the performance of k-means was better in 64-100 % of cases. Applied to real data, all methods identified three similar dietary patterns which may be roughly characterized as a 'non-processed' cluster with a high consumption of fruits, vegetables and wholemeal bread, a 'balanced' cluster with only slight preferences of single foods and a 'junk food' cluster. The simulation study suggests that clustering via GMM should be preferred due to its higher flexibility regarding cluster volume, shape and orientation. The k-means seems to be a good alternative, being easier to use while giving similar results when applied to real data.

  7. Improving clustering with metabolic pathway data.

    Science.gov (United States)

    Milone, Diego H; Stegmayer, Georgina; López, Mariana; Kamenetzky, Laura; Carrari, Fernando

    2014-04-10

    It is a common practice in bioinformatics to validate each group returned by a clustering algorithm through manual analysis, according to a-priori biological knowledge. This procedure helps finding functionally related patterns to propose hypotheses for their behavior and the biological processes involved. Therefore, this knowledge is used only as a second step, after data are just clustered according to their expression patterns. Thus, it could be very useful to be able to improve the clustering of biological data by incorporating prior knowledge into the cluster formation itself, in order to enhance the biological value of the clusters. A novel training algorithm for clustering is presented, which evaluates the biological internal connections of the data points while the clusters are being formed. Within this training algorithm, the calculation of distances among data points and neurons centroids includes a new term based on information from well-known metabolic pathways. The standard self-organizing map (SOM) training versus the biologically-inspired SOM (bSOM) training were tested with two real data sets of transcripts and metabolites from Solanum lycopersicum and Arabidopsis thaliana species. Classical data mining validation measures were used to evaluate the clustering solutions obtained by both algorithms. Moreover, a new measure that takes into account the biological connectivity of the clusters was applied. The results of bSOM show important improvements in the convergence and performance for the proposed clustering method in comparison to standard SOM training, in particular, from the application point of view. Analyses of the clusters obtained with bSOM indicate that including biological information during training can certainly increase the biological value of the clusters found with the proposed method. 
It is worth highlighting that this has effectively improved the results, which can simplify their further analysis. The algorithm is available as a

  8. 75 FR 53667 - Space Coast Regional Innovation Cluster Competition

    Science.gov (United States)

    2010-09-01

    ... Coast Regional Innovation Cluster Competition AGENCY: Economic Development Administration (EDA... upcoming availability of funding for the Space Coast Regional Innovation Cluster (RIC) Competition under... economic development initiatives aligned with regional cluster and competitiveness analyses to sustain the...

  9. Medical Imaging Lesion Detection Based on Unified Gravitational Fuzzy Clustering

    Directory of Open Access Journals (Sweden)

    Jean Marie Vianney Kinani

    2017-01-01

    Full Text Available We develop a swift, robust, and practical tool for detecting brain lesions with minimal user intervention to assist clinicians and researchers in the diagnosis process, radiosurgery planning, and assessment of the patient's response to the therapy. We propose a unified gravitational fuzzy clustering-based segmentation algorithm, which integrates the Newtonian concept of gravity into fuzzy clustering. We first perform fuzzy rule-based image enhancement on our database, which comprises T1/T2 weighted magnetic resonance (MR) and fluid-attenuated inversion recovery (FLAIR) images, to facilitate a smoother segmentation. The scalar output obtained is fed into a gravitational fuzzy clustering algorithm, which separates healthy structures from unhealthy ones. Finally, the lesion contour is automatically outlined through the initialization-free level set evolution method. An advantage of this lesion detection algorithm is its precision and its simultaneous use of features computed from the intensity properties of the MR scan in a cascading pattern, which makes the computation fast, robust, and self-contained. Furthermore, we validate our algorithm with large-scale experiments using clinical and synthetic brain lesion datasets. As a result, an 84%–93% overlap performance is obtained, with an emphasis on robustness with respect to different and heterogeneous types of lesion and a swift computation time.

  10. Bootstrap-Based Improvements for Inference with Clustered Errors

    OpenAIRE

    Doug Miller; A. Colin Cameron; Jonah B. Gelbach

    2006-01-01

    Microeconometrics researchers have increasingly realized the essential need to account for any within-group dependence in estimating standard errors of regression parameter estimates. The typical preferred solution is to calculate cluster-robust or sandwich standard errors that permit quite general heteroskedasticity and within-cluster error correlation, but presume that the number of clusters is large. In applications with few (5-30) clusters, standard asymptotic tests can over-reject consid...

  11. Ligand cluster-based protein network and ePlatton, a multi-target ligand finder.

    Science.gov (United States)

    Du, Yu; Shi, Tieliu

    2016-01-01

    Small molecules are information carriers that make cells aware of external changes and couple internal metabolic and signalling pathway systems with each other. In specific physiological states, natural or artificial molecules are used to interact with selective biological targets to activate or inhibit their functions and achieve the expected biological and physiological output. Millions of years of evolution have optimized biological processes and pathways, and the endocrine and immune systems cannot work properly without some key small molecules. Over thousands of years, the human race has found many medicines against diseases through trial-and-error experience. In recent decades, with the deepening understanding of life and the progress of molecular biology, researchers have spared no effort to design molecules targeting one or two key enzymes and receptors related to the corresponding diseases. But recent studies in pharmacogenomics have shown that polypharmacology may be necessary for the effects of drugs, which challenges the paradigm 'one drug, one target, one disease'. Nowadays, cheminformatics and structural biology can help us take reasoned advantage of polypharmacology to design next-generation promiscuous drugs and drug combination therapies. 234,591 protein-ligand interactions were extracted from ChEMBL. Based on 2D structure similarity, 13,769 ligand clusters emerged from 156,151 distinct ligands, which were recognized by 1477 proteins. Ligand cluster- and sequence-based protein networks (LCBN, SBN) were constructed, compared and analysed. To assist compound design, explore polypharmacology and find possible drug combinations, we integrated the pathway, disease, and drug adverse reaction data and the relationships between targets and ligand clusters into the web platform ePlatton, which is available at http://www.megabionet.org/eplatton.
Although there were some disagreements between the LCBN and SBN, communities in both networks were largely the same

  12. A Novel Double Cluster and Principal Component Analysis-Based Optimization Method for the Orbit Design of Earth Observation Satellites

    Directory of Open Access Journals (Sweden)

    Yunfeng Dong

    2017-01-01

    Full Text Available The weighted sum and genetic algorithm-based hybrid method (WSGA-based HM), which has been applied to multiobjective orbit optimizations, is negatively influenced by human factors through the artificial choice of the weight coefficients in the weighted sum method and by the slow convergence of the GA. To address these two problems, a cluster and principal component analysis-based optimization method (CPC-based OM) is proposed, in which candidate orbits are randomly generated in successive batches until the optimal orbit is obtained using a data mining method, namely cluster analysis based on principal components. Then, a second cluster analysis of the orbital elements is introduced into CPC-based OM to improve the convergence, yielding a novel double cluster and principal component analysis-based optimization method (DCPC-based OM). In DCPC-based OM, the cluster analysis based on principal components has the advantage of reducing human influence, and the cluster analysis based on the six orbital elements reduces the search space and so accelerates convergence. The test results from a multiobjective numerical benchmark function and the orbit design results for an Earth observation satellite show that DCPC-based OM converges more efficiently than WSGA-based HM. DCPC-based OM also reduces, to some degree, the influence of human factors present in WSGA-based HM.

  13. An Improved Semisupervised Outlier Detection Algorithm Based on Adaptive Feature Weighted Clustering

    Directory of Open Access Journals (Sweden)

    Tingquan Deng

    2016-01-01

    Full Text Available There exist already various approaches to outlier detection, in which semisupervised methods achieve encouraging superiority due to the introduction of prior knowledge. In this paper, an adaptive feature weighted clustering-based semisupervised outlier detection strategy is proposed. This method maximizes the membership degree of a labeled normal object to the cluster it belongs to and minimizes the membership degrees of a labeled outlier to all clusters. In consideration of distinct significance of features or components in a dataset in determining an object being an inlier or outlier, each feature is adaptively assigned different weights according to the deviation degrees between this feature of all objects and that of a certain cluster prototype. A series of experiments on a synthetic dataset and several real-world datasets are implemented to verify the effectiveness and efficiency of the proposal.

  14. Virtual screening by a new Clustering-based Weighted Similarity Extreme Learning Machine approach.

    Science.gov (United States)

    Pasupa, Kitsuchart; Kudisthalert, Wasu

    2018-01-01

    Machine learning techniques are becoming popular in virtual screening tasks. One powerful machine learning algorithm is the Extreme Learning Machine (ELM), which has been used in many applications and has recently been applied to virtual screening. We propose the Weighted Similarity ELM (WS-ELM), which is based on a single-layer feed-forward neural network that uses 16 different similarity coefficients as activation functions in the hidden layer. It is known that the performance of conventional ELM is not robust due to random weight selection in the hidden layer. Thus, we propose a Clustering-based WS-ELM (CWS-ELM) that deterministically assigns weights by utilising clustering algorithms, i.e. k-means clustering and support vector clustering. The experiments were conducted on one of the most challenging datasets, the Maximum Unbiased Validation Dataset, which contains 17 activity classes carefully selected from PubChem. The proposed algorithms were then compared with other machine learning techniques such as support vector machine, random forest, and similarity searching. The results show that CWS-ELM in conjunction with support vector clustering yields the best performance when utilised together with the Sokal/Sneath(1) coefficient. Furthermore, the ECFP_6 fingerprint presents the best results in our framework compared to the other types of fingerprints, namely ECFP_4, FCFP_4, and FCFP_6.

  15. Cluster-based Dynamic Energy Management for Collaborative Target Tracking in Wireless Sensor Networks

    Directory of Open Access Journals (Sweden)

    Dao-Wei Bi

    2007-07-01

    Full Text Available A primary criterion of wireless sensor network is energy efficiency. Focused on the energy problem of target tracking in wireless sensor networks, this paper proposes a cluster-based dynamic energy management mechanism. Target tracking problem is formulated by the multi-sensor detection model as well as energy consumption model. A distributed adaptive clustering approach is investigated to form a reasonable routing framework which has uniform cluster head distribution. Dijkstra's algorithm is utilized to obtain optimal intra-cluster routing. Target position is predicted by particle filter. The predicted target position is adopted to estimate the idle interval of sensor nodes. Hence, dynamic awakening approach is exploited to prolong sleep time of sensor nodes so that the operation energy consumption of wireless sensor network can be reduced. The sensor nodes around the target wake up on time and act as sensing candidates. With the candidate sensor nodes and predicted target position, the optimal sensor node selection is considered. Binary particle swarm optimization is proposed to minimize the total energy consumption during collaborative sensing and data reporting. Experimental results verify that the proposed clustering approach establishes a low-energy communication structure while the energy efficiency of wireless sensor networks is enhanced by cluster-based dynamic energy management.
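
    The intra-cluster routing step mentioned in this abstract relies on Dijkstra's algorithm. As a minimal sketch, assuming a symmetric link-cost graph inside one cluster, the shortest paths can be computed once from the cluster head and each node then forwards along its predecessor. The graph, the node names, and the helper `route_to_head` are illustrative assumptions, not taken from the paper.

```python
import heapq


def dijkstra(graph, source):
    # graph: {node: {neighbour: link_cost}}; returns distances and predecessors.
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, cost in graph.get(u, {}).items():
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    return dist, prev


def route_to_head(prev, node, head):
    # Follow predecessors from a member node back to the cluster head.
    path = [node]
    while path[-1] != head:
        path.append(prev[path[-1]])
    return path


# Hypothetical cluster: link costs stand in for transmission energy.
cluster = {
    "CH": {"a": 1.0, "b": 4.0},
    "a": {"CH": 1.0, "b": 1.5},
    "b": {"CH": 4.0, "a": 1.5},
}
dist, prev = dijkstra(cluster, "CH")
```

    Because the link costs are symmetric in this sketch, a tree rooted at the cluster head gives every member its cheapest multi-hop route in a single run.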

  16. Hedgehog bases for A_n cluster polylogarithms and an application to six-point amplitudes

    Energy Technology Data Exchange (ETDEWEB)

    Parker, Daniel E.; Scherlis, Adam; Spradlin, Marcus; Volovich, Anastasia [Department of Physics, Brown University, Providence RI 02912 (United States)

    2015-11-20

    Multi-loop scattering amplitudes in N=4 Yang-Mills theory possess cluster algebra structure. In order to develop a computational framework which exploits this connection, we show how to construct bases of Goncharov polylogarithm functions, at any weight, whose symbol alphabet consists of cluster coordinates on the A_n cluster algebra. Using such a basis we present a new expression for the 2-loop 6-particle NMHV amplitude which makes some of its cluster structure manifest.

  17. Detection of protein complex from protein-protein interaction network using Markov clustering

    International Nuclear Information System (INIS)

    Ochieng, P J; Kusuma, W A; Haryanto, T

    2017-01-01

    Detection of complexes, or groups of functionally related proteins, is an important challenge in the analysis of biological networks. However, existing algorithms for identifying protein complexes are insufficient when applied to dense networks of experimentally derived interaction data. We therefore introduce a graph clustering method based on the Markov clustering (MCL) algorithm to identify protein complexes within highly interconnected protein-protein interaction (PPI) networks. A protein-protein interaction network was first constructed as a geometrical network, and the network was then partitioned using Markov clustering to detect protein complexes. The proposed method is illustrated by its application to human proteins associated with type II diabetes mellitus. Flow simulation of the MCL algorithm was performed first, and the topological properties of the resulting network were analysed to detect protein complexes. The results indicate that the proposed method successfully detected a total of 34 complexes, with 11 complexes consisting of overlapping modules and 20 non-overlapping modules. The major complex consisted of 102 proteins and 521 interactions, with cluster modularity and density of 0.745 and 0.101, respectively. The comparison analysis revealed that MCL outperforms the AP, MCODE and SCPS algorithms, with a high clustering coefficient (0.751), network density and modularity index (0.630). This demonstrated that MCL was the most reliable and efficient of the compared graph clustering algorithms for detecting protein complexes from PPI networks. (paper)
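
    Markov clustering alternates two matrix operations on a column-stochastic version of the network: expansion (squaring the matrix, which spreads flow) and inflation (raising entries to a power and renormalizing, which sharpens strong flows). The sketch below is a small pure-Python illustration of that generic scheme; the inflation value, tolerance, and toy network are assumptions and do not reproduce the study's settings or data.

```python
def normalize_columns(m):
    # Scale each column so that it sums to 1 (column-stochastic matrix).
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] for j in range(n)] for i in range(n)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mcl(adjacency, inflation=2.0, iterations=50, tol=1e-9):
    n = len(adjacency)
    # Self-loops are added, as is usual in MCL, before normalizing.
    m = normalize_columns([[float(adjacency[i][j]) + (1.0 if i == j else 0.0)
                            for j in range(n)] for i in range(n)])
    for _ in range(iterations):
        expanded = matmul(m, m)                                  # expansion
        inflated = normalize_columns(
            [[x ** inflation for x in row] for row in expanded])  # inflation
        delta = max(abs(inflated[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = inflated
        if delta < tol:
            break
    # Each non-empty row of the limit matrix lists the members of one cluster.
    clusters = {frozenset(j for j, x in enumerate(row) if x > 1e-6) for row in m}
    return sorted(sorted(c) for c in clusters if c)


# Toy network: two disconnected triangles of interacting proteins.
toy = [[0, 1, 1, 0, 0, 0],
       [1, 0, 1, 0, 0, 0],
       [1, 1, 0, 0, 0, 0],
       [0, 0, 0, 0, 1, 1],
       [0, 0, 0, 1, 0, 1],
       [0, 0, 0, 1, 1, 0]]
```

    A larger inflation value tends to give finer clusters; real PPI networks would need a sparse-matrix implementation rather than nested lists.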

  18. DMINDA: an integrated web server for DNA motif identification and analyses.

    Science.gov (United States)

    Ma, Qin; Zhang, Hanyuan; Mao, Xizeng; Zhou, Chuan; Liu, Bingqiang; Chen, Xin; Xu, Ying

    2014-07-01

    DMINDA (DNA motif identification and analyses) is an integrated web server for DNA motif identification and analyses, which is accessible at http://csbl.bmb.uga.edu/DMINDA/. This web site is freely available to all users and there is no login requirement. This server provides a suite of cis-regulatory motif analysis functions on DNA sequences, which are important to elucidation of the mechanisms of transcriptional regulation: (i) de novo motif finding for a given set of promoter sequences along with statistical scores for the predicted motifs derived based on information extracted from a control set, (ii) scanning motif instances of a query motif in provided genomic sequences, (iii) motif comparison and clustering of identified motifs, and (iv) co-occurrence analyses of query motifs in given promoter sequences. The server is powered by a backend computer cluster with over 150 computing nodes, and is particularly useful for motif prediction and analyses in prokaryotic genomes. We believe that DMINDA, as a new and comprehensive web server for cis-regulatory motif finding and analyses, will benefit the genomic research community in general and prokaryotic genome researchers in particular. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

  19. Trend analysis using non-stationary time series clustering based on the finite element method

    Science.gov (United States)

    Gorji Sefidmazgi, M.; Sayemuzzaman, M.; Homaifar, A.; Jha, M. K.; Liess, S.

    2014-05-01

    In order to analyze low-frequency variability of climate, it is useful to model the climatic time series with multiple linear trends and locate the times of significant changes. In this paper, we have used non-stationary time series clustering to find change points in the trends. Clustering in a multi-dimensional non-stationary time series is challenging, since the problem is mathematically ill-posed. Clustering based on the finite element method (FEM) is one of the methods that can analyze multidimensional time series. One important attribute of this method is that it is not dependent on any statistical assumption and does not need local stationarity in the time series. In this paper, it is shown how the FEM-clustering method can be used to locate change points in the trend of temperature time series from in situ observations. This method is applied to the temperature time series of North Carolina (NC) and the results represent region-specific climate variability despite higher frequency harmonics in climatic time series. Next, we investigated the relationship between the climatic indices with the clusters/trends detected based on this clustering method. It appears that the natural variability of climate change in NC during 1950-2009 can be explained mostly by AMO and solar activity.

  20. Cluster-based control of a separating flow over a smoothly contoured ramp

    Science.gov (United States)

    Kaiser, Eurika; Noack, Bernd R.; Spohn, Andreas; Cattafesta, Louis N.; Morzyński, Marek

    2017-12-01

    The ability to manipulate and control fluid flows is of great importance in many scientific and engineering applications. The proposed closed-loop control framework addresses a key issue of model-based control: The actuation effect often results from slow dynamics of strongly nonlinear interactions which the flow reveals at timescales much longer than the prediction horizon of any model. Hence, we employ a probabilistic approach based on a cluster-based discretization of the Liouville equation for the evolution of the probability distribution. The proposed methodology frames high-dimensional, nonlinear dynamics into low-dimensional, probabilistic, linear dynamics which considerably simplifies the optimal control problem while preserving nonlinear actuation mechanisms. The data-driven approach builds upon a state space discretization using a clustering algorithm which groups kinematically similar flow states into a low number of clusters. The temporal evolution of the probability distribution on this set of clusters is then described by a control-dependent Markov model. This Markov model can be used as predictor for the ergodic probability distribution for a particular control law. This probability distribution approximates the long-term behavior of the original system on which basis the optimal control law is determined. We examine how the approach can be used to improve the open-loop actuation in a separating flow dominated by Kelvin-Helmholtz shedding. For this purpose, the feature space, in which the model is learned, and the admissible control inputs are tailored to strongly oscillatory flows.
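
    The core of the cluster-based Markov model can be sketched for a single, fixed control law: once each flow snapshot carries a cluster label, transition probabilities are estimated by counting label-to-label moves, and the long-term (ergodic) distribution follows by iterating the chain. The label sequence and function names below are illustrative assumptions; the paper's control-dependent model and clustering step are not reproduced.

```python
def transition_matrix(labels, n_clusters):
    # P[i][j] estimates the probability of moving from cluster i to cluster j
    # in one time step (row-stochastic convention, assumed here).
    counts = [[0] * n_clusters for _ in range(n_clusters)]
    for a, b in zip(labels, labels[1:]):
        counts[a][b] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


def stationary_distribution(matrix, iterations=1000):
    # Power iteration from a uniform start: p <- p P.
    n = len(matrix)
    p = [1.0 / n] * n
    for _ in range(iterations):
        p = [sum(p[i] * matrix[i][j] for i in range(n)) for j in range(n)]
    return p


# Hypothetical cluster labels of successive flow snapshots.
labels = [0, 1, 0, 1, 1, 0]
P = transition_matrix(labels, 2)
pi = stationary_distribution(P)
```

    Repeating this estimate for each admissible control input would give one candidate distribution per control law, which is the quantity the abstract describes as the basis for choosing the optimal law.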

  1. An Integrated Intrusion Detection Model of Cluster-Based Wireless Sensor Network.

    Science.gov (United States)

    Sun, Xuemei; Yan, Bo; Zhang, Xinzhong; Rong, Chuitian

    2015-01-01

    Considering the characteristics of wireless sensor networks, this paper combines anomaly and misuse detection and proposes an integrated detection model for cluster-based wireless sensor networks, aiming to raise the detection rate and reduce the false rate. The Adaboost algorithm with hierarchical structures is used for anomaly detection at sensor nodes, cluster-head nodes and Sink nodes. A back-propagation network optimized by the Cultural Algorithm and the Artificial Fish Swarm Algorithm is applied to misuse detection at the Sink node. Extensive simulations demonstrate that this integrated model performs strongly at intrusion detection.

  2. Clustering cliques for graph-based summarization of the biomedical research literature

    DEFF Research Database (Denmark)

    Zhang, Han; Fiszman, Marcelo; Shin, Dongwook

    2013-01-01

    Background: Graph-based notions are increasingly used in biomedical data mining and knowledge discovery tasks. In this paper, we present a clique-clustering method to automatically summarize graphs of semantic predications produced from PubMed citations (titles and abstracts). Results: SemRep is used to extract semantic predications from the citations returned by a PubMed search. Cliques were identified from frequently occurring predications with highly connected arguments filtered by degree centrality. Themes contained in the summary were identified with a hierarchical clustering algorithm...

  3. Cluster-Based Adaptation Using Density Forest for HMM Phone Recognition

    DEFF Research Database (Denmark)

    Abou-Zleikha, Mohamed; Tan, Zheng-Hua; Christensen, Mads Græsbøll

    2014-01-01

    The dissimilarity between the training and test data in speech recognition systems is known to have a considerable effect on the recognition accuracy. To solve this problem, we use a density forest to cluster the data and use the maximum a posteriori (MAP) method to build cluster-based adapted Gaussian mixture models (GMMs) in HMM speech recognition. Specifically, a set of bagged versions of the training data for each state in the HMM is generated, and each of these versions is used to generate one GMM and one tree in the density forest. Thereafter, an acoustic model forest is built by replacing the data of each leaf (cluster) in each tree with the corresponding GMM adapted by the leaf data using the MAP method. The results show that the proposed approach achieves 3.8% (absolute) lower phone error rate (PER) compared with the standard HMM/GMM and 0.8% (absolute) lower PER compared with bagged HMM/GMM....

  4. [Optimization of cluster analysis based on drug resistance profiles of MRSA isolates].

    Science.gov (United States)

    Tani, Hiroya; Kishi, Takahiko; Gotoh, Minehiro; Yamagishi, Yuka; Mikamo, Hiroshige

    2015-12-01

    We examined 402 methicillin-resistant Staphylococcus aureus (MRSA) strains isolated from clinical specimens in our hospital between November 19, 2010 and December 27, 2011 to evaluate the similarity between cluster analysis of drug susceptibility tests and pulsed-field gel electrophoresis (PFGE). The results showed that the 402 strains tested were classified into 27 PFGE patterns (151 subtypes of patterns). Cluster analyses of drug susceptibility tests with the cut-off distance yielding a similar classification capability showed favorable results. With the MIC method, in which minimum inhibitory concentration (MIC) values were used directly, the level of agreement with PFGE was 74.2% when 15 drugs were tested. The Unweighted Pair Group Method with Arithmetic mean (UPGMA) was effective when the cut-off distance was 16. Using the SIR method, in which susceptible (S), intermediate (I), and resistant (R) were coded as 0, 2, and 3, respectively, according to the Clinical and Laboratory Standards Institute (CLSI) criteria, the level of agreement with PFGE was 75.9% when the number of drugs tested was 17, the method used for clustering was the UPGMA, and the cut-off distance was 3.6. In addition, to assess the reproducibility of the results, 10 strains were randomly sampled from the overall test and subjected to cluster analysis. This was repeated 100 times under the same conditions. The results indicated good reproducibility, with the level of agreement with PFGE showing a mean of 82.0%, standard deviation of 12.1%, and mode of 90.0% for the MIC method and a mean of 80.0%, standard deviation of 13.4%, and mode of 90.0% for the SIR method. In summary, cluster analysis for drug susceptibility tests is useful for the epidemiological analysis of MRSA.
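
    The SIR coding and UPGMA cut-off described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the S/I/R codes 0, 2, 3 are taken from the abstract, but the Manhattan distance, the four toy profiles, and the reuse of 3.6 as cut-off are assumptions made only for the example.

```python
from itertools import combinations

SIR_CODES = {"S": 0, "I": 2, "R": 3}  # coding reported in the abstract


def encode(profile):
    """Turn a string such as 'SSRI' into a numeric vector."""
    return [SIR_CODES[ch] for ch in profile]


def upgma_clusters(vectors, cutoff):
    """Average-linkage (UPGMA) agglomeration, stopped at a cut-off distance."""
    def dist(a, b):
        # Manhattan distance between coded profiles (an assumption).
        return sum(abs(x - y) for x, y in zip(a, b))

    def avg_link(c1, c2):
        total = sum(dist(vectors[i], vectors[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest average distance.
        (a, b), d = min(
            (((a, b), avg_link(clusters[a], clusters[b]))
             for a, b in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if d > cutoff:
            break
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # safe because a < b
    return [sorted(c) for c in clusters]


# Hypothetical susceptibility profiles for four strains and four drugs.
strains = ["SSRR", "SSRI", "RRSS", "RRSI"]
print(upgma_clusters([encode(s) for s in strains], cutoff=3.6))
```

    The resulting groups could then be compared with PFGE patterns to compute a level of agreement; how that agreement is scored is not specified in the abstract.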

  5. Clustering applications in financial and economic analysis of the crop production in the Russian regions

    Directory of Open Access Journals (Sweden)

    Gromov Vladislav Vladimirovich

    2013-08-01

    Full Text Available We used mathematical modeling, multivariate statistical analysis, and fuzzy sets to analyze the financial and economic state of crop production in Russian regions. We developed a system of indicators that characterizes the state of the agricultural sector in a region, based on the results of correlation, factor, and cluster analysis and on statistics of the Federal State Statistics Service. We performed clustering analyses to divide the regions of Russia into five groups on selected factors. Qualitative and quantitative characteristics of each cluster were obtained.

  6. Soil data clustering by using K-means and fuzzy K-means algorithm

    Directory of Open Access Journals (Sweden)

    E. Hot

    2016-06-01

    Full Text Available A problem of soil clustering based on the chemical characteristics of soil, and proper visual representation of the obtained results, is analysed in the paper. To that aim, K-means and fuzzy K-means algorithms are adapted for soil data clustering. A database of soil characteristics sampled in Montenegro is used for a comparative analysis of implemented algorithms. The procedure of setting proper values for control parameters of fuzzy K-means is illustrated on the used database. In addition, validation of clustering is made through visualisation. Classified soil data are presented on the static Google map and dynamic Open Street Map.
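
    For readers unfamiliar with fuzzy K-means, the standard update rules can be sketched as below. This is a generic textbook formulation under stated assumptions, not the adapted algorithm of the paper: the fuzzifier m, the starting centres, and the one-dimensional toy "soil" values are all hypothetical.

```python
def fuzzy_memberships(points, centers, m=2.0):
    """Membership of every point in every cluster; each row sums to one."""
    U = []
    for x in points:
        d = [sum((a - b) ** 2 for a, b in zip(x, c)) ** 0.5 for c in centers]
        if 0.0 in d:
            # A point sitting exactly on a centre belongs fully to it.
            row = [1.0 if di == 0.0 else 0.0 for di in d]
            total = sum(row)
            row = [r / total for r in row]
        else:
            expo = 2.0 / (m - 1.0)
            row = [1.0 / sum((di / dj) ** expo for dj in d) for di in d]
        U.append(row)
    return U


def update_centers(points, U, m=2.0):
    """Centres are means of the points weighted by membership ** m."""
    dim = len(points[0])
    centers = []
    for i in range(len(U[0])):
        w = [u[i] ** m for u in U]
        total = sum(w)
        centers.append([sum(wi * p[j] for wi, p in zip(w, points)) / total
                        for j in range(dim)])
    return centers


def fuzzy_kmeans(points, centers, m=2.0, iterations=50):
    for _ in range(iterations):
        U = fuzzy_memberships(points, centers, m)
        centers = update_centers(points, U, m)
    return centers, fuzzy_memberships(points, centers, m)


# Hypothetical one-feature samples forming two groups.
samples = [[1.0], [1.2], [0.8], [5.0], [5.2], [4.8]]
centers, U = fuzzy_kmeans(samples, centers=[[0.0], [6.0]])
print(centers)
```

    The fuzzifier m is the main control parameter: values close to 1 approach hard K-means, while larger values spread membership more evenly across clusters.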

  7. Multiple-Features-Based Semisupervised Clustering DDoS Detection Method

    Directory of Open Access Journals (Sweden)

    Yonghao Gu

    2017-01-01

    Full Text Available DDoS attack streams from different agent hosts converge at the victim host and become very large, which leads to system halt or network congestion. Therefore, an effective method is needed to detect DDoS attack behavior in the massive data stream. Supervised learning methods require large amounts of labeled data, which are often not available, while the unsupervised k-means algorithm has relatively low detection accuracy and convergence speed. To address both problems, this paper presents a semisupervised clustering detection method using multiple features. In this detection method, we first select three features according to the characteristics of DDoS attacks to form the detection feature vector. Then, the Multiple-Features-Based Constrained-K-Means (MF-CKM) algorithm is proposed based on semisupervised clustering. Finally, using the MIT Laboratory Scenario (DDoS) 1.0 data set, we verify that the proposed method can improve the convergence speed and accuracy of the algorithm when only a small amount of labeled data is used.
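
    One common form of constrained K-means can be sketched as follows; it is offered only as an illustration of the semisupervised idea and is not the MF-CKM algorithm itself, whose details the abstract does not give. The sketch assumes that every cluster has at least one labeled seed, that seeds keep their labels throughout, and that the two-feature traffic vectors are hypothetical.

```python
def constrained_kmeans(points, seed_labels, k, iterations=20):
    """points: list of feature tuples; seed_labels: {point index: cluster}."""
    dim = len(points[0])

    def mean(indices):
        return tuple(sum(points[i][j] for i in indices) / len(indices)
                     for j in range(dim))

    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Centroids start from the labeled seeds instead of random points.
    centroids = [mean([i for i, c in seed_labels.items() if c == cl])
                 for cl in range(k)]
    labels = [None] * len(points)
    for _ in range(iterations):
        new_labels = [
            # Labeled points are never reassigned; the rest go to the
            # nearest centroid.
            seed_labels[i] if i in seed_labels
            else min(range(k), key=lambda c: sqdist(p, centroids[c]))
            for i, p in enumerate(points)
        ]
        if new_labels == labels:
            break
        labels = new_labels
        centroids = [mean([i for i, c in enumerate(labels) if c == cl])
                     for cl in range(k)]
    return labels, centroids


# Hypothetical (packet rate, source-address entropy) vectors;
# cluster 0 = normal traffic, cluster 1 = attack traffic.
traffic = [(1.0, 0.10), (1.2, 0.20), (0.9, 0.15),
           (8.0, 0.90), (7.5, 0.80), (8.2, 0.95)]
labels, centroids = constrained_kmeans(traffic, seed_labels={0: 0, 3: 1}, k=2)
print(labels)
```

    Seeding the centroids from labeled points is what typically shortens convergence relative to randomly initialised k-means.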

  8. Using Illness Perceptions to Cluster Chronic Pain Patients

    DEFF Research Database (Denmark)

    Frostholm, Lisbeth; Hornemann, Christina; Ørnbøl, Eva

    2018-01-01

    OBJECTIVES: The aims of our study were (1) To identify possible subgroups of chronic pain sufferers based on their illness perceptions (IPs); (2) To examine whether these subgroups differed in health status and health expenditure, and (3) To examine whether the subgroups differed in their response to participation in a lay-led Chronic Pain Self-Management Program (CPSMP). METHODS: Four hundred and twenty-four participants in a randomized controlled trial on the CPSMP completed a questionnaire on their perceptions of their chronic pain condition at baseline. In addition, they completed a range of health status measures at baseline and three months after end of participation in the CPSMP. Health care expenditure was obtained from Danish health registers. We performed cluster analyses to identify possible subgroups based on the participants' perceptions of their chronic pain condition. RESULTS: Cluster...

  9. ClusTrack: feature extraction and similarity measures for clustering of genome-wide data sets.

    Directory of Open Access Journals (Sweden)

    Halfdan Rydbeck

    Full Text Available Clustering is a popular technique for explorative analysis of data, as it can reveal subgroupings and similarities between data in an unsupervised manner. While clustering is routinely applied to gene expression data, there is a lack of appropriate general methodology for clustering of sequence-level genomic and epigenomic data, e.g. ChIP-based data. We here introduce a general methodology for clustering data sets of coordinates relative to a genome assembly, i.e. genomic tracks. By defining appropriate feature extraction approaches and similarity measures, we allow biologically meaningful clustering to be performed for genomic tracks using standard clustering algorithms. An implementation of the methodology is provided through a tool, ClusTrack, which allows fine-tuned clustering analyses to be specified through a web-based interface. We apply our methods to the clustering of occupancy of the H3K4me1 histone modification in samples from a range of different cell types. The majority of samples form meaningful subclusters, confirming that the definitions of features and similarity capture biological, rather than technical, variation between the genomic tracks. Input data and results are available, and can be reproduced, through a Galaxy Pages document at http://hyperbrowser.uio.no/hb/u/hb-superuser/p/clustrack. The clustering functionality is available as a Galaxy tool, under the menu option "Specialized analyzis of tracks", and the submenu option "Cluster tracks based on genome level similarity", at the Genomic HyperBrowser server: http://hyperbrowser.uio.no/hb/.

  10. Cluster-guided imaging-based CFD analysis of airflow and particle deposition in asthmatic human lungs

    Science.gov (United States)

    Choi, Jiwoong; Leblanc, Lawrence; Choi, Sanghun; Haghighi, Babak; Hoffman, Eric; Lin, Ching-Long

    2017-11-01

    The goal of this study is to assess inter-subject variability in delivery of orally inhaled drug products to small airways in asthmatic lungs. A recent multiscale imaging-based cluster analysis (MICA) of computed tomography (CT) lung images in an asthmatic cohort identified four clusters with statistically distinct structural and functional phenotypes associating with unique clinical biomarkers. Thus, we aimed to address inter-subject variability via inter-cluster variability. We selected a representative subject from each of the 4 asthma clusters as well as 1 male and 1 female healthy controls, and performed computational fluid and particle simulations on CT-based airway models of these subjects. The results from one severe and one non-severe asthmatic cluster subjects characterized by segmental airway constriction had increased particle deposition efficiency, as compared with the other two cluster subjects (one non-severe and one severe asthmatics) without airway constriction. Constriction-induced jets impinging on distal bifurcations led to excessive particle deposition. The results emphasize the impact of airway constriction on regional particle deposition rather than disease severity, demonstrating the potential of using cluster membership to tailor drug delivery. NIH Grants U01HL114494 and S10-RR022421, and FDA Grant U01FD005837. XSEDE.

  11. Supersymmetry for nuclear cluster systems

    International Nuclear Information System (INIS)

    Levai, G.; Cseh, J.; Van Isacker, P.

    2001-01-01

    A supersymmetry scheme is proposed for nuclear cluster systems. The bosonic sector of the superalgebra describes the relative motion of the clusters, while its fermionic sector is associated with their internal structure. An example of core+α configurations is discussed in which the core is a p-shell nucleus and the underlying superalgebra is U(4/12). The α-cluster states of the nuclei ²⁰Ne and ¹⁹F are analysed and correlations between their spectra, electric quadrupole transitions, and one-nucleon transfer reactions are interpreted in terms of U(4/12) supersymmetry. (author)

  12. clusterMaker: a multi-algorithm clustering plugin for Cytoscape

    Directory of Open Access Journals (Sweden)

    Morris John H

    2011-11-01

    Full Text Available Abstract. Background: In the post-genomic era, the rapid increase in high-throughput data calls for computational tools capable of integrating data of diverse types and facilitating recognition of biologically meaningful patterns within them. For example, protein-protein interaction data sets have been clustered to identify stable complexes, but scientists lack easily accessible tools to facilitate combined analyses of multiple data sets from different types of experiments. Here we present clusterMaker, a Cytoscape plugin that implements several clustering algorithms and provides network, dendrogram, and heat map views of the results. The Cytoscape network is linked to all of the other views, so that a selection in one is immediately reflected in the others. clusterMaker is the first Cytoscape plugin to implement such a wide variety of clustering algorithms and visualizations, including the only implementations of hierarchical clustering, dendrogram plus heat map visualization (tree view), k-means, k-medoid, SCPS, AutoSOME, and native (Java) MCL. Results: Results are presented in the form of three scenarios of use: analysis of protein expression data using a recently published mouse interactome and a mouse microarray data set of nearly one hundred diverse cell/tissue types; the identification of protein complexes in the yeast Saccharomyces cerevisiae; and the cluster analysis of the vicinal oxygen chelate (VOC) enzyme superfamily. For scenario one, we explore functionally enriched mouse interactomes specific to particular cellular phenotypes and apply fuzzy clustering. For scenario two, we explore the prefoldin complex in detail using both physical and genetic interaction clusters. For scenario three, we explore the possible annotation of a protein as a methylmalonyl-CoA epimerase within the VOC superfamily. Cytoscape session files for all three scenarios are provided in the Additional Files section. Conclusions: The Cytoscape plugin cluster

  13. Digital Signal Processing Based on a Clustering Algorithm for Ir/Au TES Microcalorimeter

    Science.gov (United States)

    Zen, N.; Kunieda, Y.; Takahashi, H.; Hiramoto, K.; Nakazawa, M.; Fukuda, D.; Ukibe, M.; Ohkubo, M.

    2006-02-01

    In recent years, cryogenic microcalorimeters using their superconducting transition edge have been under development for possible application to astronomical X-ray observations. To improve the energy resolution of superconducting transition edge sensors (TES), several correction methods have been developed. Among them, a clustering method based on digital signal processing has recently been proposed. In this paper, we applied the clustering method to Ir/Au bilayer TES. This method resulted in almost a 10% improvement in the energy resolution. Conversely, from the point of view of imaging X-ray spectroscopy, we applied the clustering method to pixellated Ir/Au-TES devices. We will thus show how a clustering method which sorts signals by their shapes is also useful for position identification.

  14. Completion Report for Well Cluster ER-6-1

    Energy Technology Data Exchange (ETDEWEB)

    Bechtel Nevada

    2004-10-01

    Well Cluster ER-6-1 was constructed for the U.S. Department of Energy, National Nuclear Security Administration Nevada Site Office in support of the Nevada Environmental Restoration Division at the Nevada Test Site, Nye County, Nevada. This work was initiated as part of the Groundwater Characterization Project, now known as the Underground Test Area Project. The well cluster is located in southeastern Yucca Flat. Detailed lithologic descriptions with stratigraphic assignments for Well Cluster ER-6-1 are included in this report. These are based on composite drill cuttings collected every 3 meters and conventional core samples taken below 639 meters, supplemented by geophysical log data. Detailed petrographic, chemical, and mineralogical studies of rock samples were conducted on 11 samples to resolve complex interrelationships between several of the Tertiary tuff units. Additionally, paleontological analyses by the U.S. Geological Survey confirmed the stratigraphic assignments below 539 meters within the Paleozoic sedimentary section. All three wells in the Well ER-6-1 cluster were drilled within the Quaternary and Tertiary alluvium section, the Tertiary volcanic section, and into the Paleozoic sedimentary section.

  15. Changing cluster composition in cluster randomised controlled trials: design and analysis considerations

    Science.gov (United States)

    2014-01-01

    Background: There are many methodological challenges in the conduct and analysis of cluster randomised controlled trials, but one that has received little attention is that of post-randomisation changes to cluster composition. To illustrate this, we focus on the issue of cluster merging, considering the impact on the design, analysis and interpretation of trial outcomes. Methods: We explored the effects of merging clusters on study power using standard methods of power calculation. We assessed the potential impacts on study findings of both homogeneous cluster merges (involving clusters randomised to the same arm of a trial) and heterogeneous merges (involving clusters randomised to different arms of a trial) by simulation. To determine the impact on bias and precision of treatment effect estimates, we applied standard methods of analysis to different populations under analysis. Results: Cluster merging produced a systematic reduction in study power. This effect depended on the number of merges and was most pronounced when variability in cluster size was at its greatest. Simulations demonstrate that the impact on analysis was minimal when cluster merges were homogeneous, with impact on study power being balanced by a change in observed intracluster correlation coefficient (ICC). We found a decrease in study power when cluster merges were heterogeneous, and the estimate of treatment effect was attenuated. Conclusions: Examples of cluster merges found in previously published reports of cluster randomised trials were typically homogeneous rather than heterogeneous. Simulations demonstrated that trial findings in such cases would be unbiased. However, simulations also showed that any heterogeneous cluster merges would introduce bias that would be hard to quantify, as well as having negative impacts on the precision of estimates obtained. Further methodological development is warranted to better determine how to analyse such trials appropriately. Interim recommendations

  16. An Enhanced PSO-Based Clustering Energy Optimization Algorithm for Wireless Sensor Network.

    Science.gov (United States)

    Vimalarani, C; Subramanian, R; Sivanandam, S N

    2016-01-01

    A Wireless Sensor Network (WSN) is a network formed by a large number of sensor nodes positioned in an application environment to monitor physical entities in a target area, for example in temperature monitoring, water level monitoring, pressure monitoring, health care, and various military applications. Most sensor nodes are equipped with self-supported battery power, through which they perform their operations and communicate with neighboring nodes. To maximize the lifetime of WSNs and improve their performance, energy conservation measures are essential. This paper proposes an Enhanced PSO-Based Clustering Energy Optimization (EPSO-CEO) algorithm for Wireless Sensor Networks in which clustering and cluster head selection are done using the Particle Swarm Optimization (PSO) algorithm so as to minimize the power consumption in the WSN. The performance metrics are evaluated and the results are compared with competitive clustering algorithms to validate the reduction in energy consumption.

  17. A density-based clustering model for community detection in complex networks

    Science.gov (United States)

    Zhao, Xiang; Li, Yantao; Qu, Zehui

    2018-04-01

    Network clustering (or graph partitioning) is an important technique for uncovering the underlying community structures in complex networks, which has been widely applied in various fields including astronomy, bioinformatics, sociology, and bibliometrics. In this paper, we propose a density-based clustering model for community detection in complex networks (DCCN). The key idea is to find group centers with a higher density than their neighbors and a relatively large integrated-distance from nodes with higher density. The experimental results indicate that our approach is efficient and effective for community detection of complex networks.
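
    The key idea of picking centers that are both dense and far from any denser node can be sketched on ordinary points. This is a generic density-peak illustration under stated assumptions, not DCCN itself: Euclidean distance stands in for the paper's integrated-distance, and the cut-off radius dc and the toy coordinates are hypothetical.

```python
def density_peaks(points, dc, n_centers):
    """Return indices of the n_centers points with the largest rho * delta."""
    n = len(points)

    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    D = [[dist(p, q) for q in points] for p in points]
    # rho: local density, the number of other points closer than dc.
    rho = [sum(1 for j in range(n) if j != i and D[i][j] < dc)
           for i in range(n)]
    # delta: distance to the nearest point of strictly higher density;
    # points with no denser neighbour get their largest distance instead.
    delta = []
    for i in range(n):
        higher = [D[i][j] for j in range(n) if rho[j] > rho[i]]
        delta.append(min(higher) if higher else max(D[i]))
    ranked = sorted(range(n), key=lambda i: rho[i] * delta[i], reverse=True)
    return ranked[:n_centers], rho, delta


# Two hypothetical groups, each a centre point with four close neighbours.
group_a = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (-0.1, 0.0), (0.0, -0.1)]
group_b = [(5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (4.9, 5.0), (5.0, 4.9)]
centers, rho, delta = density_peaks(group_a + group_b, dc=0.12, n_centers=2)
print(sorted(centers))
```

    Once centers are chosen, each remaining node is usually assigned to the community of its nearest denser neighbour; that step is omitted here.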

  18. Filaments and clusters of galaxies

    International Nuclear Information System (INIS)

    Soltan, A.

    1987-01-01

    A statistical test to investigate filaments of galaxies is performed. Only a particular form of filament is considered, viz. filaments connecting Abell clusters of galaxies. The relative position of triplets "cluster - field object - cluster" is analysed. Though neither the cluster sample nor the field object sample is homogeneous and complete, only a peculiar form of selection effects could affect the present statistics. Comparison of observational data with simulations shows that less than 15 per cent of all field galaxies are concentrated in filaments connecting rich clusters. Most of the field objects used in the analysis are not normal galaxies, and it is possible that this conclusion is not in conflict with apparent filaments seen in the Lick counts and in some nearby 3D maps of the galaxy distribution. 26 refs., 2 figs. (author)

  19. Clustering for Binary Data Sets by Using Genetic Algorithm-Incremental K-means

    Science.gov (United States)

    Saharan, S.; Baragona, R.; Nor, M. E.; Salleh, R. M.; Asrah, N. M.

    2018-04-01

    This research was initially driven by the lack of clustering algorithms that specifically focus on binary data. To fill this gap, Genetic Algorithms (GA), a promising technique for analysing this type of data, became the main subject of this research. For the purpose of this research, GA was combined with the Incremental K-means (IKM) algorithm to cluster binary data streams. In GAIKM, the objective function was based on a few sufficient statistics that may be easily and quickly calculated on binary numbers. The use of IKM gives an advantage in terms of fast convergence. The results show that GAIKM is an efficient and effective new clustering algorithm compared with other clustering algorithms and with IKM itself. In conclusion, GAIKM outperformed other clustering algorithms such as GCUK, IKM, Scalable K-means (SKM) and K-means clustering, and paves the way for future research involving missing data and outliers.

  20. Trend analysis using non-stationary time series clustering based on the finite element method

    OpenAIRE

    Gorji Sefidmazgi, M.; Sayemuzzaman, M.; Homaifar, A.; Jha, M. K.; Liess, S.

    2014-01-01

    In order to analyze low-frequency variability of climate, it is useful to model the climatic time series with multiple linear trends and locate the times of significant changes. In this paper, we have used non-stationary time series clustering to find change points in the trends. Clustering in a multi-dimensional non-stationary time series is challenging, since the problem is mathematically ill-posed. Clustering based on the finite element method (FEM) is one of the methods ...

  1. Pathway-based analyses.

    Science.gov (United States)

    Kent, Jack W

    2016-02-03

    New technologies for acquisition of genomic data, while offering unprecedented opportunities for genetic discovery, also impose severe burdens of interpretation and penalties for multiple testing. The Pathway-based Analyses Group of the Genetic Analysis Workshop 19 (GAW19) sought reduction of multiple-testing burden through various approaches to aggregation of high-dimensional data in pathways informed by prior biological knowledge. Experimental methods tested included the use of "synthetic pathways" (random sets of genes) to estimate power and false-positive error rate of methods applied to simulated data; data reduction via independent components analysis, single-nucleotide polymorphism (SNP)-SNP interaction, and use of gene sets to estimate genetic similarity; and general assessment of the efficacy of prior biological knowledge to reduce the dimensionality of complex genomic data. The work of this group explored several promising approaches to managing high-dimensional data, with the caveat that these methods are necessarily constrained by the quality of external bioinformatic annotation.

  2. An imbalance in cluster sizes does not lead to notable loss of power in cross-sectional, stepped-wedge cluster randomised trials with a continuous outcome.

    Science.gov (United States)

    Kristunas, Caroline A; Smith, Karen L; Gray, Laura J

    2017-03-07

    The current methodology for sample size calculations for stepped-wedge cluster randomised trials (SW-CRTs) is based on the assumption of equal cluster sizes. However, as is often the case in cluster randomised trials (CRTs), the clusters in SW-CRTs are likely to vary in size, which in other designs of CRT leads to a reduction in power. The effect of an imbalance in cluster size on the power of SW-CRTs has not previously been reported, nor what an appropriate adjustment to the sample size calculation should be to allow for any imbalance. We aimed to assess the impact of an imbalance in cluster size on the power of a cross-sectional SW-CRT and recommend a method for calculating the sample size of a SW-CRT when there is an imbalance in cluster size. The effect of varying degrees of imbalance in cluster size on the power of SW-CRTs was investigated using simulations. The sample size was calculated using both the standard method and two proposed adjusted design effects (DEs), based on those suggested for CRTs with unequal cluster sizes. The data were analysed using generalised estimating equations with an exchangeable correlation matrix and robust standard errors. An imbalance in cluster size was not found to have a notable effect on the power of SW-CRTs. The two proposed adjusted DEs resulted in trials that were generally considerably over-powered. We recommend that the standard method of sample size calculation for SW-CRTs be used, provided that the assumptions of the method hold. However, it would be beneficial to investigate, through simulation, what effect the maximum likely amount of inequality in cluster sizes would be on the power of the trial and whether any inflation of the sample size would be required.
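
    As background to the adjusted design effects mentioned above, the widely used design effect for parallel cluster randomised trials and its coefficient-of-variation adjustment for unequal cluster sizes can be sketched as below. These are the parallel-CRT formulas, shown as an assumption about what such adjustments look like; they are not the stepped-wedge design effects evaluated in the paper, and the cluster sizes and ICC are hypothetical.

```python
from statistics import mean, stdev


def design_effect_equal(m, icc):
    """Design effect for clusters of equal size m: 1 + (m - 1) * ICC."""
    return 1 + (m - 1) * icc


def design_effect_unequal(cluster_sizes, icc):
    """Adjustment using the coefficient of variation (CV) of cluster size:
    1 + ((CV**2 + 1) * mean_size - 1) * ICC."""
    m_bar = mean(cluster_sizes)
    # Sample standard deviation is used here; this choice is an assumption.
    cv = stdev(cluster_sizes) / m_bar
    return 1 + ((cv ** 2 + 1) * m_bar - 1) * icc


sizes = [10, 20, 30, 20]   # hypothetical cluster sizes
icc = 0.05                 # hypothetical intracluster correlation
print(design_effect_equal(mean(sizes), icc))
print(design_effect_unequal(sizes, icc))
```

    Multiplying the sample size of an individually randomised trial by the design effect gives the inflated sample size; the abstract reports that this kind of inflation tended to over-power stepped-wedge trials.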

  3. MOCCA-SURVEY Database I: Is NGC 6535 a dark star cluster harbouring an IMBH?

    Science.gov (United States)

    Askar, Abbas; Bianchini, Paolo; de Vita, Ruggero; Giersz, Mirek; Hypki, Arkadiusz; Kamann, Sebastian

    2017-01-01

    We describe the dynamical evolution of a unique type of dark star cluster model in which the majority of the cluster mass at Hubble time is dominated by an intermediate-mass black hole (IMBH). We analysed results from about 2000 star cluster models (Survey Database I) simulated using the Monte Carlo code MOnte Carlo Cluster simulAtor and identified these dark star cluster models. Taking one of these models, we apply the method of simulating realistic 'mock observations' by utilizing the Cluster simulatiOn Comparison with ObservAtions (COCOA) and Simulating Stellar Cluster Observation (SISCO) codes to obtain the photometric and kinematic observational properties of the dark star cluster model at 12 Gyr. We find that the perplexing Galactic globular cluster NGC 6535 closely matches the observational photometric and kinematic properties of the dark star cluster model presented in this paper. Based on our analysis and currently observed properties of NGC 6535, we suggest that this globular cluster could potentially harbour an IMBH. If it exists, the presence of this IMBH can be detected robustly with proposed kinematic observations of NGC 6535.

  4. Biomarker clusters are differentially associated with longitudinal cognitive decline in late midlife

    Science.gov (United States)

    Racine, Annie M.; Koscik, Rebecca L.; Berman, Sara E.; Nicholas, Christopher R.; Clark, Lindsay R.; Okonkwo, Ozioma C.; Rowley, Howard A.; Asthana, Sanjay; Bendlin, Barbara B.; Blennow, Kaj; Zetterberg, Henrik; Gleason, Carey E.; Carlsson, Cynthia M.

    2016-01-01

    The ability to detect preclinical Alzheimer’s disease is of great importance, as this stage of the Alzheimer’s continuum is believed to provide a key window for intervention and prevention. As Alzheimer’s disease is characterized by multiple pathological changes, a biomarker panel reflecting co-occurring pathology will likely be most useful for early detection. Towards this end, 175 late middle-aged participants (mean age 55.9 ± 5.7 years at first cognitive assessment, 70% female) were recruited from two longitudinally followed cohorts to undergo magnetic resonance imaging and lumbar puncture. Cluster analysis was used to group individuals based on biomarkers of amyloid pathology (cerebrospinal fluid amyloid-β42/amyloid-β40 assay levels), magnetic resonance imaging-derived measures of neurodegeneration/atrophy (cerebrospinal fluid-to-brain volume ratio, and hippocampal volume), neurofibrillary tangles (cerebrospinal fluid phosphorylated tau181 assay levels), and a brain-based marker of vascular risk (total white matter hyperintensity lesion volume). Four biomarker clusters emerged consistent with preclinical features of (i) Alzheimer’s disease; (ii) mixed Alzheimer’s disease and vascular aetiology; (iii) suspected non-Alzheimer’s disease aetiology; and (iv) healthy ageing. Cognitive decline was then analysed between clusters using longitudinal assessments of episodic memory, semantic memory, executive function, and global cognitive function with linear mixed effects modelling. Cluster 1 exhibited a higher intercept and greater rates of decline on tests of episodic memory. Cluster 2 had a lower intercept on a test of semantic memory and both Cluster 2 and Cluster 3 had steeper rates of decline on a test of global cognition. Additional analyses on Cluster 3, which had the smallest hippocampal volume, suggest that its biomarker profile is more likely due to hippocampal vulnerability and not to detectable specific volume loss exceeding the rate of normal

  5. The X-ray spectra of clusters of galaxies and their relationship to other cluster properties

    International Nuclear Information System (INIS)

    Mitchell, R.J.; Dickens, R.J.; Burnell, S.J.B.; Culhane, J.L.

    1979-01-01

    New observations with the MSSL proportional counter spectrometer on the Ariel V satellite of the X-ray spectra of 20 candidate clusters of galaxies are reported. The data are compared with the results from the OSO-8 satellite and the combined sample of some 30 cluster X-ray spectra is analysed. The present study finds generally larger values of L_X than do Uhuru or the SSI, which, because of the larger field of view, may indicate significant amounts of hot gas away from the cluster centres. The validity of all X-ray cluster identifications has been examined, and sources have been classified according to certainty of identification. The incidence of X-ray line emission from the clusters has been investigated and temperatures, kT_X, have been derived on the basis of an isothermal model. Relationships between X-ray, optical and radio properties of the clusters have been studied. The more massive, centrally condensed clusters generally contain higher temperature gas and have a greater luminosity than the less massive, more irregular clusters. (author)

  6. Rearrangement of cluster structure during fission processes

    DEFF Research Database (Denmark)

    Lyalin, Andrey G.; Obolensky, Oleg I.; Solov'yov, Andrey V.

    2004-01-01

    Results of molecular dynamics simulations of fission reactions $\mathrm{Na}_{10}^{2+} \to \mathrm{Na}_{7}^{+} + \mathrm{Na}_{3}^{+}$ and $\mathrm{Na}_{18}^{2+} \to 2\,\mathrm{Na}_{9}^{+}$ are presented. The dependence of the fission barriers on the isomer structure of the parent cluster is analysed. It is demonstrated that the energy necessary for removing homothetic...... groups of atoms from the parent cluster is largely independent of the isomer form of the parent cluster. The importance of rearrangement of the cluster structure during the fission process is elucidated. This rearrangement may include transition to another isomer state of the parent cluster before actual...

  7. Radar Emission Sources Identification Based on Hierarchical Agglomerative Clustering for Large Data Sets

    Directory of Open Access Journals (Sweden)

    Janusz Dudczyk

    2016-01-01

    More advanced recognition methods, which may recognize particular copies of radars of the same type, are called identification. The identification process of radar devices is a more specialized task which requires methods based on the analysis of distinctive features. These features are extracted from the signals coming from the identified devices. Such a process is called Specific Emitter Identification (SEI). The identification of radar emission sources with the use of classic techniques based on the statistical analysis of basic measurable parameters of a signal, such as Radio Frequency, Amplitude, Pulse Width, or Pulse Repetition Interval, is not sufficient for SEI problems. This paper presents the method of hierarchical data clustering which is used in the process of radar identification. The Hierarchical Agglomerative Clustering Algorithm (HACA), based on the Generalized Agglomerative Scheme (GAS), implemented and used in the research method is parameterized; therefore, it is possible to compare the results. The results of clustering are presented in dendrograms in this paper. The obtained results of grouping and identification based on HACA are compared with other SEI methods in order to assess the degree of their usefulness and effectiveness for systems of ESM/ELINT class.
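
    As a rough illustration of a parameterized agglomerative scheme of the kind described above, the following sketch merges clusters bottom-up and records the merge history that a dendrogram would display. It is not the HACA/GAS implementation from the paper: the function names, the two linkage options, and the toy feature vectors are all assumptions made for illustration.

```python
from itertools import combinations
from math import dist


def agglomerative(points, linkage="single"):
    """Merge clusters bottom-up; return the merge history (dendrogram data).

    Each history entry is (members_a, members_b, merge_distance).
    `linkage` is the tunable parameter: "single" uses the closest pair of
    points between two clusters, "complete" uses the farthest pair.
    """
    reduce_fn = {"single": min, "complete": max}[linkage]
    clusters = [(i,) for i in range(len(points))]  # start from singletons
    history = []
    while len(clusters) > 1:
        best = None
        for a, b in combinations(clusters, 2):
            d = reduce_fn(dist(points[i], points[j]) for i in a for j in b)
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        clusters = [c for c in clusters if c not in (a, b)]
        clusters.append(tuple(sorted(a + b)))
        history.append((a, b, d))
    return history


def cut(history, n_points, n_clusters):
    """Replay the merges until only `n_clusters` groups remain."""
    clusters = {(i,) for i in range(n_points)}
    for a, b, _ in history:
        if len(clusters) <= n_clusters:
            break
        clusters -= {a, b}
        clusters.add(tuple(sorted(a + b)))
    return sorted(clusters)


# Hypothetical 2-D feature vectors for five emitters (illustrative only).
features = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (10.0, 0.0)]
merges = agglomerative(features, linkage="complete")
groups = cut(merges, len(features), 3)
```

    Changing `linkage` changes the merge distances and possibly the grouping, which is the sense in which a parameterized scheme allows results to be compared.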

  8. Adaptive Reliable Routing Based on Cluster Hierarchy for Wireless Multimedia Sensor Networks

    Directory of Open Access Journals (Sweden)

    Kai Lin

    2010-01-01

    As a multimedia information acquisition and processing method, the wireless multimedia sensor network (WMSN) has great application potential in military and civilian areas. Compared with a traditional wireless sensor network, the routing design of a WMSN should pay more attention to the quality of transmission. This paper proposes an adaptive reliable routing scheme based on a cluster hierarchy, named ARCH, which includes energy prediction and power allocation mechanisms. To obtain better performance, the cluster structure is formed based on a cellular topology. The introduced prediction mechanism lets the sensor nodes predict the remaining energy of other nodes, which dramatically reduces the overall information needed for energy balancing. ARCH can dynamically balance the energy consumption of nodes based on the predicted results provided by power allocation. The simulation results demonstrate the efficiency of the proposed ARCH routing.

  9. CLASH-VLT: constraints on f (R) gravity models with galaxy clusters using lensing and kinematic analyses

    Energy Technology Data Exchange (ETDEWEB)

    Pizzuti, L.; Sartoris, B.; Borgani, S.; Girardi, M., E-mail: pizzuti@oats.inaf.it, E-mail: sartoris@oats.inaf.it, E-mail: borgani@oats.inaf.it, E-mail: girardi@oats.inaf.it [Dipartimento di Fisica, Sezione di Astronomia, Università di Trieste, Via Tiepolo 11, I-34143 Trieste (Italy); and others

    2017-07-01

    We perform a maximum likelihood kinematic analysis of the two dynamically relaxed galaxy clusters MACS J1206.2-0847 at z = 0.44 and RXC J2248.7-4431 at z = 0.35 to determine the total mass profile in modified gravity models, using a modified version of the MAMPOSSt code of Mamon, Biviano and Boué. Our work is based on the kinematic and lensing mass profiles derived using the data from the Cluster Lensing And Supernova survey with Hubble (hereafter CLASH) and the spectroscopic follow-up with the Very Large Telescope (hereafter CLASH-VLT). We assume a spherical Navarro-Frenk-White (NFW hereafter) profile in order to obtain a constraint on the fifth force interaction range λ for models in which the dependence of this parameter on the environment is negligible at the scale considered (i.e. λ = const) and fixing the fifth force strength to the value predicted in f (R) gravity. We then use information from lensing analysis to put a prior on the other NFW free parameters. In the case of MACSJ 1206 the joint kinematic+lensing analysis leads to an upper limit on the effective interaction range λ ≤ 1.61 Mpc at Δχ² = 2.71 on the marginalized distribution. For RXJ 2248 instead a possible tension with the ΛCDM model appears when adding lensing information, with a lower limit λ ≥ 0.14 Mpc at Δχ² = 2.71. This is a consequence of the slight difference between the lensing and kinematic data, appearing in GR for this cluster, that could in principle be explained in terms of modifications of gravity. We discuss the impact of systematics and the limits of our analysis as well as future improvements of the results obtained. This work has interesting implications in view of upcoming and future large imaging and spectroscopic surveys, that will deliver lensing and kinematic mass reconstruction for a large number of galaxy clusters.

  10. Development of New Open-Shell Perturbation and Coupled-Cluster Theories Based on Symmetric Spin Orbitals

    Science.gov (United States)

    Lee, Timothy J.; Arnold, James O. (Technical Monitor)

    1994-01-01

    A new spin orbital basis is employed in the development of efficient open-shell coupled-cluster and perturbation theories that are based on a restricted Hartree-Fock (RHF) reference function. The spin orbital basis differs from the standard one in the spin functions that are associated with the singly occupied spatial orbital. The occupied orbital (in the spin orbital basis) is assigned the $\delta^{+} = \frac{1}{\sqrt{2}}(\alpha + \beta)$ spin function while the unoccupied orbital is assigned the $\delta^{-} = \frac{1}{\sqrt{2}}(\alpha - \beta)$ spin function. The doubly occupied and unoccupied orbitals (in the reference function) are assigned the standard $\alpha$ and $\beta$ spin functions. The coupled-cluster and perturbation theory wave functions based on this set of "symmetric spin orbitals" exhibit much more symmetry than those based on the standard spin orbital basis. This, together with interacting space arguments, leads to a dramatic reduction in the computational cost for both coupled-cluster and perturbation theory. Additionally, perturbation theory based on "symmetric spin orbitals" obeys Brillouin's theorem provided that spin and spatial excitations are both considered. Other properties of the coupled-cluster and perturbation theory wave functions and models will be discussed.
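
    As a quick check on the two symmetric spin functions defined in this abstract, and assuming the usual real, orthonormal one-electron spin functions (⟨α|α⟩ = ⟨β|β⟩ = 1 and ⟨α|β⟩ = 0, an assumption the abstract does not state explicitly), the symmetric spin functions are themselves orthonormal:

```latex
\begin{align}
\delta^{\pm} &= \tfrac{1}{\sqrt{2}}\,(\alpha \pm \beta), \\
\langle \delta^{\pm} | \delta^{\pm} \rangle
  &= \tfrac{1}{2}\bigl(\langle\alpha|\alpha\rangle
     \pm 2\,\langle\alpha|\beta\rangle
     + \langle\beta|\beta\rangle\bigr) = 1, \\
\langle \delta^{+} | \delta^{-} \rangle
  &= \tfrac{1}{2}\bigl(\langle\alpha|\alpha\rangle
     - \langle\alpha|\beta\rangle
     + \langle\beta|\alpha\rangle
     - \langle\beta|\beta\rangle\bigr) = 0.
\end{align}
```

    The pair can therefore replace α and β for the singly occupied spatial orbital without changing the space spanned by the spin orbitals.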

  11. Computational study of AuSi_n (n=1-9) nanoalloy clusters invoking DFT based descriptors

    Energy Technology Data Exchange (ETDEWEB)

    Ranjan, Prabhat; Kumar, Ajay [Department of Mechatronics, Manipal University Jaipur Dehmi Kalan, Jaipur-303007 (India); Chakraborty, Tanmoy, E-mail: tanmoy.chakraborty@jaipur.manipal.edu, E-mail: tanmoychem@gmail.com [Department of Chemistry, Manipal University Jaipur Dehmi Kalan, Jaipur-303007 (India)

    2016-04-13

    Nanoalloy clusters formed between Au and Si are topics of great interest today from both scientific and technological points of view. Due to their remarkable catalytic, electronic, mechanical and magnetic properties, Au-Si nanoalloy clusters have extensive applications in microelectronics, catalysis, biomedicine, and the jewelry industry. Density Functional Theory (DFT) is a new paradigm of quantum mechanics that is very popular for studying the electronic properties of materials. Conceptual DFT based descriptors have been invoked to correlate the experimental properties of nanoalloy clusters. In this venture, we have systematically investigated AuSi_n (n=1-9) nanoalloy clusters in the theoretical frame of the B3LYP exchange correlation. The experimental properties of AuSi_n (n=1-9) nanoalloy clusters are correlated in terms of DFT based descriptors, viz. HOMO-LUMO gap, Electronegativity (χ), Global Hardness (η), Global Softness (S) and Electrophilicity Index (ω). The calculated HOMO-LUMO gap exhibits interesting odd-even alternation behaviour, indicating that even numbered clusters possess higher stability as compared to their neighbouring odd numbered clusters. This study also reflects very good agreement between experimental bond lengths and computed data.
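
    The descriptors named in this abstract are commonly evaluated from frontier orbital energies. The sketch below assumes the Koopmans-type approximations I ≈ −E_HOMO and A ≈ −E_LUMO with χ = (I + A)/2, η = (I − A)/2, S = 1/(2η) and ω = χ²/(2η); the abstract does not state which working formulas the authors used, and other conventions exist (for example η = I − A). The function and field names, and the input energies in the example, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    gap: float                # HOMO-LUMO gap
    electronegativity: float  # chi
    hardness: float           # eta
    softness: float           # S
    electrophilicity: float   # omega


def descriptors(e_homo, e_lumo):
    """Conceptual-DFT descriptors from orbital energies (same unit, e.g. eV).

    Assumes Koopmans-type estimates: I = -E_HOMO, A = -E_LUMO.
    """
    ionization = -e_homo
    affinity = -e_lumo
    chi = (ionization + affinity) / 2
    eta = (ionization - affinity) / 2
    if eta <= 0:
        raise ValueError("E_LUMO must lie above E_HOMO")
    return Descriptors(
        gap=e_lumo - e_homo,
        electronegativity=chi,
        hardness=eta,
        softness=1 / (2 * eta),
        electrophilicity=chi ** 2 / (2 * eta),
    )


# Made-up orbital energies in eV, not values from the paper.
example = descriptors(e_homo=-6.0, e_lumo=-2.0)
```

    Under these definitions the hardness is half the HOMO-LUMO gap, so an odd-even alternation in the gap carries over directly to η and inversely to S.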

  12. An Enhanced PSO-Based Clustering Energy Optimization Algorithm for Wireless Sensor Network

    Directory of Open Access Journals (Sweden)

    C. Vimalarani

    2016-01-01

    A Wireless Sensor Network (WSN) is a network formed with a maximum number of sensor nodes positioned in an application environment to monitor physical entities in a target area, for example, temperature, water level, and pressure monitoring, health care, and various military applications. Most sensor nodes are equipped with self-supported battery power, through which they perform their operations and communicate with neighboring nodes. To maximize the lifetime of Wireless Sensor Networks, energy conservation measures are essential for improving the performance of WSNs. This paper proposes an Enhanced PSO-Based Clustering Energy Optimization (EPSO-CEO) algorithm for Wireless Sensor Networks in which clustering and cluster head selection are done using the Particle Swarm Optimization (PSO) algorithm with the aim of minimizing power consumption in the WSN. The performance metrics are evaluated and the results are compared with a competitive clustering algorithm to validate the reduction in energy consumption.

  13. Comparison Of Keyword Based Clustering Of Web Documents By Using Openstack 4j And By Traditional Method

    Directory of Open Access Journals (Sweden)

    Shiza Anand

    2015-08-01

    The number of hypertext documents on the World Wide Web is increasing continuously day by day. Clustering methods are therefore required to group documents into cluster repositories according to the similarity between the documents. Various clustering methods exist, such as Hierarchical Based, K-means, Fuzzy Logic Based, Centroid Based, etc. These keyword-based clustering methods take a large amount of time to create containers and put documents in their respective containers. These traditional methods use the file-handling techniques of different programming languages to create repositories and transfer web documents into these containers. In contrast, the openstack4j SDK is a new technique for creating containers and moving web documents into them according to their similarity in much less time than the traditional methods. Another benefit of this technique is that the SDK understands and reads all types of files, such as jpg, html, pdf, doc, etc. This paper compares the time required for clustering documents using openstack4j and using traditional methods, and suggests that search engines adopt this technique for clustering so that they return results to user queries in less time.

  14. A Cluster Based Group Signature Mechanism For Secure Vanet Communication

    Directory of Open Access Journals (Sweden)

    Navjot Kaur

    2015-08-01

    Vehicular ad hoc networks (VANETs) are a recent area of research aimed at providing safety to human lives, controlling messages, and delivering messages to users and passengers. VANETs allow communication between moving vehicular nodes. The movement of nodes changes the network size and scenario. Whenever a new node joins the network, there is a threat of a malicious node attack, so a secure and trustworthy environment is needed. Therefore, a new cluster-based secure technique is proposed in which the cluster head is responsible for providing communication between the vehicular nodes. The performance parameters used in this paper are message drop ratio, packet delay ratio, and verification time.

  15. Efficacy of community-based physiotherapy networks for patients with Parkinson's disease: a cluster-randomised trial.

    Science.gov (United States)

    Munneke, Marten; Nijkrake, Maarten J; Keus, Samyra Hj; Kwakkel, Gert; Berendse, Henk W; Roos, Raymund Ac; Borm, George F; Adang, Eddy M; Overeem, Sebastiaan; Bloem, Bastiaan R

    2010-01-01

    Many patients with Parkinson's disease are treated with physiotherapy. We have developed a community-based professional network (ParkinsonNet) that involves training of a selected number of expert physiotherapists to work according to evidence-based recommendations, and structured referrals to these trained physiotherapists to increase the numbers of patients they treat. We aimed to assess the efficacy of this approach for improving health-care outcomes. Between February, 2005, and August, 2007, we did a cluster-randomised trial with 16 clusters (defined as community hospitals and their catchment area). Clusters were randomly allocated by use of a variance minimisation algorithm to ParkinsonNet care (n=8) or usual care (n=8). Patients were assessed at baseline and at 8, 16, and 24 weeks of follow-up. The primary outcome was a patient preference disability score, the patient-specific index score, at 16 weeks. Health secondary outcomes were functional mobility, mobility-related quality of life, and total societal costs over 24 weeks. Analysis was by intention to treat. This trial is registered, number NCT00330694. We included 699 patients. Baseline characteristics of the patients were comparable between the ParkinsonNet clusters (n=358) and usual-care clusters (n=341). The primary endpoint was similar for patients within the ParkinsonNet clusters (mean 47.7, SD 21.9) and control clusters (48.3, 22.4). Health secondary endpoints were also similar for patients in both study groups. Total costs over 24 weeks were lower in ParkinsonNet clusters compared with usual-care clusters (difference €727; 95% CI 56-1399). Implementation of ParkinsonNet networks did not change health outcomes for patients living in ParkinsonNet clusters. However, health-care costs were reduced in ParkinsonNet clusters compared with usual-care clusters. Funding: ZonMw; Netherlands Organisation for Scientific Research; Dutch Parkinson's Disease Society; National Parkinson Foundation; Stichting Robuust

  16. A hot X-ray filament associated with A3017 galaxy cluster

    Science.gov (United States)

    Parekh, V.; Durret, F.; Padmanabh, P.; Pandge, M. B.

    2017-09-01

    Recent simulations and observations have shown large-scale filaments in the cosmic web connecting nodes, with accreting materials (baryonic and dark matter) flowing through them. Current high-sensitivity observations also show that the propagation of shocks through filaments can heat them up and make filaments visible between two or more galaxy clusters or around massive clusters, based on optical and/or X-ray observations. We are reporting here the special case of the cluster A3017 associated with a hot filament. The temperature of the filament is $3.4^{+1.30}_{-0.77}$ keV and its length is ∼1 Mpc. We have analysed its archival Chandra data and report various properties. We also analysed GMRT 235/610 MHz radio data. Radio observations have revealed symmetric two-sided lobes that fill cavities in the A3017 cluster core region, associated with central active galactic nucleus. In the radio map, we also noticed a peculiar linear vertical radio structure in the X-ray filament region which might be associated with a cosmic filament shock. This radio structure could be a radio phoenix or old plasma where an old relativistic population is re-accelerated by shock propagation. Finally, we put an upper limit on the radio luminosity of the filament region.

  17. Cluster evolution

    International Nuclear Information System (INIS)

    Schaeffer, R.

    1987-01-01

    The galaxy and cluster luminosity functions are constructed from a model of the mass distribution based on hierarchical clustering at an epoch where the matter distribution is non-linear. These luminosity functions are seen to reproduce the present distribution of objects as can be inferred from the observations. They can be used to deduce the redshift dependence of the cluster distribution and to extrapolate the observations towards the past. The predicted evolution of the cluster distribution is quite strong, although somewhat less rapid than predicted by the linear theory.

  18. Interactive K-Means Clustering Method Based on User Behavior for Different Analysis Target in Medicine.

    Science.gov (United States)

    Lei, Yang; Yu, Dai; Bin, Zhang; Yang, Yang

    2017-01-01

    Clustering algorithms are widely used in analysis systems as a basis for data analysis. However, when the data are high-dimensional, a clustering algorithm may overlook the business relations between the dimensions, especially in medical fields. As a result, the clustering result often fails to meet the users' business goals. If the clustering process can incorporate the users' knowledge, that is, the doctor's knowledge or the analysis intent, the result can be more satisfactory. In this paper, we propose an interactive K-means clustering method to improve users' satisfaction with the result. The core of this method is to obtain the user's feedback on the clustering result and use it to optimize that result. A particle swarm optimization algorithm is then used to optimize the parameters, especially the weight settings in the clustering algorithm, so that they reflect the user's business preferences as closely as possible. Based on this parameter optimization and adjustment, the clustering result can come closer to the user's requirements. Finally, we use a breast cancer example to test our method. The experiments show improved performance of our algorithm.
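
    To make the role of the weight settings concrete, here is a minimal sketch of K-means with per-dimension weights in the distance. It only shows how a change of weights changes the grouping; the interactive feedback loop and the particle swarm search over weights described in the abstract are not implemented, and the function names, the deterministic initial centroids, and the toy data are assumptions.

```python
def weighted_sq_dist(p, c, weights):
    """Squared Euclidean distance with one non-negative weight per dimension."""
    return sum(w * (a - b) ** 2 for w, a, b in zip(weights, p, c))


def weighted_kmeans(points, init_centroids, weights, max_iter=100):
    """Lloyd-style K-means whose assignment step uses the weighted distance."""
    centroids = [list(c) for c in init_centroids]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centroids)),
                key=lambda k: weighted_sq_dist(p, centroids[k], weights))
            for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the old centroid if a cluster empties
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Four toy records with two dimensions; the weights decide which one matters.
records = [(0, 0), (0, 10), (10, 0), (10, 10)]
start = [(0, 0), (10, 10)]
by_first, _ = weighted_kmeans(records, start, weights=(1, 0))
by_second, _ = weighted_kmeans(records, start, weights=(0, 1))
```

    With weight on the first dimension only the records split as `[0, 0, 1, 1]`; with weight on the second only they split as `[0, 1, 0, 1]`. An optimizer such as PSO would search over such weight vectors against a score derived from user feedback.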

  19. Clustering of dietary intake and sedentary behavior in 2-year-old children.

    Science.gov (United States)

    Gubbels, Jessica S; Kremers, Stef P J; Stafleu, Annette; Dagnelie, Pieter C; de Vries, Sanne I; de Vries, Nanne K; Thijs, Carel

    2009-08-01

    To examine clustering of energy balance-related behaviors (EBRBs) in young children. This is crucial because lifestyle habits are formed at an early age and track in later life. This study is the first to examine EBRB clustering in children as young as 2 years. Cross-sectional data originated from the Child, Parent and Health: Lifestyle and Genetic Constitution (KOALA) Birth Cohort Study. Parents of 2578 2-year-old children completed a questionnaire. Correlation analyses, principal component analyses, and linear regression analyses were performed to examine clustering of EBRBs. We found modest but consistent correlations in EBRBs. Two clusters emerged: a "sedentary-snacking cluster" and a "fiber cluster." Television viewing clustered with computer use and unhealthy dietary behaviors. Children who frequently consumed vegetables also consumed fruit and brown bread more often and white bread less often. Lower maternal education and maternal obesity were associated with high scores on the sedentary-snacking cluster, whereas higher educational level was associated with high fiber cluster scores. Obesity-prone behavioral clusters are already visible in 2-year-old children and are related to maternal characteristics. The findings suggest that obesity prevention should apply an integrated approach to physical activity and dietary intake in early childhood.

  20. Toward demonstrating controlled-X operation based on continuous-variable four-partite cluster states and quantum teleporters

    International Nuclear Information System (INIS)

    Wang Yu; Su Xiaolong; Shen Heng; Tan Aihong; Xie Changde; Peng Kunchi

    2010-01-01

    One-way quantum computation based on measurement and multipartite cluster entanglement offers the ability to perform a variety of unitary operations only through different choices of measurement bases. Here we present an experimental study toward demonstrating the controlled-X operation, a two-mode gate in which continuous variable (CV) four-partite cluster states of optical modes are utilized. Two quantum teleportation elements are used for achieving the gate operation of the quantum state transformation from input target and control states to output states. By means of the optical cluster state prepared off-line, the homodyne detection and electronic feeding forward, the information carried by the input control state is transformed to the output target state. The presented scheme of the controlled-X operation based on teleportation can be implemented nonlocally and deterministically. The distortion of the quantum information resulting from the imperfect cluster entanglement is estimated with the fidelity.

  1. Constructing storyboards based on hierarchical clustering analysis

    Science.gov (United States)

    Hasebe, Satoshi; Sami, Mustafa M.; Muramatsu, Shogo; Kikuchi, Hisakazu

    2005-07-01

    There is a growing need for quick previews of video content, both to improve the accessibility of video archives and to reduce network traffic. In this paper, a storyboard that contains a user-specified number of keyframes is produced from a given video sequence. It is based on hierarchical cluster analysis of feature vectors that are derived from wavelet coefficients of video frames. Consistent use of extracted feature vectors is the key to avoiding repeated, computationally intensive parsing of the same video sequence. Experimental results suggest that a significant reduction in computational time is gained by this strategy.

  2. Beverages-Food Industry Cluster Development Based on Value Chain in Indonesia

    Directory of Open Access Journals (Sweden)

    Lasmono Tri Sunaryanto

    2014-06-01

    This study aims to develop a value-chain-based food and beverage industry cluster that corresponds to the potential of the regions in the Java Economic Corridor. The research targets are: a description of the SME development strategies that have been implemented; the formulation of a development strategy that can be applied to food and beverage SME clusters; and a proven strategy for implementing food and beverage SME cluster development. To achieve these objectives, descriptive methods were applied, with data collected through surveys, desk analysis, and FGD. The data are analyzed with descriptive statistics. The results of the study of PT KML and 46 food and beverage SMEs in Malang show that the food and beverage SME cluster is not formal and still takes the form of a center. Regarding information technology, most SMEs do not own a PC; only 11% do, and of these only 23% have a PC with an internet connection. PCs are mostly used just for administration, with the Word and Excel programs, and only 4% of SMEs (1 unit) use the internet as a marketing medium.

  3. Clustering of samples and elements based on multi-variable chemical data

    International Nuclear Information System (INIS)

    Op de Beeck, J.

    1984-01-01

    Clustering and classification are defined in the context of multivariable chemical analysis data. Classical multivariate techniques, commonly used to interpret such data, are shown to be based on probabilistic and geometrical principles which are not justified for analytical data, since in that case one assumes or expects a system of more or less systematically related objects (samples) as defined by measurements on more or less systematically interdependent variables (elements). For the specific analytical problem of a data set concerning a large number of trace elements determined in a large number of samples, a deterministic cluster analysis can be used to develop the underlying classification structure. Three main steps can be distinguished: diagnostic evaluation and preprocessing of the raw input data; computation of a symmetric matrix with pairwise standardized dissimilarity values between all possible pairs of samples and/or elements; and an ultrametric clustering strategy to produce the final classification as a dendrogram. The software packages designed to perform these tasks are discussed and final results are given. Conclusions are formulated concerning the dangers of using multivariate, clustering and classification software packages as a black box.
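
    The second of the three steps listed above, the symmetric matrix of pairwise standardized dissimilarities, can be sketched as follows. The abstract does not say how values are standardized or which dissimilarity is used, so the z-score scaling per element and the Euclidean distance here are assumptions, as are the function names and the toy concentrations.

```python
from math import sqrt
from statistics import mean, pstdev


def standardize(rows):
    """Z-score each column (element) so that units and scales do not dominate."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # constant column: avoid dividing by 0
    return [[(x - m) / s for x, m, s in zip(r, mus, sds)] for r in rows]


def dissimilarity_matrix(rows):
    """Symmetric matrix of Euclidean distances between standardized samples."""
    z = standardize(rows)
    n = len(z)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d[i][j] = d[j][i] = sqrt(sum((a - b) ** 2 for a, b in zip(z[i], z[j])))
    return d


# Three samples, two trace elements on very different scales (made-up values).
samples = [[1.0, 100.0], [2.0, 200.0], [3.0, 300.0]]
matrix = dissimilarity_matrix(samples)
```

    Transposing `rows` before the call gives the corresponding matrix between elements instead of samples; either matrix can then be passed to a hierarchical clustering routine to obtain a dendrogram.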

  4. Clustering of near clusters versus cluster compactness

    International Nuclear Information System (INIS)

    Yu Gao; Yipeng Jing

    1989-01-01

    The clustering properties of near Zwicky clusters are studied by using the two-point angular correlation function. The angular correlation functions for compact and medium compact clusters, for open clusters, and for all near Zwicky clusters are estimated. The results show much stronger clustering for compact and medium compact clusters than for open clusters, and that open clusters have nearly the same clustering strength as galaxies. A detailed study of the compactness-dependence of correlation function strength is worth investigating. (author)

  5. Single cyanide-bridged Mo(W)/S/Cu cluster-based coordination polymers: Reactant- and stoichiometry-dependent syntheses, effective photocatalytic properties

    Energy Technology Data Exchange (ETDEWEB)

    Zhang, Jinfang, E-mail: zjf260@jiangnan.edu.cn [China-Australia Joint Research Center for Functional Molecular Materials, School of Chemical and Material Engineering, Jiangnan University, Wuxi 214122 (China); Wang, Chao [China-Australia Joint Research Center for Functional Molecular Materials, School of Chemical and Material Engineering, Jiangnan University, Wuxi 214122 (China); Wang, Yinlin; Chen, Weitao [China-Australia Joint Research Center for Functional Molecular Materials, Scientific Research Academy, Jiangsu University, Zhenjiang 212013 (China); Cifuentes, Marie P.; Humphrey, Mark G. [Research School of Chemistry, Australian National University, Canberra ACT 0200 (Australia); Zhang, Chi, E-mail: chizhang@jiangnan.edu.cn [China-Australia Joint Research Center for Functional Molecular Materials, School of Chemical and Material Engineering, Jiangnan University, Wuxi 214122 (China)

    2015-11-15

    The systematic study on the reaction variables affecting single cyanide-bridged Mo(W)/S/Cu cluster-based coordination polymers (CPs) is firstly demonstrated. Five anionic single cyanide-bridged Mo(W)/S/Cu cluster-based CPs {[Pr_4N][WS_4Cu_3(CN)_2]}_n (1), {[Pr_4N][WS_4Cu_4(CN)_3]}_n (2), {[Pr_4N][WOS_3Cu_3(CN)_2]}_n (3), {[Bu_4N][WOS_3Cu_3(CN)_2]}_n (4) and {[Bu_4N][MoOS_3Cu_3(CN)_2]}_n (5) were prepared by varying the molar ratios of the starting materials, and the specific cations, cluster building blocks and central metal atoms in the cluster building blocks. 1 possesses an anionic 3D diamondoid framework constructed from 4-connected T-shaped clusters [WS_4Cu_3]^+ and single CN^− bridges. 2 is fabricated from 6-connected planar ‘open’ clusters [WS_4Cu_4]^{2+} and single CN^− bridges, forming an anionic 3D architecture with an “ACS” topology. 3 and 4 exhibit novel anionic 2-D double-layer networks, both constructed from nest-shaped clusters [WOS_3Cu_3]^+ linked by single CN^− bridges, but containing the different cations [Pr_4N]^+ and [Bu_4N]^+, respectively. 5 is constructed from nest-shaped clusters [MoOS_3Cu_3]^+ and single CN^− bridges, with an anionic 3D diamondoid framework. The anionic frameworks of 1-5, all sustained by single CN^− bridges, are non-interpenetrating and exhibit huge potential void volumes. Employing differing molar ratios of the reactants and varying the cluster building blocks resulted in differing single cyanide-bridged Mo(W)/S/Cu cluster-based CPs, while replacing the cation ([Pr_4N]^+ vs. [Bu_4N]^+) was found to have negligible impact on the nature of the architecture. Unexpectedly, replacement of the central metal atom (W vs. Mo) in the cluster building blocks had a pronounced effect on the framework. Furthermore, the photocatalytic activities of heterothiometallic

  6. Single cyanide-bridged Mo(W)/S/Cu cluster-based coordination polymers: Reactant- and stoichiometry-dependent syntheses, effective photocatalytic properties

    International Nuclear Information System (INIS)

    Zhang, Jinfang; Wang, Chao; Wang, Yinlin; Chen, Weitao; Cifuentes, Marie P.; Humphrey, Mark G.; Zhang, Chi

    2015-01-01

    The systematic study on the reaction variables affecting single cyanide-bridged Mo(W)/S/Cu cluster-based coordination polymers (CPs) is firstly demonstrated. Five anionic single cyanide-bridged Mo(W)/S/Cu cluster-based CPs {[Pr_4N][WS_4Cu_3(CN)_2]}_n (1), {[Pr_4N][WS_4Cu_4(CN)_3]}_n (2), {[Pr_4N][WOS_3Cu_3(CN)_2]}_n (3), {[Bu_4N][WOS_3Cu_3(CN)_2]}_n (4) and {[Bu_4N][MoOS_3Cu_3(CN)_2]}_n (5) were prepared by varying the molar ratios of the starting materials, and the specific cations, cluster building blocks and central metal atoms in the cluster building blocks. 1 possesses an anionic 3D diamondoid framework constructed from 4-connected T-shaped clusters [WS_4Cu_3]^+ and single CN^− bridges. 2 is fabricated from 6-connected planar ‘open’ clusters [WS_4Cu_4]^{2+} and single CN^− bridges, forming an anionic 3D architecture with an “ACS” topology. 3 and 4 exhibit novel anionic 2-D double-layer networks, both constructed from nest-shaped clusters [WOS_3Cu_3]^+ linked by single CN^− bridges, but containing the different cations [Pr_4N]^+ and [Bu_4N]^+, respectively. 5 is constructed from nest-shaped clusters [MoOS_3Cu_3]^+ and single CN^− bridges, with an anionic 3D diamondoid framework. The anionic frameworks of 1-5, all sustained by single CN^− bridges, are non-interpenetrating and exhibit huge potential void volumes. Employing differing molar ratios of the reactants and varying the cluster building blocks resulted in differing single cyanide-bridged Mo(W)/S/Cu cluster-based CPs, while replacing the cation ([Pr_4N]^+ vs. [Bu_4N]^+) was found to have negligible impact on the nature of the architecture. Unexpectedly, replacement of the central metal atom (W vs. Mo) in the cluster building blocks had a pronounced effect on the framework. Furthermore, the photocatalytic activities of heterothiometallic cluster-based CPs were firstly explored by monitoring the photodegradation of methylene blue (MB) under visible light irradiation, which reveals that 2

  7. Electricity Consumption Clustering Using Smart Meter Data

    Directory of Open Access Journals (Sweden)

    Alexander Tureczek

    2018-04-01

    Electricity smart meter consumption data is enabling utilities to analyze consumption information at unprecedented granularity. Much focus has been directed towards consumption clustering for diversifying tariffs; through modern clustering methods, cluster analyses have been performed. However, the clusters developed exhibit a large variation with resulting shadow clusters, making it impossible to truly identify the individual clusters. Using clearly defined dwelling types, this paper will present methods to improve clustering by harvesting inherent structure from the smart meter data. This paper clusters domestic electricity consumption using smart meter data from the Danish city of Esbjerg. Methods from time series analysis and wavelets are applied to enable the K-Means clustering method to account for autocorrelation in data and thereby improve the clustering performance. The results show the importance of data knowledge, and we identify sub-clusters of consumption within the dwelling types and enable K-Means to produce satisfactory clustering by accounting for a temporal component. Furthermore, our study shows that careful preprocessing of the data to account for intrinsic structure enables better clustering performance by the K-Means method.
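
    One simple way to let K-Means see temporal structure is to cluster on autocorrelation features rather than on raw readings. The sketch below computes the sample autocorrelation of a meter series at chosen lags. It illustrates the general idea only: the abstract does not specify the paper's actual features or wavelet transform, and the function names, the lags, and the synthetic series are assumptions.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at a non-negative integer `lag`."""
    n = len(series)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(series)")
    mu = sum(series) / n
    var = sum((x - mu) ** 2 for x in series)
    if var == 0:
        return 0.0  # constant series: no temporal structure to measure
    cov = sum((series[t] - mu) * (series[t + lag] - mu) for t in range(n - lag))
    return cov / var


def acf_features(series, lags):
    """Feature vector of autocorrelations, usable as input to K-Means."""
    return [autocorrelation(series, lag) for lag in lags]


# Synthetic meter readings alternating between high and low consumption.
readings = [1.0, -1.0] * 12
features = acf_features(readings, lags=(1, 2))
```

    For hourly data, lags such as 24 and 168 would pick up daily and weekly rhythms. Each household then contributes one short feature vector, so households with similar temporal patterns end up close together even when their raw levels differ.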

  8. Cluster-cluster clustering

    International Nuclear Information System (INIS)

    Barnes, J.; Dekel, A.; Efstathiou, G.; Frenk, C.S. (Yale Univ., New Haven, CT; California Univ., Santa Barbara; Cambridge Univ., England; Sussex Univ., Brighton, England)

    1985-01-01

    The cluster correlation function xi sub c(r) is compared with the particle correlation function, xi(r) in cosmological N-body simulations with a wide range of initial conditions. The experiments include scale-free initial conditions, pancake models with a coherence length in the initial density field, and hybrid models. Three N-body techniques and two cluster-finding algorithms are used. In scale-free models with white noise initial conditions, xi sub c and xi are essentially identical. In scale-free models with more power on large scales, it is found that the amplitude of xi sub c increases with cluster richness; in this case the clusters give a biased estimate of the particle correlations. In the pancake and hybrid models (with n = 0 or 1), xi sub c is steeper than xi, but the cluster correlation length exceeds that of the points by less than a factor of 2, independent of cluster richness. Thus the high amplitude of xi sub c found in studies of rich clusters of galaxies is inconsistent with white noise and pancake models and may indicate a primordial fluctuation spectrum with substantial power on large scales. 30 references

  9. Customer Clustering Based on Customer Purchasing Sequence Data

    OpenAIRE

    Yen-Chung Liu; Yen-Liang Chen

    2017-01-01

    Customer clustering has become a priority for enterprises because of the importance of customer relationship management. Customer clustering can improve understanding of the composition and characteristics of customers, thereby enabling the creation of appropriate marketing strategies for each customer group. Previously, different customer clustering approaches have been proposed according to data type, namely customer profile data, customer value data, customer transaction data, and customer...

  10. Crowd Analysis by Using Optical Flow and Density Based Clustering

    DEFF Research Database (Denmark)

    Santoro, Francesco; Pedro, Sergio; Tan, Zheng-Hua

    2010-01-01

    In this paper, we present a system to detect and track crowds in a video sequence captured by a camera. In a first step, we compute optical flows by means of pyramidal Lucas-Kanade feature tracking. Afterwards, a density based clustering is used to group similar vectors. In the last step...

  11. A Cluster-Based Framework for the Security of Medical Sensor Environments

    Science.gov (United States)

    Klaoudatou, Eleni; Konstantinou, Elisavet; Kambourakis, Georgios; Gritzalis, Stefanos

    The adoption of Wireless Sensor Networks (WSNs) in the healthcare sector poses many security issues, mainly because medical information is considered particularly sensitive. The security mechanisms employed are expected to be more efficient in terms of energy consumption and scalability in order to cope with the constrained capabilities of WSNs and patients’ mobility. Towards this goal, cluster-based medical WSNs can substantially improve efficiency and scalability. In this context, we have proposed a general framework for cluster-based medical environments on top of which security mechanisms can rely. This framework fully covers the varying needs of both in-hospital environments and environments formed ad hoc for medical emergencies. In this paper, we further elaborate on the security of our proposed solution. We specifically focus on key establishment mechanisms and investigate the group key agreement protocols that can best fit in our framework.

  12. Defining reference sequences for Nocardia species by similarity and clustering analyses of 16S rRNA gene sequence data.

    Directory of Open Access Journals (Sweden)

    Manal Helal

    Full Text Available BACKGROUND: The intra- and inter-species genetic diversity of bacteria and the absence of 'reference', or the most representative, sequences of individual species present a significant challenge for sequence-based identification. The aims of this study were to determine the utility, and compare the performance of several clustering and classification algorithms to identify the species of 364 sequences of 16S rRNA gene with a defined species in GenBank, and 110 sequences of 16S rRNA gene with no defined species, all within the genus Nocardia. METHODS: A total of 364 16S rRNA gene sequences of Nocardia species were studied. In addition, 110 16S rRNA gene sequences assigned only to the Nocardia genus level at the time of submission to GenBank were used for machine learning classification experiments. Different clustering algorithms were compared with a novel algorithm or the linear mapping (LM) of the distance matrix. Principal Components Analysis was used for the dimensionality reduction and visualization. RESULTS: The LM algorithm achieved the highest performance and classified the set of 364 16S rRNA sequences into 80 clusters, the majority of which (83.52%) corresponded with the original species. The most representative 16S rRNA sequences for individual Nocardia species have been identified as 'centroids' in respective clusters from which the distances to all other sequences were minimized; 110 16S rRNA gene sequences with identifications recorded only at the genus level were classified using machine learning methods. Simple kNN machine learning demonstrated the highest performance and classified Nocardia species sequences with an accuracy of 92.7% and a mean frequency of 0.578. CONCLUSION: The identification of centroids of 16S rRNA gene sequence clusters using novel distance matrix clustering enables the identification of the most representative sequences for each individual species of Nocardia and allows the quantitation of inter- and intra

  13. A fast readout algorithm for Cluster Counting/Timing drift chambers on a FPGA board

    Energy Technology Data Exchange (ETDEWEB)

    Cappelli, L. [Università di Cassino e del Lazio Meridionale (Italy); Creti, P.; Grancagnolo, F. [Istituto Nazionale di Fisica Nucleare, Lecce (Italy); Pepino, A., E-mail: Aurora.Pepino@le.infn.it [Istituto Nazionale di Fisica Nucleare, Lecce (Italy); Tassielli, G. [Istituto Nazionale di Fisica Nucleare, Lecce (Italy); Fermilab, Batavia, IL (United States); Università Marconi, Roma (Italy)

    2013-08-01

    A fast readout algorithm for Cluster Counting and Timing purposes has been implemented and tested on a Virtex 6 core FPGA board. The algorithm analyses and stores data coming from a Helium based drift tube instrumented by 1 GSPS fADC and represents the outcome of balancing between cluster identification efficiency and high speed performance. The algorithm can be implemented in electronics boards serving multiple fADC channels as an online preprocessing stage for drift chamber signals.

  14. Measuring social capital through multivariate analyses for the IQ-SC.

    Science.gov (United States)

    Campos, Ana Cristina Viana; Borges, Carolina Marques; Vargas, Andréa Maria Duarte; Gomes, Viviane Elisangela; Lucas, Simone Dutra; Ferreira e Ferreira, Efigênia

    2015-01-20

    Social capital can be viewed as a societal process that works toward the common good as well as toward the good of the collective based on trust, reciprocity, and solidarity. Our study aimed to present two multivariate statistical analyses to examine the formation of latent classes of social capital using the IQ-SC and to identify the most important factors in building an indicator of individual social capital. A cross-sectional study was conducted in 2009 among working adolescents supported by a Brazilian NGO. The sample consisted of 363 individuals, and data were collected using the World Bank Questionnaire for measuring social capital. First, the participants were grouped by a segmentation analysis using the Two Step Cluster method based on the Euclidean distance and the centroid criteria as the criteria for aggregate answers. Using specific weights for each item, discriminant analysis was used to validate the cluster analysis in an attempt to maximize the variance among the groups with respect to the variance within the clusters. "Community participation" and "trust in one's neighbors" contributed significantly to the development of the model with two distinct discriminant functions (p < 0.001). The majority of cases (95.0%) and non-cases (93.1%) were correctly classified by discriminant analysis. The two multivariate analyses (segmentation analysis and canonical discriminant analysis), used together, can be considered good choices for measuring social capital. Our results indicate that it is possible to form three social capital groups (low, medium and high) using the IQ-SC.

  15. Cluster-based adaptive power control protocol using Hidden Markov Model for Wireless Sensor Networks

    Science.gov (United States)

    Vinutha, C. B.; Nalini, N.; Nagaraja, M.

    2017-06-01

    This paper presents strategies for efficient, dynamic transmission power control that reduce packet drops, and hence the energy consumption, of power-hungry sensor nodes operating under the highly non-linear channel conditions of Wireless Sensor Networks. We also aim to prolong network lifetime and improve scalability by designing a cluster-based network structure. Specifically, we consider a weight-based clustering approach in which the minimum significant node is chosen as Cluster Head (CH), with the weight computed from distance, remaining residual battery power, and received signal strength (RSS). Transmission power control schemes that adapt to dynamic channel conditions are implemented using a Hidden Markov Model (HMM), where the probability transition matrix is formulated from the observed RSS measurements. Typically, the CH estimates the initial transmission power of its cluster members (CMs) from RSS using the HMM and broadcasts this value to its CMs to initialise their power values. If the CH then finds variations in the link quality and RSS of the CMs, it re-computes and optimises the transmission power level of the nodes using the HMM to avoid packet loss due to noise interference. Our simulation results show that the technique efficiently controls the power levels of sensing nodes and saves a significant quantity of energy for networks of different sizes.

  16. K-means-clustering-based fiber nonlinearity equalization techniques for 64-QAM coherent optical communication system.

    Science.gov (United States)

    Zhang, Junfeng; Chen, Wei; Gao, Mingyi; Shen, Gangxiang

    2017-10-30

    In this work, we proposed two k-means-clustering-based algorithms to mitigate the fiber nonlinearity for 64-quadrature amplitude modulation (64-QAM) signal, the training-sequence assisted k-means algorithm and the blind k-means algorithm. We experimentally demonstrated the proposed k-means-clustering-based fiber nonlinearity mitigation techniques in 75-Gb/s 64-QAM coherent optical communication system. The proposed algorithms have reduced clustering complexity and low data redundancy and they are able to quickly find appropriate initial centroids and select correctly the centroids of the clusters to obtain the global optimal solutions for large k value. We measured the bit-error-ratio (BER) performance of 64-QAM signal with different launched powers into the 50-km single mode fiber and the proposed techniques can greatly mitigate the signal impairments caused by the amplified spontaneous emission noise and the fiber Kerr nonlinearity and improve the BER performance.
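
    The training-sequence-assisted idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper names (qam64_constellation, centroids_from_training, kmeans_decide) are hypothetical, symbols are modelled as complex numbers, and every one of the 64 symbols is assumed to occur at least once in the training sequence. Initial centroids are taken as the mean received position of each known training symbol, and a few Lloyd (k-means) iterations then refine them on the payload.

```python
import numpy as np

def qam64_constellation():
    """Ideal 64-QAM grid: 8 x 8 points with levels -7, -5, ..., 7 on each axis."""
    levels = np.arange(-7, 8, 2)
    return (levels[:, None] + 1j * levels[None, :]).ravel()

def centroids_from_training(rx_train, tx_idx, k=64):
    """Initial centroids: mean received sample for each known transmitted symbol.

    Assumes every symbol index 0..k-1 appears at least once in tx_idx.
    """
    return np.array([rx_train[tx_idx == j].mean() for j in range(k)])

def kmeans_decide(rx, centroids, n_iter=10):
    """Assign payload samples to the nearest centroid, then update the centroids."""
    centroids = centroids.copy()
    for _ in range(n_iter):
        # Distance of every received sample to every centroid in the complex plane
        labels = np.argmin(np.abs(rx[:, None] - centroids[None, :]), axis=1)
        for j in range(len(centroids)):
            if np.any(labels == j):
                centroids[j] = rx[labels == j].mean()
    return labels, centroids
```

    Because the centroids follow the distorted constellation rather than the ideal grid, a power-dependent phase rotation that would defeat fixed decision boundaries can still be decoded in this toy setting.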

  17. Simultaneous Co-Clustering and Classification in Customers Insight

    Science.gov (United States)

    Anggistia, M.; Saefuddin, A.; Sartono, B.

    2017-04-01

    Building a predictive model on a heterogeneous dataset may cause problems such as imprecise parameter estimates and reduced prediction accuracy. Such problems can be addressed by segmenting the data into relatively homogeneous groups and then building a predictive model for each cluster. This strategy usually yields models that are simpler, more interpretable, and more actionable, without any loss in accuracy or reliability. This work concerns a marketing data set that records customer behaviour across products, with some variables describing the customer and the product as attributes. The basic idea of the approach is to perform co-clustering and classification simultaneously. The objective of this research is to analyse customers across product characteristics so that the marketing strategy can be implemented precisely.

  18. Group sequential designs for stepped-wedge cluster randomised trials.

    Science.gov (United States)

    Grayling, Michael J; Wason, James Ms; Mander, Adrian P

    2017-10-01

    The stepped-wedge cluster randomised trial design has received substantial attention in recent years. Although various extensions to the original design have been proposed, no guidance is available on the design of stepped-wedge cluster randomised trials with interim analyses. In an individually randomised trial setting, group sequential methods can provide notable efficiency gains and ethical benefits. We address this by discussing how established group sequential methodology can be adapted for stepped-wedge designs. Utilising the error spending approach to group sequential trial design, we detail the assumptions required for the determination of stepped-wedge cluster randomised trials with interim analyses. We consider early stopping for efficacy, futility, or efficacy and futility. We describe first how this can be done for any specified linear mixed model for data analysis. We then focus on one particular commonly utilised model and, using a recently completed stepped-wedge cluster randomised trial, compare the performance of several designs with interim analyses to the classical stepped-wedge design. Finally, the performance of a quantile substitution procedure for dealing with the case of unknown variance is explored. We demonstrate that the incorporation of early stopping in stepped-wedge cluster randomised trial designs could reduce the expected sample size under the null and alternative hypotheses by up to 31% and 22%, respectively, with no cost to the trial's type-I and type-II error rates. The use of restricted error maximum likelihood estimation was found to be more important than quantile substitution for controlling the type-I error rate. The addition of interim analyses into stepped-wedge cluster randomised trials could help guard against time-consuming trials conducted on poor performing treatments and also help expedite the implementation of efficacious treatments. In future, trialists should consider incorporating early stopping of some kind into

  19. Diametrical clustering for identifying anti-correlated gene clusters.

    Science.gov (United States)

    Dhillon, Inderjit S; Marcotte, Edward M; Roshan, Usman

    2003-09-01

    Clustering genes based upon their expression patterns allows us to predict gene function. Most existing clustering algorithms cluster genes together when their expression patterns show high positive correlation. However, it has been observed that genes whose expression patterns are strongly anti-correlated can also be functionally similar. Biologically, this is not unintuitive: genes responding to the same stimuli, regardless of the nature of the response, are more likely to operate in the same pathways. We present a new diametrical clustering algorithm that explicitly identifies anti-correlated clusters of genes. Our algorithm proceeds by iteratively (i) re-partitioning the genes and (ii) computing the dominant singular vector of each gene cluster, each singular vector serving as the prototype of a 'diametric' cluster. We empirically show the effectiveness of the algorithm in identifying diametrical or anti-correlated clusters. Testing the algorithm on yeast cell cycle data, fibroblast gene expression data, and DNA microarray data from yeast mutants reveals that opposed cellular pathways can be discovered with this method. We present systems whose mRNA expression patterns, and likely their functions, oppose the yeast ribosome and proteasome, along with evidence for the inverse transcriptional regulation of a number of cellular systems.
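
    The two alternating steps named in the abstract, re-partitioning and computing each cluster's dominant singular vector, can be sketched roughly as below. This is a hedged illustration rather than the published implementation: the function name, the farthest-first seeding, and the stopping rule are assumptions, and each gene profile is assumed to be non-constant so that it can be scaled to unit norm. A gene is assigned to the prototype with the largest squared dot product, so correlated and anti-correlated profiles end up in the same cluster.

```python
import numpy as np

def diametrical_clustering(X, k, n_iter=50):
    """X has shape (n_genes, n_conditions). Returns (labels, prototypes)."""
    X = np.asarray(X, dtype=float)
    # Center and scale each profile so that dot products are correlations
    X = X - X.mean(axis=1, keepdims=True)
    X = X / np.linalg.norm(X, axis=1, keepdims=True)

    # Assumed seeding: start from gene 0, then repeatedly add the gene that is
    # least (squared-)correlated with the prototypes chosen so far
    V = [X[0]]
    while len(V) < k:
        sim = np.max((X @ np.array(V).T) ** 2, axis=1)
        V.append(X[np.argmin(sim)])
    V = np.array(V)

    labels = np.argmax((X @ V.T) ** 2, axis=1)
    for _ in range(n_iter):
        # Step (ii): prototype = dominant right singular vector of the cluster
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:
                _, _, vt = np.linalg.svd(members, full_matrices=False)
                V[j] = vt[0]
        # Step (i): re-partition by squared correlation with each prototype
        new_labels = np.argmax((X @ V.T) ** 2, axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels, V
```

    Squaring the dot product is what makes the sign of the correlation irrelevant; replacing it with the plain dot product would recover an ordinary positive-correlation clustering.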

  20. Deployment Strategy for Car-Sharing Depots by Clustering Urban Traffic Big Data Based on Affinity Propagation

    Directory of Open Access Journals (Sweden)

    Zhihan Liu

    2018-01-01

    Full Text Available Car sharing is a type of car rental service in which consumers rent cars for short periods of time, often charged by the hour. The analysis of urban traffic big data is of great importance for determining the locations of depots for a car-sharing system. Taxi OD (Origin-Destination) data is a typical dataset of urban traffic. The volume of the data is so large that traditional data processing applications do not work well. In this paper, an optimization method that determines the depot locations by clustering taxi OD points with the AP (Affinity Propagation) clustering algorithm is presented. By analyzing the characteristics of the AP clustering algorithm, AP clustering has been optimized hierarchically based on administrative region segmentation. Considering the sparse similarity matrix of taxi OD points, the input parameters of AP clustering have been adapted. In the case study, we choose the OD pair information from Beijing’s taxi GPS trajectory data. The number and locations of depots are determined by clustering the OD points with the optimized AP clustering. We describe experimental results of our approach and compare it with the standard K-means method using quantitative and stationarity indices. Experiments on the real datasets show that the proposed method for determining car-sharing depots has superior performance.

  1. Analysis of genetic association using hierarchical clustering and cluster validation indices.

    Science.gov (United States)

    Pagnuco, Inti A; Pastore, Juan I; Abras, Guillermo; Brun, Marcel; Ballarin, Virginia L

    2017-10-01

    It is usually assumed that co-expressed genes suggest co-regulation in the underlying regulatory network. Determining sets of co-expressed genes is an important task, based on some criteria of similarity. This task is usually performed by clustering algorithms, where the genes are clustered into meaningful groups based on their expression values in a set of experiments. In this work, we propose a method to find sets of co-expressed genes, based on cluster validation indices as a measure of similarity for individual gene groups, and a combination of variants of hierarchical clustering to generate the candidate groups. We evaluated its ability to retrieve significant sets on simulated correlated and real genomics data, where the performance is measured based on its detection ability of co-regulated sets against a full search. Additionally, we analyzed the quality of the best ranked groups using an online bioinformatics tool that provides network information for the selected genes. Copyright © 2017 Elsevier Inc. All rights reserved.

  2. A Three-Step Spatial-Temporal Clustering Method for Human Activity Pattern Analysis

    Science.gov (United States)

    Huang, W.; Li, S.; Xu, S.

    2016-06-01

    How people move in cities and what they do in various locations at different times form human activity patterns. Human activity patterns play a key role in urban planning, traffic forecasting, public health and safety, emergency response, friend recommendation, and so on. Therefore, scholars from different fields, such as social science, geography, transportation, physics and computer science, have made great efforts in modelling and analysing human activity patterns or human mobility patterns. One of the essential tasks in such studies is to find the locations or places where individuals stay to perform some kind of activity before further activity pattern analysis. In the era of Big Data, the emergence of social media along with wearable devices enables human activity data to be collected more easily and efficiently. Furthermore, the dimension of the accessible human activity data has been extended from two or three (space or space-time) to four dimensions (space, time and semantics). More specifically, not only the location and time at which people stay are collected, but also what people "say" in a location at a given time can be obtained. The characteristics of these datasets shed new light on the analysis of human mobility, where new methodologies should accordingly be developed to handle them. Traditional methods such as neural networks, statistics and clustering have been applied to study human activity patterns using geosocial media data. Among them, clustering methods have been widely used to analyse spatiotemporal patterns. However, to the best of our knowledge, few clustering algorithms are specifically developed for handling datasets that contain spatial, temporal and semantic aspects all together. In this work, we propose a three-step human activity clustering method based on space, time and semantics to fill this gap. One year of Twitter data, posted in Toronto, Canada, is used to test the clustering-based method. The results show that the

  3. A ROBUST CLUSTER HEAD SELECTION BASED ON NEIGHBORHOOD CONTRIBUTION AND AVERAGE MINIMUM POWER FOR MANETs

    Directory of Open Access Journals (Sweden)

    S.Balaji

    2015-06-01

    Full Text Available A mobile ad hoc network is an instantaneous wireless network that is dynamic in nature. It supports single-hop and multihop communication. In this infrastructure-less network, clustering is a significant model for maintaining the topology of the network. The clustering process includes different phases such as cluster formation, cluster head selection, and cluster maintenance. Choosing the cluster head is important because the stability of the network depends on a well-organized and resourceful cluster head. A node with a larger number of neighbors can act as a link between the neighbor nodes, which further reduces the number of hops in multihop communication. Ideally, a node with more neighbors should also have enough energy to provide stability in the network. Hence these aspects demand the focus. In weight-based cluster head selection, closeness and the average minimum power required are considered for purging the ineligible nodes. The optimal set of nodes selected after purging compete to become cluster head, and the node with the maximum weight is selected as cluster head. A mathematical formulation is developed to show that the proposed method provides an optimum result. It is also suggested that the weight factor used in calculating the node weight should give precise importance to energy and node stability.

  4. Ecosystem health pattern analysis of urban clusters based on emergy synthesis: Results and implication for management

    International Nuclear Information System (INIS)

    Su, Meirong; Fath, Brian D.; Yang, Zhifeng; Chen, Bin; Liu, Gengyuan

    2013-01-01

    The evaluation of ecosystem health in urban clusters will help establish effective management that promotes sustainable regional development. To standardize the application of emergy synthesis and set pair analysis (EM–SPA) in ecosystem health assessment, a procedure for using EM–SPA models was established in this paper by combining the ability of emergy synthesis to reflect health status from a biophysical perspective with the ability of set pair analysis to describe extensive relationships among different variables. Based on the EM–SPA model, the relative health levels of selected urban clusters and their related ecosystem health patterns were characterized. The health states of three typical Chinese urban clusters – Jing-Jin-Tang, Yangtze River Delta, and Pearl River Delta – were investigated using the model. The results showed that the health status of the Pearl River Delta was relatively good; the health for the Yangtze River Delta was poor. As for the specific health characteristics, the Pearl River Delta and Yangtze River Delta urban clusters were relatively strong in Vigor, Resilience, and Urban ecosystem service function maintenance, while the Jing-Jin-Tang was relatively strong in organizational structure and environmental impact. Guidelines for managing these different urban clusters were put forward based on the analysis of the results of this study. - Highlights: • The use of integrated emergy synthesis and set pair analysis model was standardized. • The integrated model was applied on the scale of an urban cluster. • Health patterns of different urban clusters were compared. • Policy suggestions were provided based on the health pattern analysis

  5. Supplier Risk Assessment Based on Best-Worst Method and K-Means Clustering: A Case Study

    Directory of Open Access Journals (Sweden)

    Merve Er Kara

    2018-04-01

    Full Text Available Supplier evaluation and selection is one of the most critical strategic decisions for developing a competitive and sustainable organization. Companies have to consider supplier related risks and threats in their purchasing decisions. In today’s competitive and risky business environment, it is very important to work with reliable suppliers. This study proposes a clustering based approach to group suppliers based on their risk profile. Suppliers of a company in the heavy-machinery sector are assessed based on 17 qualitative and quantitative risk types. The weights of the criteria are determined by using the Best-Worst method. Four factors are extracted by applying Factor Analysis to the supplier risk data. Then k-means clustering algorithm is applied to group core suppliers of the company based on the four risk factors. Three clusters are created with different risk exposure levels. The interpretation of the results provides insights for risk management actions and supplier development programs to mitigate supplier risk.

  6. Classification of Two Class Motor Imagery Tasks Using Hybrid GA-PSO Based K-Means Clustering.

    Science.gov (United States)

    Suraj; Tiwari, Purnendu; Ghosh, Subhojit; Sinha, Rakesh Kumar

    2015-01-01

    Transferring the brain computer interface (BCI) from laboratory conditions to real-world applications requires the BCI to be applied asynchronously, without any time constraint. The high level of dynamism in the electroencephalogram (EEG) signal leads us to look toward evolutionary algorithms (EAs). Motivated by these two facts, in this work a hybrid GA-PSO based K-means clustering technique has been used to distinguish two-class motor imagery (MI) tasks. The proposed hybrid GA-PSO based K-means clustering is found to outperform genetic algorithm (GA) and particle swarm optimization (PSO) based K-means clustering techniques in terms of both accuracy and execution time. The shorter execution time of the hybrid GA-PSO technique makes it suitable for real-time BCI applications. Time frequency representation (TFR) techniques have been used to extract the features of the signal under investigation. TFR-based features are extracted, and the feature vector is formed relying on the concept of event related synchronization (ERS) and desynchronization (ERD).

  7. [Predicting Incidence of Hepatitis E in China using Fuzzy Time Series Based on Fuzzy C-Means Clustering Analysis].

    Science.gov (United States)

    Luo, Yi; Zhang, Tao; Li, Xiao-song

    2016-05-01

    To explore the application of a fuzzy time series model based on fuzzy c-means clustering in forecasting the monthly incidence of Hepatitis E in mainland China. A predictive model (fuzzy time series method based on fuzzy c-means clustering) was developed using Hepatitis E incidence data in mainland China between January 2004 and July 2014. The incidence data from August 2014 to November 2014 were used to test the fitness of the predictive model. The forecasting results were compared with those from traditional fuzzy time series models. The fuzzy time series model based on fuzzy c-means clustering had a mean squared error (MSE) of 0.0011 for fitting and 6.9775 × 10⁻⁴ for forecasting, compared with 0.0017 and 0.0014 for the traditional forecasting model. The results indicate that the fuzzy time series model based on fuzzy c-means clustering has a better performance in forecasting incidence of Hepatitis E.
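
    The clustering step that such a model relies on, fuzzy c-means on a one-dimensional series, can be sketched as below. This covers only the standard fuzzy c-means updates, not the authors' full forecasting model; the function names, the evenly spaced initial centers, the fuzzifier m = 2, and the small constant added to distances are all assumptions made for illustration.

```python
import numpy as np

def fcm_memberships(x, centers, m=2.0):
    """Membership of each value in each cluster; every row sums to 1."""
    # Small constant avoids division by zero when a value sits on a center
    dist = np.abs(x[:, None] - centers[None, :]) + 1e-12
    inv = dist ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

def fuzzy_c_means(x, c, m=2.0, n_iter=200, tol=1e-8):
    """Standard fuzzy c-means on 1-D data. Returns (centers, memberships)."""
    x = np.asarray(x, dtype=float)
    # Assumed initialisation: centers spread evenly over the data range
    centers = np.linspace(x.min(), x.max(), c)
    for _ in range(n_iter):
        um = fcm_memberships(x, centers, m) ** m
        # Each center is the mean of the data weighted by membership^m
        new_centers = (um * x[:, None]).sum(axis=0) / um.sum(axis=0)
        shift = np.max(np.abs(new_centers - centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, fcm_memberships(x, centers, m)
```

    In a fuzzy time series setting, the resulting centers would typically define the fuzzy intervals of the universe of discourse, with each observation fuzzified according to its largest membership.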

  8. 3D Building Models Segmentation Based on K-Means++ Cluster Analysis

    Science.gov (United States)

    Zhang, C.; Mao, B.

    2016-10-01

    3D mesh model segmentation has been drawing increasing attention in the digital geometry processing field in recent years. The original 3D mesh model needs to be divided into separate meaningful parts or surface patches based on certain standards to support reconstruction, compression, texture mapping, model retrieval, etc. Therefore, segmentation is a key problem in 3D mesh model processing. In this paper, we propose a method to segment Collada (a type of mesh model) 3D building models into meaningful parts using cluster analysis. Common clustering methods segment 3D mesh models by K-means, whose performance heavily depends on the randomized initial seed points (i.e., centroids); different randomized centroids can produce quite different results. Therefore, we improved the existing method and used the K-means++ clustering algorithm to solve this problem. Our experiments show that K-means++ improves both the speed and the accuracy of K-means, and achieves good and meaningful results.
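
    The difference between plain K-means and K-means++ lies only in how the initial centroids are chosen. The sketch below shows the standard K-means++ seeding followed by ordinary Lloyd iterations; it is a generic illustration on point coordinates, not the paper's mesh-segmentation pipeline, and the function names and the assumption that the data contain at least k distinct points are ours.

```python
import numpy as np

def kmeans_pp_init(points, k, rng):
    """K-means++ seeding: each new centroid is drawn with probability
    proportional to its squared distance from the nearest centroid so far."""
    centroids = [points[rng.integers(len(points))]]
    for _ in range(1, k):
        diff = points[:, None, :] - np.array(centroids)[None, :, :]
        d2 = (diff ** 2).sum(axis=2).min(axis=1)
        # Assumes at least k distinct points, so d2.sum() is positive
        centroids.append(points[rng.choice(len(points), p=d2 / d2.sum())])
    return np.array(centroids)

def kmeans(points, k, n_iter=100, seed=0):
    """Lloyd's algorithm started from K-means++ seeds. Returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    points = np.asarray(points, dtype=float)
    centroids = kmeans_pp_init(points, k, rng)
    for _ in range(n_iter):
        d2 = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Keep the old centroid if a cluster happens to be empty
        new_centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids
```

    Because far-away points are more likely to be picked as seeds, the initial centroids tend to be spread across the data, which is the property the abstract credits for more stable results than uniformly random seeding.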

  9. 3D BUILDING MODELS SEGMENTATION BASED ON K-MEANS++ CLUSTER ANALYSIS

    Directory of Open Access Journals (Sweden)

    C. Zhang

    2016-10-01

    Full Text Available 3D mesh model segmentation has been drawing increasing attention in the digital geometry processing field in recent years. The original 3D mesh model needs to be divided into separate meaningful parts or surface patches based on certain standards to support reconstruction, compression, texture mapping, model retrieval, etc. Therefore, segmentation is a key problem in 3D mesh model processing. In this paper, we propose a method to segment Collada (a type of mesh model) 3D building models into meaningful parts using cluster analysis. Common clustering methods segment 3D mesh models by K-means, whose performance heavily depends on the randomized initial seed points (i.e., centroids); different randomized centroids can produce quite different results. Therefore, we improved the existing method and used the K-means++ clustering algorithm to solve this problem. Our experiments show that K-means++ improves both the speed and the accuracy of K-means, and achieves good and meaningful results.

  10. CytoCluster: A Cytoscape Plugin for Cluster Analysis and Visualization of Biological Networks.

    Science.gov (United States)

    Li, Min; Li, Dongyan; Tang, Yu; Wu, Fangxiang; Wang, Jianxin

    2017-08-31

    Nowadays, cluster analysis of biological networks has become one of the most important approaches to identifying functional modules as well as predicting protein complexes and network biomarkers. Furthermore, the visualization of clustering results is crucial to display the structure of biological networks. Here we present CytoCluster, a Cytoscape plugin integrating six clustering algorithms, HC-PIN (Hierarchical Clustering algorithm in Protein Interaction Networks), OH-PIN (identifying Overlapping and Hierarchical modules in Protein Interaction Networks), IPCA (Identifying Protein Complex Algorithm), ClusterONE (Clustering with Overlapping Neighborhood Expansion), DCU (Detecting Complexes based on Uncertain graph model), IPC-MCE (Identifying Protein Complexes based on Maximal Complex Extension), and BinGO (the Biological networks Gene Ontology) function. Users can select different clustering algorithms according to their requirements. The main function of these six clustering algorithms is to detect protein complexes or functional modules. In addition, BinGO is used to determine which Gene Ontology (GO) categories are statistically overrepresented in a set of genes or a subgraph of a biological network. CytoCluster can be easily expanded, so that more clustering algorithms and functions can be added to this plugin. Since it was created in July 2013, CytoCluster has been downloaded more than 9700 times in the Cytoscape App store and has already been applied to the analysis of different biological networks. CytoCluster is available from http://apps.cytoscape.org/apps/cytocluster.

  11. Searching remote homology with spectral clustering with symmetry in neighborhood cluster kernels.

    Directory of Open Access Journals (Sweden)

    Ujjwal Maulik

    Full Text Available Remote homology detection among proteins utilizing only the unlabelled sequences is a central problem in comparative genomics. The existing cluster kernel methods based on neighborhoods and profiles and the Markov clustering algorithms are currently the most popular methods for protein family recognition. The deviation from random walks with inflation or dependency on hard threshold in similarity measure in those methods requires an enhancement for homology detection among multi-domain proteins. We propose to combine spectral clustering with neighborhood kernels in Markov similarity for enhancing sensitivity in detecting homology independent of "recent" paralogs. The spectral clustering approach with new combined local alignment kernels more effectively exploits the unsupervised protein sequences globally reducing inter-cluster walks. When combined with the corrections based on modified symmetry based proximity norm deemphasizing outliers, the technique proposed in this article outperforms other state-of-the-art cluster kernels among all twelve implemented kernels. The comparison with the state-of-the-art string and mismatch kernels also shows the superior performance scores provided by the proposed kernels. A similar performance improvement is also found over an existing large dataset. Therefore the proposed spectral clustering framework over combined local alignment kernels with modified symmetry based correction achieves superior performance for unsupervised remote homolog detection even in multi-domain and promiscuous domain proteins from Genolevures database families with better biological relevance. Source code is available upon request (sarkar@labri.fr).

  12. A clustering based method to evaluate soil corrosivity for pipeline external integrity management

    International Nuclear Information System (INIS)

    Yajima, Ayako; Wang, Hui; Liang, Robert Y.; Castaneda, Homero

    2015-01-01

    One important category of transportation infrastructure is underground pipelines. Corrosion of these buried pipeline systems may cause pipeline failures with the attendant hazards of property loss and fatalities. Therefore, developing the capability to estimate the soil corrosivity is important for designing and preserving materials and for risk assessment. The deterioration rate of metal is highly influenced by the physicochemical characteristics of a material and the environment of its surroundings. In this study, the field data obtained from the southeast region of Mexico was examined using various data mining techniques to determine the usefulness of these techniques for clustering soil corrosivity level. Specifically, the soil was classified into different corrosivity level clusters by k-means and Gaussian mixture model (GMM). In terms of physical space, GMM shows better separability; therefore, the distributions of the material loss of the buried petroleum pipeline walls were estimated via the empirical density within GMM clusters. The soil corrosivity levels of the clusters were determined based on the medians of metal loss. The proposed clustering method was demonstrated to be capable of classifying the soil into different levels of corrosivity severity. - Highlights: • The clustering approach is applied to the data extracted from a real-life pipeline system. • Soil properties in the right-of-way are analyzed via clustering techniques to assess corrosivity. • GMM is selected as the preferred method for detecting the hidden pattern of in-situ data. • K–W test is performed for significant difference of corrosivity level between clusters

  13. The combination of a histogram-based clustering algorithm and support vector machine for the diagnosis of osteoporosis

    International Nuclear Information System (INIS)

    Heo, Min Suk; Kavitha, Muthu Subash; Asano, Akira; Taguchi, Akira

    2013-01-01

    To prevent low bone mineral density (BMD), that is, osteoporosis, in postmenopausal women, it is essential to diagnose osteoporosis more precisely. This study presented an automatic approach utilizing a histogram-based automatic clustering (HAC) algorithm with a support vector machine (SVM) to analyse dental panoramic radiographs (DPRs) and thus improve diagnostic accuracy by identifying postmenopausal women with low BMD or osteoporosis. We integrated our newly proposed HAC algorithm with our previously designed computer-aided diagnosis system. The extracted moment-based features (mean, variance, skewness, and kurtosis) of the mandibular cortical width were employed for the radial basis function (RBF) SVM classifier. We also compared the diagnostic efficacy of the SVM model with the back propagation (BP) neural network model. In this study, DPRs and BMD measurements of 100 postmenopausal women patients (aged >50 years), with no previous record of osteoporosis, were randomly selected for inclusion. The accuracy, sensitivity, and specificity of the BMD measurements using our HAC-SVM model to identify women with low BMD were 93.0% (88.0%-98.0%), 95.8% (91.9%-99.7%) and 86.6% (79.9%-93.3%), respectively, at the lumbar spine; and 89.0% (82.9%-95.1%), 96.0% (92.2%-99.8%) and 84.0% (76.8%-91.2%), respectively, at the femoral neck. Our experimental results suggest that the proposed HAC-SVM model combination applied to DPRs could be useful to assist dentists in early diagnosis and help to reduce the morbidity and mortality associated with low BMD and osteoporosis.
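
    The bracketed ranges after each percentage read like 95% confidence intervals. As a hedged illustration only, the sketch below computes a normal-approximation (Wald) interval for a proportion; whether the authors used this particular interval is an assumption, and the function name is hypothetical. With the reported accuracy of 93.0% and the 100 patients stated in the abstract, it gives a range that rounds to 88.0%-98.0%.

```python
import math

def wald_interval(p, n, z=1.96):
    """Normal-approximation confidence interval for a proportion p
    estimated from n samples (z = 1.96 corresponds to roughly 95%)."""
    half_width = z * math.sqrt(p * (1.0 - p) / n)
    return p - half_width, p + half_width

# Reported lumbar-spine accuracy and sample size taken from the abstract.
low, high = wald_interval(0.93, 100)
print(f"{low * 100:.1f}% - {high * 100:.1f}%")
```

    The Wald interval is known to behave poorly for proportions close to 0 or 1 and for small samples, so it is shown here only because it is the simplest form to check by hand.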

  14. Bipartite entanglement in continuous variable cluster states

    Energy Technology Data Exchange (ETDEWEB)

    Cable, Hugo; Browne, Daniel E, E-mail: cqthvc@nus.edu.s, E-mail: d.browne@ucl.ac.u [Centre for Quantum Technologies, National University of Singapore, 3 Science Drive 2, Singapore 117543 (Singapore)

    2010-11-15

    A study of the entanglement properties of Gaussian cluster states, proposed as a universal resource for continuous variable (CV) quantum computing, is presented in this paper. The central aim is to compare mathematically idealized cluster states defined using quadrature eigenstates, which have infinite squeezing and cannot exist in nature, with Gaussian approximations that are experimentally accessible. Adopting widely used definitions, we first review the key concepts, by analysing a process of teleportation along a CV quantum wire in the language of matrix product states. Next we consider the bipartite entanglement properties of the wire, providing analytic results. We proceed to grid cluster states, which are universal for the qubit case. To extend our analysis of the bipartite entanglement, we adopt the entropic-entanglement width, a specialized entanglement measure introduced recently by Van den Nest et al (2006 Phys. Rev. Lett. 97 150504), adapting their definition to the CV context. Finally, we consider the effects of photonic loss, extending our arguments to mixed states. Cumulatively our results point to key differences in the properties of idealized and Gaussian cluster states. Even modest loss rates are found to strongly limit the amount of entanglement. We discuss the implications for the potential of CV analogues for measurement-based quantum computation.

  15. Genetic algorithm based two-mode clustering of metabolomics data

    NARCIS (Netherlands)

    Hageman, J.A.; van den Berg, R.A.; Westerhuis, J.A.; van der Werf, M.J.; Smilde, A.K.

    2008-01-01

    Metabolomics and other omics tools are generally characterized by large data sets with many variables obtained under different environmental conditions. Clustering methods and more specifically two-mode clustering methods are excellent tools for analyzing this type of data. Two-mode clustering

  16. Challenges in microarray class discovery: a comprehensive examination of normalization, gene selection and clustering

    Directory of Open Access Journals (Sweden)

    Landfors Mattias

    2010-10-01

    Full Text Available Abstract Background: Cluster analysis, and in particular hierarchical clustering, is widely used to extract information from gene expression data. The aim is to discover new classes, or sub-classes, of either individuals or genes. Performing a cluster analysis commonly involves decisions on how to handle missing values, standardize the data and select genes. In addition, pre-processing, involving various types of filtration and normalization procedures, can have an effect on the ability to discover biologically relevant classes. Here we consider cluster analysis in a broad sense and perform a comprehensive evaluation that covers several aspects of cluster analyses, including normalization. Result: We evaluated 2780 cluster analysis methods on seven publicly available 2-channel microarray data sets with common reference designs. Each cluster analysis method differed in data normalization (5 normalizations were considered), missing value imputation (2), standardization of data (2), gene selection (19) or clustering method (11). The cluster analyses are evaluated using known classes, such as cancer types, and the adjusted Rand index. The performances of the different analyses vary between the data sets and it is difficult to give general recommendations. However, normalization, gene selection and clustering method are all variables that have a significant impact on the performance. In particular, gene selection is important and it is generally necessary to include a relatively large number of genes in order to get good performance. Selecting genes with high standard deviation or using principal component analysis are shown to be the preferred gene selection methods. Hierarchical clustering using Ward's method, k-means clustering and Mclust are the clustering methods considered in this paper that achieve the highest adjusted Rand. Normalization can have a significant positive impact on the ability to cluster individuals, and there are indications that

  17. Challenges in microarray class discovery: a comprehensive examination of normalization, gene selection and clustering

    Science.gov (United States)

    2010-01-01

    Background: Cluster analysis, and in particular hierarchical clustering, is widely used to extract information from gene expression data. The aim is to discover new classes, or sub-classes, of either individuals or genes. Performing a cluster analysis commonly involves decisions on how to handle missing values, standardize the data and select genes. In addition, pre-processing, involving various types of filtration and normalization procedures, can have an effect on the ability to discover biologically relevant classes. Here we consider cluster analysis in a broad sense and perform a comprehensive evaluation that covers several aspects of cluster analyses, including normalization. Result: We evaluated 2780 cluster analysis methods on seven publicly available 2-channel microarray data sets with common reference designs. Each cluster analysis method differed in data normalization (5 normalizations were considered), missing value imputation (2), standardization of data (2), gene selection (19) or clustering method (11). The cluster analyses are evaluated using known classes, such as cancer types, and the adjusted Rand index. The performances of the different analyses vary between the data sets and it is difficult to give general recommendations. However, normalization, gene selection and clustering method are all variables that have a significant impact on the performance. In particular, gene selection is important and it is generally necessary to include a relatively large number of genes in order to get good performance. Selecting genes with high standard deviation or using principal component analysis are shown to be the preferred gene selection methods. Hierarchical clustering using Ward's method, k-means clustering and Mclust are the clustering methods considered in this paper that achieve the highest adjusted Rand. Normalization can have a significant positive impact on the ability to cluster individuals, and there are indications that background correction is
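
    The adjusted Rand index used above to score a clustering against known classes can be computed from pair counts. The sketch below is a plain-Python illustration of the standard (Hubert and Arabie) formula; the function name and the example labels are hypothetical and not taken from the study.

```python
from collections import Counter
from math import comb

def adjusted_rand_index(labels_true, labels_pred):
    """Chance-corrected agreement between two labelings of the same items.
    1.0 means identical partitions; values near 0 mean chance-level agreement."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("labelings must have the same length")
    total_pairs = comb(len(labels_true), 2)
    if total_pairs == 0:
        return 1.0
    # Pairs placed together in both labelings, and in each labeling separately.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    together_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    together_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())
    expected = together_true * together_pred / total_pairs
    maximum = (together_true + together_pred) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. both labelings put everything in one cluster
    return (together_both - expected) / (maximum - expected)
```

    The index depends only on which items are grouped together, so relabeling the clusters (for example swapping 0 and 1) leaves the score unchanged.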

  18. Discrimination of neutrons and γ-rays in liquid scintillators based on fuzzy c-means clustering

    International Nuclear Information System (INIS)

    Luo Xiaoliang; Liu Guofu; Yang Jun

    2011-01-01

    A novel method based on fuzzy c-means (FCM) clustering for the discrimination of neutrons and γ-rays in liquid scintillators was presented. The neutrons and γ-rays in the environment were firstly acquired by the portable real-time n-γ discriminator and then discriminated using fuzzy c-means clustering and pulse gradient analysis, respectively. By comparing the results with each other, it is shown that the discrimination results of the fuzzy c-means clustering are consistent with those of the pulse gradient analysis. The decrease in uncertainty and the improvement in discrimination performance of the fuzzy c-means clustering were also observed. (authors)
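
    For readers unfamiliar with fuzzy c-means, the two alternating update rules of the standard algorithm can be sketched as below. This is a generic illustration under stated assumptions, not the authors' discriminator: the fuzzifier m = 2, the fixed iteration count, the function names and any feature values passed in are all hypothetical.

```python
import math

def memberships(points, centers, m=2.0):
    """Membership of each point in each cluster; every row sums to 1."""
    exponent = 2.0 / (m - 1.0)
    table = []
    for x in points:
        # Clamp distances so a point sitting exactly on a center is handled.
        d = [max(math.dist(x, c), 1e-12) for c in centers]
        table.append([1.0 / sum((di / dj) ** exponent for dj in d) for di in d])
    return table

def fuzzy_c_means(points, centers, m=2.0, iters=50):
    """Alternate membership and center updates for a fixed number of rounds."""
    dims = len(points[0])
    for _ in range(iters):
        u = memberships(points, centers, m)
        new_centers = []
        for i in range(len(centers)):
            weights = [row[i] ** m for row in u]
            total = sum(weights)
            new_centers.append(tuple(
                sum(w * x[d] for w, x in zip(weights, points)) / total
                for d in range(dims)))
        centers = new_centers
    return centers, memberships(points, centers, m)
```

    Unlike K-means, each pulse keeps a graded membership in both the neutron-like and the γ-like cluster, so events near the boundary can be flagged as ambiguous rather than forced into one class.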

  19. On-line learning from clustered input examples

    NARCIS (Netherlands)

    Riegler, Peter; Biehl, Michael; Solla, Sara A.; Marangi, Carmela; Marinaro, Maria; Tagliaferri, Roberto

    1996-01-01

    We analyse on-line learning of a linearly separable rule with a simple perceptron. Example inputs are taken from two overlapping clusters of data and the rule is defined through a teacher vector which is in general not aligned with the connection line of the cluster centers. We find that the Hebb

  20. ADVANCED CLUSTER BASED IMAGE SEGMENTATION

    Directory of Open Access Journals (Sweden)

    D. Kesavaraja

    2011-11-01

    Full Text Available This paper presents efficient and portable implementations of an image segmentation technique that uses a faster variant of the conventional connected components algorithm, which we call Parallel Components. Many doctors today rely on image segmentation as a service for various purposes and expect such a system to run quickly and securely. Image segmentation algorithms, however, are usually slow, and despite ongoing research, conventional segmentation algorithms may not run fast enough. We therefore propose a cluster computing environment for parallel image segmentation to deliver results faster. This paper describes a real-time implementation of distributed image segmentation on a cluster of nodes. We demonstrate the effectiveness and feasibility of our method on a set of medical CT scan images. Our general framework is a single address space, distributed memory programming model. We use efficient techniques for distributing and coalescing data as well as efficient combinations of task and data parallelism. The image segmentation algorithm makes use of an efficient cluster process that uses a novel approach for parallel merging. Our experimental results are consistent with the theoretical analysis. The method provides faster execution times for segmentation than the conventional method. Our test data are CT scan images from a medical database. More efficient implementations of image segmentation will likely result in even faster execution times.

  1. An Energy-Efficient Spectrum-Aware Reinforcement Learning-Based Clustering Algorithm for Cognitive Radio Sensor Networks.

    Science.gov (United States)

    Mustapha, Ibrahim; Mohd Ali, Borhanuddin; Rasid, Mohd Fadlee A; Sali, Aduwati; Mohamad, Hafizal

    2015-08-13

    It is well-known that clustering partitions a network into logical groups of nodes in order to achieve energy efficiency and to enhance dynamic channel access in cognitive radio through cooperative sensing. While the topic of energy efficiency has been well investigated in conventional wireless sensor networks, the latter has not been extensively explored. In this paper, we propose a reinforcement learning-based spectrum-aware clustering algorithm that allows a member node to learn the energy and cooperative sensing costs for neighboring clusters to achieve an optimal solution. Each member node selects an optimal cluster that satisfies pairwise constraints, minimizes network energy consumption and enhances channel sensing performance through an exploration technique. We first model the network energy consumption and then determine the optimal number of clusters for the network. The problem of selecting an optimal cluster is formulated as a Markov Decision Process (MDP) in the algorithm and the obtained simulation results show convergence, learning and adaptability of the algorithm to a dynamic environment towards achieving an optimal solution. Performance comparisons of our algorithm with the Groupwise Spectrum Aware (GWSA)-based algorithm in terms of Sum of Square Error (SSE), complexity, network energy consumption and probability of detection indicate improved performance from the proposed approach. The results further reveal that energy savings of 9% and a significant Primary User (PU) detection improvement can be achieved with the proposed approach.

  2. A hybrid method based on a new clustering technique and multilayer perceptron neural networks for hourly solar radiation forecasting

    International Nuclear Information System (INIS)

    Azimi, R.; Ghayekhloo, M.; Ghofrani, M.

    2016-01-01

    Highlights: • A novel clustering approach is proposed based on the data transformation approach. • A novel cluster selection method based on correlation analysis is presented. • The proposed hybrid clustering approach leads to deep learning for MLPNN. • A hybrid forecasting method is developed to predict solar radiations. • The evaluation results show superior performance of the proposed forecasting model. - Abstract: Accurate forecasting of renewable energy sources plays a key role in their integration into the grid. This paper proposes a hybrid solar irradiance forecasting framework using a Transformation based K-means algorithm, named TB K-means, to increase the forecast accuracy. The proposed clustering method is a combination of a new initialization technique, K-means algorithm and a new gradual data transformation approach. Unlike the other K-means based clustering methods which are not capable of providing a fixed and definitive answer due to the selection of different cluster centroids for each run, the proposed clustering provides constant results for different runs of the algorithm. The proposed clustering is combined with a time-series analysis, a novel cluster selection algorithm and a multilayer perceptron neural network (MLPNN) to develop the hybrid solar radiation forecasting method for different time horizons (1 h ahead, 2 h ahead, …, 48 h ahead). The performance of the proposed TB K-means clustering is evaluated using several different datasets and compared with different variants of K-means algorithm. Solar datasets with different solar radiation characteristics are also used to determine the accuracy and processing speed of the developed forecasting method with the proposed TB K-means and other clustering techniques. The results of direct comparison with other well-established forecasting models demonstrate the superior performance of the proposed hybrid forecasting method. Furthermore, a comparative analysis with the benchmark solar

  3. Global myeloma research clusters, output, and citations: a bibliometric mapping and clustering analysis.

    Directory of Open Access Journals (Sweden)

    Jens Peter Andersen

    Full Text Available International collaborative research is a mechanism for improving the development of disease-specific therapies and for improving health at the population level. However, limited data are available to assess the trends in research output related to orphan diseases. We used bibliometric mapping and clustering methods to illustrate the level of fragmentation in myeloma research and the development of collaborative efforts. Publication data from Thomson Reuters Web of Science were retrieved for 2005-2009 and followed until 2013. We created a database of multiple myeloma publications, and we analysed impact and co-authorship density to identify scientific collaborations, developments, and international key players over time. The global annual publication volume for studies on multiple myeloma increased from 1,144 in 2005 to 1,628 in 2009, which represents a 43% increase. This increase is high compared to the 24% and 14% increases observed for lymphoma and leukaemia. The major proportion (>90%) of publications was from the US and EU over the study period. The output and impact in terms of citations identified several successful groups with a large number of intra-cluster collaborations in the US and EU. The US-based myeloma clusters clearly stand out as the most productive and highly cited, and the European Myeloma Network members exhibited a doubling of collaborative publications from 2005 to 2009, still increasing up to 2013. Multiple myeloma research output has increased substantially in the past decade. The fragmented European myeloma research activities based on national or regional groups are progressing, but they require a broad range of targeted research investments to improve multiple myeloma health care.

  4. The relationship between supplier networks and industrial clusters: an analysis based on the cluster mapping method

    Directory of Open Access Journals (Sweden)

    Ichiro IWASAKI

    2010-06-01

    Full Text Available Michael Porter’s concept of competitive advantages emphasizes the importance of regional cooperation of various actors in order to gain competitiveness on globalized markets. Foreign investors may play an important role in forming such cooperation networks. Their local suppliers tend to concentrate regionally. They can form, together with local institutions of education, research, financial and other services, development agencies, the nucleus of cooperative clusters. This paper deals with the relationship between supplier networks and clusters. Two main issues are discussed in more detail: the interest of multinational companies in entering regional clusters and the spillover effects that may stem from their participation. After the discussion on the theoretical background, the paper introduces a relatively new analytical method: “cluster mapping” - a method that can spot regional hot spots of specific economic activities with cluster building potential. Experience with the method was gathered in the US and in the European Union. After the discussion on the existing empirical evidence, the authors introduce their own cluster mapping results, which they obtained by using a refined version of the original methodology.

  5. Automated three-dimensional morphology-based clustering of human erythrocytes with regular shapes: stomatocytes, discocytes, and echinocytes

    Science.gov (United States)

    Ahmadzadeh, Ezat; Jaferzadeh, Keyvan; Lee, Jieun; Moon, Inkyu

    2017-07-01

    We present unsupervised clustering methods for automatic grouping of human red blood cells (RBCs) extracted from RBC quantitative phase images obtained by digital holographic microscopy into three RBC clusters with regular shapes, including biconcave, stomatocyte, and sphero-echinocyte. We select features related to the RBC profile and morphology, such as RBC average thickness, sphericity coefficient, and mean corpuscular volume, and clustering methods, including density-based spatial clustering of applications with noise (DBSCAN), k-medoids, and k-means, are applied to the set of morphological features. The clustering results of RBCs using a set of three-dimensional features are compared against a set of two-dimensional features. Our experimental results indicate that by utilizing the introduced set of features, two groups of biconcave RBCs and old RBCs (suffering from the sphero-echinocyte process) can be perfectly clustered. In addition, by increasing the number of clusters, the three RBC types can be effectively clustered in an automated unsupervised manner with high accuracy. The performance evaluation of the clustering techniques reveals that they can assist hematologists in further diagnosis.

  6. Photo-induced transformation process at gold clusters-semiconductor interface: Implications for the complexity of gold clusters-based photocatalysis

    Science.gov (United States)

    Liu, Siqi; Xu, Yi-Jun

    2016-03-01

    The recent thrust in utilizing atomically precise organic ligands protected gold clusters (Au clusters) as photosensitizer coupled with semiconductors for nano-catalysts has led to the claims of improved efficiency in photocatalysis. Nonetheless, the influence of photo-stability of organic ligands protected-Au clusters at the Au/semiconductor interface on the photocatalytic properties remains rather elusive. Taking Au clusters-TiO2 composites as a prototype, we for the first time demonstrate the photo-induced transformation of small molecular-like Au clusters to larger metallic Au nanoparticles under different illumination conditions, which leads to the diverse photocatalytic reaction mechanism. This transformation process undergoes a diffusion/aggregation mechanism accompanied with the onslaught of Au clusters by active oxygen species and holes resulting from photo-excited TiO2 and Au clusters. However, such Au clusters aggregation can be efficiently inhibited by tuning reaction conditions. This work would trigger rational structural design and fine condition control of organic ligands protected-metal clusters-semiconductor composites for diverse photocatalytic applications with long-term photo-stability.

  7. ON THE ACCURACY OF WEAK-LENSING CLUSTER MASS RECONSTRUCTIONS

    International Nuclear Information System (INIS)

    Becker, Matthew R.; Kravtsov, Andrey V.

    2011-01-01

    We study the bias and scatter in mass measurements of galaxy clusters resulting from fitting a spherically symmetric Navarro, Frenk, and White model to the reduced tangential shear profile measured in weak-lensing (WL) observations. The reduced shear profiles are generated for ∼10^4 cluster-sized halos formed in a ΛCDM cosmological N-body simulation of a 1 h^-1 Gpc box. In agreement with previous studies, we find that the scatter in the WL masses derived using this fitting method has irreducible contributions from the triaxial shapes of cluster-sized halos and uncorrelated large-scale matter projections along the line of sight. Additionally, we find that correlated large-scale structure within several virial radii of clusters contributes a smaller, but nevertheless significant, amount to the scatter. The intrinsic scatter due to these physical sources is ∼20% for massive clusters and can be as high as ∼30% for group-sized systems. For current, ground-based observations, however, the total scatter should be dominated by shape noise from the background galaxies used to measure the shear. Importantly, we find that WL mass measurements can have a small, ∼5%-10%, but non-negligible amount of bias. Given that WL measurements of cluster masses are a powerful way to calibrate cluster mass-observable relations for precision cosmological constraints, we strongly emphasize that a robust calibration of the bias requires detailed simulations that include more observational effects than we consider here. Such a calibration exercise needs to be carried out for each specific WL mass estimation method, as the details of the method determine in part the expected scatter and bias. We present an iterative method for estimating mass M_500c that can eliminate the bias for analyses of ground-based data.

  8. Multi-Optimisation Consensus Clustering

    Science.gov (United States)

    Li, Jian; Swift, Stephen; Liu, Xiaohui

    Ensemble Clustering has been developed to provide an alternative way of obtaining more stable and accurate clustering results. It aims to avoid the biases of individual clustering algorithms. However, it is still a challenge to develop an efficient and robust method for Ensemble Clustering. Based on an existing ensemble clustering method, Consensus Clustering (CC), this paper introduces an advanced Consensus Clustering algorithm called Multi-Optimisation Consensus Clustering (MOCC), which utilises an optimised Agreement Separation criterion and a Multi-Optimisation framework to improve the performance of CC. Fifteen different data sets are used for evaluating the performance of MOCC. The results reveal that MOCC can generate more accurate clustering results than the original CC algorithm.

  9. Cluster based on sequence comparison of homologous proteins of 95 organism species - Gclust Server | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available List Contact us Gclust Server Cluster based on sequence comparison of homologous proteins of 95 organism spe...cies Data detail Data name Cluster based on sequence comparison of homologous proteins of 95 organism specie...istory of This Database Site Policy | Contact Us Cluster based on sequence compariso

  10. Clustering and firm performance in project-based industries : the case of the global video game industry, 1972-2007

    NARCIS (Netherlands)

    Vaan, de M.; Boschma, R.A.; Frenken, K.

    2013-01-01

    Explanations of spatial clustering based on localization externalities are being questioned by recent empirical evidence showing that firms in clusters do not outperform firms outside clusters. We propose that these findings may be driven by the particularities of the industrial settings chosen in

  11. Clustering and firm performance in project-based industries: the case of the global video game industry, 1972-2007

    NARCIS (Netherlands)

    Vaan, M. de; Boschma, R.; Frenken, K.

    2013-01-01

    Explanations of spatial clustering based on localization externalities are being questioned by recent empirical evidence showing that firms in clusters do not outperform firms outside clusters. We propose that these findings may be driven by the particularities of the industrial settings chosen

  12. An AK-LDMeans algorithm based on image clustering

    Science.gov (United States)

    Chen, Huimin; Li, Xingwei; Zhang, Yongbin; Chen, Nan

    2018-03-01

    Clustering is an effective analytical technique for mining value from unlabeled data. Its ultimate goal is to label unclassified data quickly and correctly. We use the roadmap for current image processing as the experimental background. In this paper, we propose an AK-LDMeans algorithm that automatically determines the K value by designing the Kcost fold line, and then uses the long-distance high-density method to select the clustering centers in place of the traditional initial clustering center selection method, which further improves the efficiency and accuracy of the traditional K-Means algorithm. The experimental results are compared with those of current clustering algorithms. The algorithm can provide an effective reference in the fields of image processing, machine vision and data mining.

  13. Electronic structure and properties of designer clusters and cluster-assemblies

    International Nuclear Information System (INIS)

    Khanna, S.N.; Jena, P.

    1995-01-01

    Using self-consistent calculations based on density functional theory, we demonstrate that electronic shell filling and close atomic packing criteria can be used to design ultra-stable clusters. Interaction of these clusters with each other and with gas atoms is found to be weak confirming their chemical inertness. A crystal composed of these inert clusters is expected to have electronic properties that are markedly different from crystals where atoms are the building blocks. The recent observation of ferromagnetism in potassium clusters assembled in zeolite cages is discussed. (orig.)

  14. Performance criteria for graph clustering and Markov cluster experiments

    NARCIS (Netherlands)

    S. van Dongen

    2000-01-01

    In [1] a cluster algorithm for graphs was introduced called the Markov cluster algorithm or MCL algorithm. The algorithm is based on simulation of (stochastic) flow in graphs by means of alternation of two operators, expansion and inflation. The results in [2] establish an intrinsic
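
    The alternation of expansion and inflation described in this record can be sketched as follows. This is a minimal illustration, not the author's implementation: the function names `mcl` and `clusters_from`, the added self-loops, the default parameters and the convergence tolerance are all assumptions made for the example.

```python
import numpy as np

def mcl(adjacency, expansion=2, inflation=2.0, iterations=50, tol=1e-9):
    """Alternate expansion and inflation on a column-stochastic matrix."""
    a = np.asarray(adjacency, dtype=float)
    m = a + np.eye(a.shape[0])                 # self-loops: an assumed, commonly used convention
    m = m / m.sum(axis=0, keepdims=True)       # make every column sum to 1
    for _ in range(iterations):
        previous = m
        m = np.linalg.matrix_power(m, expansion)   # expansion: flow spreads along longer walks
        m = m ** inflation                          # inflation: elementwise power favours strong flow
        m = m / m.sum(axis=0, keepdims=True)        # rescale columns back to probabilities
        if np.abs(m - previous).max() < tol:
            break
    return m

def clusters_from(m, eps=1e-6):
    """Read clusters off the limit matrix: the nonzero columns of each nonzero row."""
    found = {frozenset(np.flatnonzero(row > eps).tolist()) for row in m if row.max() > eps}
    return sorted(sorted(c) for c in found)

# Two disconnected triangles: nodes 0-2 and nodes 3-5.
triangle = np.ones((3, 3)) - np.eye(3)
graph = np.block([[triangle, np.zeros((3, 3))], [np.zeros((3, 3)), triangle]])
limit = mcl(graph)
print(clusters_from(limit))
```

    In this sketch a larger `inflation` value sharpens the columns more aggressively and tends to give finer clusters, while `expansion` controls how far flow spreads between inflation steps.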

  15. Improving cluster-based missing value estimation of DNA microarray data.

    Science.gov (United States)

    Brás, Lígia P; Menezes, José C

    2007-06-01

    We present a modification of the weighted K-nearest neighbours imputation method (KNNimpute) for missing values (MVs) estimation in microarray data based on the reuse of estimated data. The method was called iterative KNN imputation (IKNNimpute) as the estimation is performed iteratively using the recently estimated values. The estimation efficiency of IKNNimpute was assessed under different conditions (data type, fraction and structure of missing data) by the normalized root mean squared error (NRMSE) and the correlation coefficients between estimated and true values, and compared with that of other cluster-based estimation methods (KNNimpute and sequential KNN). We further investigated the influence of imputation on the detection of differentially expressed genes using SAM by examining the differentially expressed genes that are lost after MV estimation. The performance measures give consistent results, indicating that the iterative procedure of IKNNimpute can enhance the prediction ability of cluster-based methods in the presence of high missing rates, in non-time series experiments and in data sets comprising both time series and non-time series data, because the information of the genes having MVs is used more efficiently and the iterative procedure allows refining the MV estimates. More importantly, IKNN has a smaller detrimental effect on the detection of differentially expressed genes.
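
    The idea of reusing recently estimated values can be sketched as below. This is a simplified illustration under stated assumptions, not the published IKNNimpute procedure: the row-mean initial fill, inverse-distance weights, the fixed iteration count, and the names `iterative_knn_impute` and `nrmse` are hypothetical, and the NRMSE shown normalizes by the variance of the true values, which is one common convention.

```python
import numpy as np

def iterative_knn_impute(data, k=2, iterations=3):
    """Fill NaNs from the k nearest rows, re-estimating with the latest filled matrix."""
    x = np.asarray(data, dtype=float)
    missing = np.isnan(x)
    # Assumed starting point: replace each missing value by the mean of its row.
    filled = np.where(missing, np.nanmean(x, axis=1, keepdims=True), x)
    for _ in range(iterations):
        updated = filled.copy()
        for i in np.flatnonzero(missing.any(axis=1)):
            d = np.linalg.norm(filled - filled[i], axis=1)  # distances use current estimates
            d[i] = np.inf                                   # a row is never its own neighbour
            nearest = np.argsort(d)[:k]
            weights = 1.0 / np.fmax(d[nearest], 1e-12)      # inverse-distance weights
            estimate = weights @ filled[nearest] / weights.sum()
            updated[i, missing[i]] = estimate[missing[i]]   # only missing entries change
        filled = updated
    return filled

def nrmse(estimated, true):
    """Root mean squared error normalized by the variance of the true values."""
    return np.sqrt(np.mean((estimated - true) ** 2) / np.var(true))

expression = np.array([[1.0, 2.0, 3.0],
                       [1.0, 2.0, np.nan],
                       [10.0, 20.0, 30.0],
                       [1.1, 2.1, 3.1]])
completed = iterative_knn_impute(expression)
```

    Observed entries are left untouched; only the positions that were missing are re-estimated on each pass.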

  16. Cluster decay half-lives of trans-lead nuclei based on a finite-range nucleon–nucleon interaction

    Energy Technology Data Exchange (ETDEWEB)

    Adel, A., E-mail: aa.ahmed@mu.edu.sa [Physics Department, Faculty of Science, Cairo University, Giza (Egypt); Physics Department, College of Science, Majmaah University, Zulfi (Saudi Arabia)]; Alharbi, T. [Physics Department, College of Science, Majmaah University, Zulfi (Saudi Arabia)]

    2017-02-15

    Nuclear cluster radioactivity is investigated using microscopic potentials in the framework of the Wentzel–Kramers–Brillouin approximation of quantum tunneling by considering the Bohr–Sommerfeld quantization condition. The microscopic cluster–daughter potential is numerically constructed in the well-established double-folding model. A realistic M3Y-Paris NN interaction with the finite-range exchange part as well as the ordinary zero-range exchange NN force is considered in the present work. The influence of nuclear deformations on the cluster decay half-lives is investigated. Based on the available experimental data, the cluster preformation factors are extracted from the calculated and the measured half lives of cluster radioactivity. Some useful predictions of cluster emission half-lives are made for emissions of known clusters from possible candidates, which may guide future experiments.

  17. Cluster Ion Implantation in Graphite and Diamond

    DEFF Research Database (Denmark)

    Popok, Vladimir

    2014-01-01

    Cluster ion beam technique is a versatile tool which can be used for controllable formation of nanosize objects as well as modification and processing of surfaces and shallow layers on an atomic scale. The current paper presents an overview and analysis of data obtained on a few sets of graphite...... and diamond samples implanted by keV-energy size-selected cobalt and argon clusters. One of the emphases is put on pinning of metal clusters on graphite with a possibility of following selective etching of graphene layers. The other topic of concern is related to the development of scaling law for cluster...... implantation. Implantation of cobalt and argon clusters into two different allotropic forms of carbon, namely, graphite and diamond is analysed and compared in order to approach universal theory of cluster stopping in matter....

  18. A Spectrum Sensing Method Based on Signal Feature and Clustering Algorithm in Cognitive Wireless Multimedia Sensor Networks

    Directory of Open Access Journals (Sweden)

    Yongwei Zhang

    2017-01-01

    In order to solve the problem of difficulty in determining the threshold in spectrum sensing technologies based on the random matrix theory, a spectrum sensing method based on clustering algorithm and signal feature is proposed for Cognitive Wireless Multimedia Sensor Networks. Firstly, the wireless communication signal features are obtained according to the sampling signal covariance matrix. Then, the clustering algorithm is used to classify and test the signal features. Different signal features and clustering algorithms are compared in this paper. The experimental results show that the proposed method has better sensing performance.

  19. Detecting space-time cancer clusters using residential histories

    Science.gov (United States)

    Jacquez, Geoffrey M.; Meliker, Jaymie R.

    2007-04-01

    Methods for analyzing geographic clusters of disease typically ignore the space-time variability inherent in epidemiologic datasets, do not adequately account for known risk factors (e.g., smoking and education) or covariates (e.g., age, gender, and race), and do not permit investigation of the latency window between exposure and disease. Our research group recently developed Q-statistics for evaluating space-time clustering in cancer case-control studies with residential histories. This technique relies on time-dependent nearest neighbor relationships to examine clustering at any moment in the life-course of the residential histories of cases relative to that of controls. In addition, in place of the widely used null hypothesis of spatial randomness, each individual's probability of being a case is instead based on his/her risk factors and covariates. Case-control clusters will be presented using residential histories of 220 bladder cancer cases and 440 controls in Michigan. In preliminary analyses of this dataset, smoking, age, gender, race and education were sufficient to explain the majority of the clustering of residential histories of the cases. Clusters of unexplained risk, however, were identified surrounding the business address histories of 10 industries that emit known or suspected bladder cancer carcinogens. The clustering of 5 of these industries began in the 1970's and persisted through the 1990's. This systematic approach for evaluating space-time clustering has the potential to generate novel hypotheses about environmental risk factors. These methods may be extended to detect differences in space-time patterns of any two groups of people, making them valuable for security intelligence and surveillance operations.

  20. GIS-based Approaches to Catchment Area Analyses of Mass Transit

    DEFF Research Database (Denmark)

    Andersen, Jonas Lohmann Elkjær; Landex, Alex

    2009-01-01

    Catchment area analyses of stops or stations are used to investigate potential number of travelers to public transportation. These analyses are considered a strong decision tool in the planning process of mass transit especially railroads. Catchment area analyses are GIS-based buffer and overlay...... analyses with different approaches depending on the desired level of detail. A simple but straightforward approach to implement is the Circular Buffer Approach where catchment areas are circular. A more detailed approach is the Service Area Approach where catchment areas are determined by a street network...... search to simulate the actual walking distances. A refinement of the Service Area Approach is to implement additional time resistance in the network search to simulate obstacles in the walking environment. This paper reviews and compares the different GIS-based catchment area approaches, their level...
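
    The difference between the Circular Buffer Approach and the Service Area Approach can be sketched with a toy example. This is a minimal illustration rather than a GIS workflow: the coordinates, the street graph, the 600 m threshold and the function names `circular_buffer` and `service_area` are all hypothetical.

```python
import heapq
from math import hypot

def circular_buffer(stop, addresses, radius):
    """Circular Buffer Approach: addresses within straight-line distance of the stop."""
    return [name for name, (x, y) in addresses.items()
            if hypot(x - stop[0], y - stop[1]) <= radius]

def service_area(graph, source, max_distance):
    """Service Area Approach: nodes reachable within max_distance along the street network."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbour, length in graph.get(node, []):
            candidate = dist + length
            if candidate <= max_distance and candidate < best.get(neighbour, float("inf")):
                best[neighbour] = candidate
                heapq.heappush(queue, (candidate, neighbour))
    return best

# Hypothetical addresses (metres from the stop) and walking network.
addresses = {"a": (300, 0), "b": (0, 500), "c": (700, 0)}
streets = {
    "stop": [("a", 300), ("b", 900)],   # b is 500 m away in a straight line but 900 m on foot
    "a": [("stop", 300), ("c", 400)],
    "b": [("stop", 900)],
    "c": [("a", 400)],
}
inside_circle = circular_buffer((0, 0), addresses, 600)
inside_network = service_area(streets, "stop", 600)
```

    Here address `b` falls inside the circular buffer but outside the network-based service area, which is the kind of overestimate a street network search is meant to correct. Extra time resistance for obstacles could be modelled in this sketch by increasing the edge lengths.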

  1. Analyser-based phase contrast image reconstruction using geometrical optics.

    Science.gov (United States)

    Kitchen, M J; Pavlov, K M; Siu, K K W; Menk, R H; Tromba, G; Lewis, R A

    2007-07-21

    Analyser-based phase contrast imaging can provide radiographs of exceptional contrast at high resolution (geometrical optics are satisfied. Analytical phase retrieval can be performed by fitting the analyser rocking curve with a symmetric Pearson type VII function. The Pearson VII function provided at least a 10% better fit to experimentally measured rocking curves than linear or Gaussian functions. A test phantom, a hollow nylon cylinder, was imaged at 20 keV using a Si(1 1 1) analyser at the ELETTRA synchrotron radiation facility. Our phase retrieval method yielded a more accurate object reconstruction than methods based on a linear fit to the rocking curve. Where reconstructions failed to map expected values, calculations of the Takagi number permitted distinction between the violation of the geometrical optics conditions and the failure of curve fitting procedures. The need for synchronized object/detector translation stages was removed by using a large, divergent beam and imaging the object in segments. Our image acquisition and reconstruction procedure enables quantitative phase retrieval for systems with a divergent source and accounts for imperfections in the analyser.
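
    For reference, a symmetric Pearson type VII profile of the kind used to fit a rocking curve can be written as below. This is a sketch using one common parameterization (amplitude, centre, half width at half maximum and shape exponent `m`); the parameter names are assumptions and the exact form used in the paper may differ.

```python
def pearson_vii(theta, amplitude, center, half_width, m):
    """Symmetric Pearson type VII profile; half_width is the half width at half maximum."""
    scaled = ((theta - center) / half_width) ** 2
    return amplitude * (1.0 + scaled * (2.0 ** (1.0 / m) - 1.0)) ** (-m)

peak = pearson_vii(0.0, 2.0, 0.0, 1.5, 3.0)          # value at the centre
half = pearson_vii(1.5, 2.0, 0.0, 1.5, 3.0)          # value one half width from the centre
lorentz = pearson_vii(2.0, 1.0, 0.0, 1.0, 1.0)       # m = 1 reduces to a Lorentzian
```

    With this parameterization the profile equals the amplitude at the centre and half the amplitude one half width away, `m = 1` gives a Lorentzian, and increasing `m` moves the shape towards a Gaussian.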

  2. A genomics based discovery of secondary metabolite biosynthetic gene clusters in Aspergillus ustus.

    Directory of Open Access Journals (Sweden)

    Borui Pi

    Secondary metabolites (SMs) produced by Aspergillus have been extensively studied for their crucial roles in human health, medicine and industrial production. However, the resulting information is almost exclusively derived from a few model organisms, including A. nidulans and A. fumigatus, but little is known about rare pathogens. In this study, we performed a genomics based discovery of SM biosynthetic gene clusters in Aspergillus ustus, a rare human pathogen. A total of 52 gene clusters were identified in the draft genome of A. ustus 3.3904, such as the sterigmatocystin biosynthesis pathway that was commonly found in Aspergillus species. In addition, several SM biosynthetic gene clusters were firstly identified in Aspergillus that were possibly acquired by horizontal gene transfer, including the vrt cluster that is responsible for viridicatumtoxin production. Comparative genomics revealed that A. ustus shared the largest number of SM biosynthetic gene clusters with A. nidulans, but much fewer with other Aspergilli like A. niger and A. oryzae. These findings would help to understand the diversity and evolution of SM biosynthesis pathways in genus Aspergillus, and we hope they will also promote the development of fungal identification methodology in clinic.

  3. Investigation on IMCP based clustering in LTE-M communication for smart metering applications

    Directory of Open Access Journals (Sweden)

    Kartik Vishal Deshpande

    2017-06-01

    Machine to Machine (M2M) is foreseen as an emerging technology for smart metering applications where devices communicate seamlessly for information transfer. The M2M communication makes use of long term evolution (LTE) as its backbone network, which results in a long-term evolution for machine type communication (LTE-M) network. As a huge number of M2M devices must be handled by a single eNB (evolved Node B), clustering is exploited for efficient processing of the network. This paper investigates the proposed Improved M2M Clustering Process (IMCP) based clustering technique and compares it with two well-known clustering algorithms, namely, Low Energy Adaptive Clustering Hierarchical (LEACH) and Energy Aware Multihop Multipath Hierarchical (EAMMH) techniques. Further, the IMCP algorithm is analyzed with two-tier and three-tier M2M systems for various mobility conditions. The proposed IMCP algorithm improves the last node death by 63.15% and 51.61% as compared to LEACH and EAMMH, respectively. Further, the average energy of each node in IMCP is increased by 89.85% and 81.15%, as compared to LEACH and EAMMH, respectively.

  4. A Genomics Based Discovery of Secondary Metabolite Biosynthetic Gene Clusters in Aspergillus ustus

    Science.gov (United States)

    Pi, Borui; Yu, Dongliang; Dai, Fangwei; Song, Xiaoming; Zhu, Congyi; Li, Hongye; Yu, Yunsong

    2015-01-01

    Secondary metabolites (SMs) produced by Aspergillus have been extensively studied for their crucial roles in human health, medicine and industrial production. However, the resulting information is almost exclusively derived from a few model organisms, including A. nidulans and A. fumigatus, but little is known about rare pathogens. In this study, we performed a genomics based discovery of SM biosynthetic gene clusters in Aspergillus ustus, a rare human pathogen. A total of 52 gene clusters were identified in the draft genome of A. ustus 3.3904, such as the sterigmatocystin biosynthesis pathway that was commonly found in Aspergillus species. In addition, several SM biosynthetic gene clusters were firstly identified in Aspergillus that were possibly acquired by horizontal gene transfer, including the vrt cluster that is responsible for viridicatumtoxin production. Comparative genomics revealed that A. ustus shared the largest number of SM biosynthetic gene clusters with A. nidulans, but much fewer with other Aspergilli like A. niger and A. oryzae. These findings would help to understand the diversity and evolution of SM biosynthesis pathways in genus Aspergillus, and we hope they will also promote the development of fungal identification methodology in clinic. PMID:25706180

  5. Hybrid clustering based fuzzy structure for vibration control - Part 1: A novel algorithm for building neuro-fuzzy system

    Science.gov (United States)

    Nguyen, Sy Dzung; Nguyen, Quoc Hung; Choi, Seung-Bok

    2015-01-01

    This paper presents a new algorithm, called B-ANFIS, for building an adaptive neuro-fuzzy inference system (ANFIS) from a training data set. In order to increase accuracy of the model, the following issues are executed. Firstly, a data merging rule is proposed to build and perform a data-clustering strategy. Subsequently, a combination of clustering processes in the input data space and in the joint input-output data space is presented. The main purpose of this step is to overcome problems related to initialization and contradictory fuzzy rules, which usually happen when building ANFIS. The clustering process in the input data space is accomplished based on a proposed merging-possibilistic clustering (MPC) algorithm. The effectiveness of this process is evaluated to resume a clustering process in the joint input-output data space. The optimal parameters obtained after completion of the clustering process are used to build ANFIS. Simulations based on a numerical data set, 'Daily Data of Stock A', and measured data sets of a smart damper are performed to analyze and estimate accuracy. In addition, convergence and robustness of the proposed algorithm are investigated based on both theoretical and testing approaches.

  6. Researches on the Security of Cluster-based Communication Protocol for Wireless Sensor Networks

    Directory of Open Access Journals (Sweden)

    Yanhong Sun

    2014-08-01

    Along with the in-depth application of sensor networks, the security issues have gradually become the bottleneck of wireless sensor applications. Providing a security scheme is a common concern not only of researchers but also of providers, integrators and users of wireless sensor networks. Based on this demand, this paper focuses on the research of strengthening the security of cluster-based wireless sensor networks. Based on the systematic analysis of the clustering protocol and its security enhancement scheme, the paper introduces the broadcast authentication scheme, and proposes an SA-LEACH network security enhancement protocol. The performance analysis and simulation experiments prove that the protocol consumes less energy with the same security requirements, and when the base station is comparatively far from the network deployment area, it is more advantageous in terms of energy consumption and is more suitable for wireless sensor networks.

  7. Ethical implications of excessive cluster sizes in cluster randomised trials.

    Science.gov (United States)

    Hemming, Karla; Taljaard, Monica; Forbes, Gordon; Eldridge, Sandra M; Weijer, Charles

    2018-02-20

    The cluster randomised trial (CRT) is commonly used in healthcare research. It is the gold-standard study design for evaluating healthcare policy interventions. A key characteristic of this design is that as more participants are included, in a fixed number of clusters, the increase in achievable power will level off. CRTs with cluster sizes that exceed the point of levelling-off will have excessive numbers of participants, even if they do not achieve nominal levels of power. Excessively large cluster sizes may have ethical implications due to exposing trial participants unnecessarily to the burdens of both participating in the trial and the potential risks of harm associated with the intervention. We explore these issues through the use of two case studies. Where data are routinely collected, available at minimum cost and the intervention poses low risk, the ethical implications of excessively large cluster sizes are likely to be low (case study 1). However, to maximise the social benefit of the study, identification of excessive cluster sizes can allow for prespecified and fully powered secondary analyses. In the second case study, while there is no burden through trial participation (because the outcome data are routinely collected and non-identifiable), the intervention might be considered to pose some indirect risk to patients and risks to the healthcare workers. In this case study it is therefore important that the inclusion of excessively large cluster sizes is justifiable on other grounds (perhaps to show sustainability). In any randomised controlled trial, including evaluations of health policy interventions, it is important to minimise the burdens and risks to participants. Funders, researchers and research ethics committees should be aware of the ethical issues of excessively large cluster sizes in cluster trials.
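
    The levelling-off of power with growing cluster size can be illustrated with the standard design effect, 1 + (m - 1) x ICC, for clusters of equal size m. The sketch below is an approximation under stated assumptions (two-arm parallel design, equal cluster sizes, a standardized effect size, a normal approximation); the function names and the numbers are hypothetical and are not taken from the article's case studies.

```python
from statistics import NormalDist

def design_effect(cluster_size, icc):
    """Variance inflation due to clustering, for equal cluster sizes."""
    return 1.0 + (cluster_size - 1) * icc

def crt_power(clusters_per_arm, cluster_size, icc, effect_size, alpha=0.05):
    """Approximate power of a two-arm parallel CRT for a standardized effect size."""
    effective_n = clusters_per_arm * cluster_size / design_effect(cluster_size, icc)
    standard_error = (2.0 / effective_n) ** 0.5
    z = NormalDist()
    return z.cdf(effect_size / standard_error - z.inv_cdf(1 - alpha / 2))

# Hypothetical trial: 10 clusters per arm, ICC 0.05, standardized effect 0.3.
for size in (20, 100, 1000, 10000):
    print(size, round(crt_power(10, size, 0.05, 0.3), 3))
```

    With the number of clusters fixed, the effective sample size per arm can never exceed the number of clusters divided by the ICC, so in this sketch going from 1,000 to 10,000 participants per cluster changes the power by less than one percentage point.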

  8. Refined tropical curve counts and canonical bases for quantum cluster algebras

    DEFF Research Database (Denmark)

    Mandel, Travis

    We express the (quantizations of the) Gross-Hacking-Keel-Kontsevich canonical bases for cluster algebras in terms of certain (Block-Göttsche) weighted counts of tropical curves. In the process, we obtain via scattering diagram techniques a new invariance result for these Block-Göttsche counts....

  9. Cluster-based centralized data fusion for tracking maneuvering ...

    Indian Academy of Sciences (India)

    R. Narasimhan (Krishtel eMaging) 1461 1996 Oct 15 13:05:22

    In this scheme, measurements are sent to the data fusion centre where the mea- ... using 'clusters' (a cluster by definition is a type of parallel or distributed processing ... working together as a single, integrated computing resource) is proposed.

  10. An improved clustering algorithm based on reverse learning in intelligent transportation

    Science.gov (United States)

    Qiu, Guoqing; Kou, Qianqian; Niu, Ting

    2017-05-01

    With the development of artificial intelligence and data mining technology, big data has gradually entered people's field of vision. In the process of dealing with large data, clustering is an important processing method. The reverse learning method is introduced into the clustering process of the PAM clustering algorithm to reduce the limitations of one-time clustering in unsupervised clustering learning and to increase the diversity of clusters, so as to improve the quality of clustering. The algorithm analysis and experimental results show that the algorithm is feasible.

  11. A THREE-STEP SPATIAL-TEMPORAL-SEMANTIC CLUSTERING METHOD FOR HUMAN ACTIVITY PATTERN ANALYSIS

    Directory of Open Access Journals (Sweden)

    W. Huang

    2016-06-01

    How people move in cities and what they do in various locations at different times form human activity patterns. Human activity pattern plays a key role in urban planning, traffic forecasting, public health and safety, emergency response, friend recommendation, and so on. Therefore, scholars from different fields, such as social science, geography, transportation, physics and computer science, have made great efforts in modelling and analysing human activity patterns or human mobility patterns. One of the essential tasks in such studies is to find the locations or places where individuals stay to perform some kind of activities before further activity pattern analysis. In the era of Big Data, the emergence of social media along with wearable devices enables human activity data to be collected more easily and efficiently. Furthermore, the dimension of the accessible human activity data has been extended from two to three (space or space-time) to four dimensions (space, time and semantics). More specifically, not only a location and time that people stay and spend are collected, but also what people “say” in a location at a time can be obtained. The characteristics of these datasets shed new light on the analysis of human mobility, where new methodologies should be accordingly developed to handle them. Traditional methods such as neural networks, statistics and clustering have been applied to study human activity patterns using geosocial media data. Among them, clustering methods have been widely used to analyse spatiotemporal patterns. However, to the best of our knowledge, few clustering algorithms are specifically developed for handling the datasets that contain spatial, temporal and semantic aspects all together. In this work, we propose a three-step human activity clustering method based on space, time and semantics to fill this gap. One-year Twitter data, posted in Toronto, Canada, is used to test the clustering-based method. The

  12. An Efficient MapReduce-Based Parallel Clustering Algorithm for Distributed Traffic Subarea Division

    Directory of Open Access Journals (Sweden)

    Dawen Xia

    2015-01-01

    Traffic subarea division is vital for traffic system management and traffic network analysis in intelligent transportation systems (ITSs). Since existing methods may not be suitable for big traffic data processing, this paper presents a MapReduce-based Parallel Three-Phase K-Means (Par3PKM) algorithm for solving traffic subarea division problem on a widely adopted Hadoop distributed computing platform. Specifically, we first modify the distance metric and initialization strategy of K-Means and then employ a MapReduce paradigm to redesign the optimized K-Means algorithm for parallel clustering of large-scale taxi trajectories. Moreover, we propose a boundary identifying method to connect the borders of clustering results for each cluster. Finally, we divide traffic subarea of Beijing based on real-world trajectory data sets generated by 12,000 taxis in a period of one month using the proposed approach. Experimental evaluation results indicate that when compared with K-Means, Par2PK-Means, and ParCLARA, Par3PKM achieves higher efficiency, more accuracy, and better scalability and can effectively divide traffic subarea with big taxi trajectory data.
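
    How a K-Means iteration maps onto the MapReduce paradigm can be sketched in plain Python. This is an illustration of standard K-Means with squared Euclidean distance, not the Par3PKM algorithm: it omits the modified distance metric, the initialization strategy and the boundary identification described above, it does not use Hadoop, and the function names are hypothetical.

```python
from collections import defaultdict

def map_point(point, centroids):
    """Map step: emit (index of nearest centroid, (point, 1))."""
    distances = [sum((p - c) ** 2 for p, c in zip(point, centroid))
                 for centroid in centroids]
    return distances.index(min(distances)), (point, 1)

def reduce_cluster(values):
    """Reduce step: average all points that were assigned to one centroid."""
    total = [0.0] * len(values[0][0])
    count = 0
    for point, n in values:
        total = [t + p for t, p in zip(total, point)]
        count += n
    return tuple(t / count for t in total)

def kmeans_iteration(points, centroids):
    """One K-Means iteration; the dictionary stands in for the shuffle phase."""
    grouped = defaultdict(list)
    for point in points:
        key, value = map_point(point, centroids)
        grouped[key].append(value)
    # A centroid that attracted no points is kept unchanged.
    return [reduce_cluster(grouped[i]) if i in grouped else tuple(centroids[i])
            for i in range(len(centroids))]

points = [(0, 0), (0, 2), (10, 10), (10, 12)]
updated = kmeans_iteration(points, [(1, 1), (9, 9), (100, 100)])
```

    Because each map call needs only one point and the current centroids, the map step can run independently on each data partition, and only per-cluster sums and counts need to be moved to the reducers.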

  13. A survey of energy conservation mechanisms for dynamic cluster based wireless sensor networks

    International Nuclear Information System (INIS)

    Enam, R.N.; Tahir, M.; Ahmed, S.; Qureshi, R.

    2018-01-01

    WSN (Wireless Sensor Network) is an emerging technology that has unlimited potential for numerous application areas including military, crisis management, environmental, transportation, medical, home/city automations and smart spaces. But the energy-constrained nature of WSNs necessitates that their architecture and communication protocols be designed in an energy-aware manner. Sensor data collection through clustering mechanisms has become a common strategy in WSN. This paper presents a survey report on the major perspectives with which energy conservation mechanisms have been proposed in dynamic cluster based WSNs so far. All the solutions discussed in this paper focus on the cluster based protocols only. We have covered a vast scale of existing energy efficient protocols and have categorized them in six categories. In the beginning of this paper the fundamentals of the energy constraint issues of WSNs have been discussed and an overview of the causes of energy consumptions at all layers of WSN has been given. Later in this paper several previously proposed energy efficient protocols of WSNs are presented. (author)

  14. A Survey of Energy Conservation Mechanisms for Dynamic Cluster Based Wireless Sensor Networks

    Directory of Open Access Journals (Sweden)

    Rabia Noor Enam

    2018-04-01

    WSN (Wireless Sensor Network) is an emerging technology that has unlimited potential for numerous application areas including military, crisis management, environmental, transportation, medical, home/city automations and smart spaces. But the energy-constrained nature of WSNs necessitates that their architecture and communication protocols be designed in an energy-aware manner. Sensor data collection through clustering mechanisms has become a common strategy in WSN. This paper presents a survey report on the major perspectives with which energy conservation mechanisms have been proposed in dynamic cluster based WSNs so far. All the solutions discussed in this paper focus on the cluster based protocols only. We have covered a vast scale of existing energy efficient protocols and have categorized them in six categories. In the beginning of this paper the fundamentals of the energy constraint issues of WSNs have been discussed and an overview of the causes of energy consumptions at all layers of WSN has been given. Later in this paper several previously proposed energy efficient protocols of WSNs are presented.

  15. A LOOP-BASED APPROACH IN CLUSTERING AND ROUTING IN MOBILE AD HOC NETWORKS

    Institute of Scientific and Technical Information of China (English)

    Li Yanping; Wang Xin; Xue Xiangyang; C.K. Toh

    2006-01-01

    Although clustering is a convenient framework to enable traffic control and service support in Mobile Ad hoc NETworks (MANETs), it is seldom adopted in practice due to the additional traffic overhead it leads to for the resource limited ad hoc network. In order to address this problem, we proposed a loop-based approach to combine clustering and routing. By employing loop topologies, topology information is disseminated with a loop instead of a single node, which provides better robustness, and the nature of a loop that there are two paths between each pair of nodes within a loop suggests smart route recovery strategy. Our approach is composed of setup procedure, regular procedure and recovery procedure to achieve clustering, routing and emergent route recovering.

  16. α/β-particle radiation identification based on fuzzy C-means clustering

    International Nuclear Information System (INIS)

    Yang Yijianxia; Yang Lu; Li Wenqiang

    2013-01-01

    A pulse shape recognition method based on fuzzy C-means clustering for the discrimination of α/β particles was presented. A detection circuit to isolate α/β particles is designed, and a single probe scintillating detector is used to acquire α/β particles. Comparison with pulse amplitude analysis shows that, with fuzzy C-means clustering, the α-particle count rate increased by 42.9% and the α-β cross-talk ratio decreased by 15.9% for the 6190 cps 0420 α source; the β-particle count rate increased by 31.8% and the β-α cross-talk ratio decreased by 7.7% for the 05-05 β source. (authors)
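
    Fuzzy C-means itself can be sketched in a few lines. The example below is a generic implementation of the standard update rules, not the authors' pulse-shape pipeline: the two-dimensional toy features, the fuzzifier `m = 2`, the random membership initialization and the name `fuzzy_c_means` are assumptions made for illustration.

```python
import numpy as np

def fuzzy_c_means(x, n_clusters, m=2.0, iterations=100, tol=1e-6, seed=0):
    """Standard fuzzy C-means: returns cluster centers and the membership matrix."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    u = rng.random((x.shape[0], n_clusters))
    u /= u.sum(axis=1, keepdims=True)                # memberships of each sample sum to 1
    for _ in range(iterations):
        w = u ** m
        centers = (w.T @ x) / w.sum(axis=0)[:, None]  # membership-weighted means
        d = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)                         # avoid division by zero
        inv = d ** (-2.0 / (m - 1.0))
        new_u = inv / inv.sum(axis=1, keepdims=True)  # closer centers get larger membership
        converged = np.abs(new_u - u).max() < tol
        u = new_u
        if converged:
            break
    return centers, u

# Hypothetical pulse features (for example, two shape parameters per pulse).
pulses = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                   [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])
centers, memberships = fuzzy_c_means(pulses, n_clusters=2)
labels = memberships.argmax(axis=1)
```

    Unlike a hard threshold on pulse amplitude, each pulse receives a degree of membership in every class, so ambiguous pulses can be identified by memberships close to 0.5 before a final label is assigned.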

  17. Substructure in clusters of galaxies

    International Nuclear Information System (INIS)

    Fitchett, M.J.

    1988-01-01

    Optical observations suggesting the existence of substructure in clusters of galaxies are examined. Models of cluster formation and methods used to detect substructure in clusters are reviewed. Consideration is given to classification schemes based on a departure of bright cluster galaxies from a spherically symmetric distribution, evidence for statistically significant substructure, and various types of substructure, including velocity, spatial, and spatial-velocity substructure. The substructure observed in the galaxy distribution in clusters is discussed, focusing on observations from general cluster samples, the Virgo cluster, the Hydra cluster, Centaurus, the Coma cluster, and the Cancer cluster. 88 refs

  18. A Novel Wireless Power Transfer-Based Weighed Clustering Cooperative Spectrum Sensing Method for Cognitive Sensor Networks.

    Science.gov (United States)

    Liu, Xin

    2015-10-30

    In a cognitive sensor network (CSN), the wastage of sensing time and energy is a challenge to cooperative spectrum sensing, when the number of cooperative cognitive nodes (CNs) becomes very large. In this paper, a novel wireless power transfer (WPT)-based weighed clustering cooperative spectrum sensing model is proposed, which divides all the CNs into several clusters, and then selects the most favorable CNs as the cluster heads and allows the common CNs to transfer the received radio frequency (RF) energy of the primary node (PN) to the cluster heads, in order to supply the electrical energy needed for sensing and cooperation. A joint resource optimization is formulated to maximize the spectrum access probability of the CSN, through jointly allocating sensing time and clustering number. According to the resource optimization results, a clustering algorithm is proposed. The simulation results have shown that compared to the traditional model, the cluster heads of the proposed model can achieve more transmission power and there exists optimal sensing time and clustering number to maximize the spectrum access probability.

  19. A Novel Wireless Power Transfer-Based Weighed Clustering Cooperative Spectrum Sensing Method for Cognitive Sensor Networks

    Directory of Open Access Journals (Sweden)

    Xin Liu

    2015-10-01

    In a cognitive sensor network (CSN), the wastage of sensing time and energy is a challenge to cooperative spectrum sensing, when the number of cooperative cognitive nodes (CNs) becomes very large. In this paper, a novel wireless power transfer (WPT)-based weighed clustering cooperative spectrum sensing model is proposed, which divides all the CNs into several clusters, and then selects the most favorable CNs as the cluster heads and allows the common CNs to transfer the received radio frequency (RF) energy of the primary node (PN) to the cluster heads, in order to supply the electrical energy needed for sensing and cooperation. A joint resource optimization is formulated to maximize the spectrum access probability of the CSN, through jointly allocating sensing time and clustering number. According to the resource optimization results, a clustering algorithm is proposed. The simulation results have shown that compared to the traditional model, the cluster heads of the proposed model can achieve more transmission power and there exists optimal sensing time and clustering number to maximize the spectrum access probability.

  20. Feasibility Study of Parallel Finite Element Analysis on Cluster-of-Clusters

    Science.gov (United States)

    Muraoka, Masae; Okuda, Hiroshi

    With the rapid growth of WAN infrastructure and development of Grid middleware, it has become a realistic and attractive methodology to connect cluster machines over a wide-area network for the execution of computation-demanding applications. Many existing parallel finite element (FE) applications have been, however, designed and developed with a single computing resource in mind, since such applications require frequent synchronization and communication among processes. There have been few FE applications that can exploit the distributed environment so far. In this study, we explore the feasibility of FE applications on the cluster-of-clusters. First, we classify FE applications into two types, tightly coupled applications (TCA) and loosely coupled applications (LCA), based on their communication pattern. A prototype of each application is implemented on the cluster-of-clusters. We perform numerical experiments executing TCA and LCA on both the cluster-of-clusters and a single cluster. Through these experiments, by comparing the performance and communication cost in each case, we evaluate the feasibility of FEA on the cluster-of-clusters.

  1. Interplay between experiments and calculations for organometallic clusters and caged clusters

    International Nuclear Information System (INIS)

    Nakajima, Atsushi

    2015-01-01

    Clusters consisting of 10-1000 atoms exhibit size-dependent electronic and geometric properties. In particular, composite clusters consisting of several elements and/or components provide a promising way for a bottom-up approach for designing functional advanced materials, because the functionality of the composite clusters can be optimized not only by the cluster size but also by their compositions. In the formation of composite clusters, their geometric symmetry and dimensionality are emphasized to control the physical and chemical properties, because selective and anisotropic enhancements for optical, chemical, and magnetic properties can be expected. Organometallic clusters and caged clusters are demonstrated as a representative example of designing the functionality of the composite clusters. Organometallic vanadium-benzene forms a one dimensional sandwich structure showing ferromagnetic behaviors and anomalously large HOMO-LUMO gap differences of two spin orbitals, which can be regarded as spin-filter components for cluster-based spintronic devices. Caged clusters of aluminum (Al) are well stabilized both geometrically and electronically at Al12X, behaving as a “superatom”.

  2. The smart cluster method. Adaptive earthquake cluster identification and analysis in strong seismic regions

    Science.gov (United States)

    Schaefer, Andreas M.; Daniell, James E.; Wenzel, Friedemann

    2017-07-01

    Earthquake clustering is an essential part of almost any statistical analysis of spatial and temporal properties of seismic activity. The nature of earthquake clusters and subsequent declustering of earthquake catalogues plays a crucial role in determining the magnitude-dependent earthquake return period and its respective spatial variation for probabilistic seismic hazard assessment. This study introduces the Smart Cluster Method (SCM), a new methodology to identify earthquake clusters, which uses an adaptive point process for spatio-temporal cluster identification. It utilises the magnitude-dependent spatio-temporal earthquake density to adjust the search properties, subsequently analyses the identified clusters to determine directional variation and adjusts its search space with respect to directional properties. In the case of rapid subsequent ruptures like the 1992 Landers sequence or the 2010-2011 Darfield-Christchurch sequence, a reclassification procedure is applied to disassemble subsequent ruptures using near-field searches, nearest neighbour classification and temporal splitting. The method is capable of identifying and classifying earthquake clusters in space and time. It has been tested and validated using earthquake data from California and New Zealand. A total of more than 1500 clusters have been found in both regions since 1980 with M_min = 2.0. Utilising the knowledge of cluster classification, the method has been adjusted to provide an earthquake declustering algorithm, which has been compared to existing methods. Its performance is comparable to established methodologies. The analysis of earthquake clustering statistics led to various new and updated correlation functions, e.g. for ratios between mainshock and strongest aftershock and general aftershock activity metrics.

  3. CLUSTER ANALYSIS OF TOTAL ASSETS PROVIDED BY BANKS FROM FOUR CONTINENTS

    Directory of Open Access Journals (Sweden)

    MIRELA CĂTĂLINA TÜRKEȘ

    2017-08-01

    The paper analysed the total assets achieved in 2016 by the 96 strongest banks from 4 continents: Europe, America, Asia and Africa. It aims to evaluate the level of total assets provided by banks in 2016 and the degree of differentiation of continental banking markets, in order to determine the overall conditions of the banks. The methodologies used in this study are based on cluster and descriptive analysis. The data set was built from information reported by banks on total assets. The results indicate that most of the total banking assets are found in Asia and the fewest in Africa. At the end of 2016, the top 16 global banks, which form cluster 1 in the data set, owned total assets of $30.19 trillion, and the cluster centroid was (2.25, 2.11, 3.06, 0.01).

  4. SU-E-J-98: Radiogenomics: Correspondence Between Imaging and Genetic Features Based On Clustering Analysis

    International Nuclear Information System (INIS)

    Harmon, S; Wendelberger, B; Jeraj, R

    2014-01-01

    Purpose: Radiogenomics aims to establish relationships between patient genotypes and imaging phenotypes. An open question remains on how best to integrate information from these distinct datasets. This work investigates if similarities in genetic features across patients correspond to similarities in PET-imaging features, assessed with various clustering algorithms. Methods: [18F]FDG PET data was obtained for 26 NSCLC patients from a public database (TCIA). Tumors were contoured using an in-house segmentation algorithm combining gradient and region-growing techniques; resulting ROIs were used to extract 54 PET-based features. Corresponding genetic microarray data containing 48,778 elements were also obtained for each tumor. Given the mismatch in feature sizes, two dimension reduction techniques were also applied to the genetic data: principal component analysis (PCA) and selective filtering of 25 NSCLC-associated genes-of-interest (GOI). Gene datasets (full, PCA, and GOI) and PET feature datasets were independently clustered using K-means and hierarchical clustering with a variable number of clusters (K). The Jaccard Index (JI) was used to score similarity of cluster assignments across different datasets. Results: Patient clusters from imaging data showed poor similarity to clusters from gene datasets, regardless of clustering algorithm or number of clusters (mean JI = 0.3429±0.1623). Notably, we found clustering algorithms had different sensitivities to data reduction techniques. Using hierarchical clustering, the PCA dataset showed perfect cluster agreement with the full-gene set (JI = 1) for all values of K, and the agreement between the GOI set and the full-gene set decreased as the number of clusters increased (JI = 0.9231 and 0.5769 for K = 2 and 5, respectively). K-means clustering assignments were highly sensitive to data reduction and showed poor stability for different values of K (JI range: 0.2301–1). Conclusion: Using commonly-used clustering algorithms, we found
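
    The abstract above scores agreement between cluster assignments with a Jaccard Index but does not say how the index was computed. The sketch below assumes the common pair-counting form: the number of patient pairs placed together in both clusterings, divided by the number of pairs placed together in at least one. The function name `pair_jaccard` and the toy labels are illustrative, not taken from the record.

```python
from itertools import combinations


def pair_jaccard(labels_a, labels_b):
    """Pair-counting Jaccard index between two cluster assignments.

    Assumption: this is one common definition; the record does not
    state which variant the authors used.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both assignments must cover the same items")
    together_in_both = 0
    together_in_either = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        if same_a and same_b:
            together_in_both += 1
        if same_a or same_b:
            together_in_either += 1
    # If no pair is co-clustered anywhere, the two partitions agree trivially.
    return together_in_both / together_in_either if together_in_either else 1.0


# Cluster names do not matter, only which items are grouped together.
identical = pair_jaccard([0, 0, 1, 1], [1, 1, 0, 0])
partial = pair_jaccard([0, 0, 0, 1], [0, 0, 1, 1])
```

    Because the score depends only on co-membership, it is unaffected by how the clusters happen to be numbered, which makes it usable for comparing K-means and hierarchical results directly.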

  5. SU-E-J-98: Radiogenomics: Correspondence Between Imaging and Genetic Features Based On Clustering Analysis

    Energy Technology Data Exchange (ETDEWEB)

    Harmon, S; Wendelberger, B [University of Wisconsin-Madison, Madison, WI (United States); Jeraj, R [University of Wisconsin-Madison, Madison, WI (United States); University of Ljubljana (Slovenia)

    2014-06-01

    Purpose: Radiogenomics aims to establish relationships between patient genotypes and imaging phenotypes. An open question remains on how best to integrate information from these distinct datasets. This work investigates if similarities in genetic features across patients correspond to similarities in PET-imaging features, assessed with various clustering algorithms. Methods: [18F]FDG PET data was obtained for 26 NSCLC patients from a public database (TCIA). Tumors were contoured using an in-house segmentation algorithm combining gradient and region-growing techniques; resulting ROIs were used to extract 54 PET-based features. Corresponding genetic microarray data containing 48,778 elements were also obtained for each tumor. Given the mismatch in feature sizes, two dimension reduction techniques were also applied to the genetic data: principal component analysis (PCA) and selective filtering of 25 NSCLC-associated genes-of-interest (GOI). Gene datasets (full, PCA, and GOI) and PET feature datasets were independently clustered using K-means and hierarchical clustering with a variable number of clusters (K). The Jaccard Index (JI) was used to score similarity of cluster assignments across different datasets. Results: Patient clusters from imaging data showed poor similarity to clusters from gene datasets, regardless of clustering algorithm or number of clusters (mean JI = 0.3429±0.1623). Notably, we found clustering algorithms had different sensitivities to data reduction techniques. Using hierarchical clustering, the PCA dataset showed perfect cluster agreement with the full-gene set (JI = 1) for all values of K, and the agreement between the GOI set and the full-gene set decreased as the number of clusters increased (JI = 0.9231 and 0.5769 for K = 2 and 5, respectively). K-means clustering assignments were highly sensitive to data reduction and showed poor stability for different values of K (JI range: 0.2301–1). Conclusion: Using commonly-used clustering algorithms

  6. Application of clustering analysis in the prediction of photovoltaic power generation based on neural network

    Science.gov (United States)

    Cheng, K.; Guo, L. M.; Wang, Y. K.; Zafar, M. T.

    2017-11-01

    To select effective samples from the large amount of historical PV power generation data and improve the accuracy of the PV power generation forecasting model, this paper studies the application of clustering analysis in this field and establishes a forecasting model based on a neural network. For three different weather types (sunny, cloudy and rainy days), samples of historical data are screened by clustering analysis. BP neural network prediction models are then established using the screened data as training data. The six photovoltaic power generation prediction models, before and after data screening, are then compared. Results show that the prediction model combining clustering analysis and BP neural networks is an effective method to improve the precision of photovoltaic power generation prediction.

  7. How clustering dynamics influence lumber utilization patterns in the Amish-based furniture industry in Ohio

    Science.gov (United States)

    Matthew S. Bumgardner; Gary W. Graham; P. Charles Goebel; Robert L. Romig

    2011-01-01

    Preliminary studies have suggested that the Amish-based furniture and related products manufacturing cluster located in and around Holmes County, Ohio, uses sizeable quantities of hardwood lumber. The number of firms within the cluster has grown even as the broader domestic furniture manufacturing sector has contracted. The present study was undertaken in 2008 (spring/...

  8. Difference-based clustering of short time-course microarray data with replicates

    Directory of Open Access Journals (Sweden)

    Kim Jihoon

    2007-07-01

    Background: There are some limitations associated with conventional clustering methods for short time-course gene expression data. The current algorithms require prior domain knowledge and do not incorporate information from replicates. Moreover, the results are not always easy to interpret biologically. Results: We propose a novel algorithm for identifying a subset of genes sharing a significant temporal expression pattern when replicates are used. Our algorithm requires no prior knowledge, instead relying on an observed statistic which is based on the first and second order differences between adjacent time-points. Here, a pattern is predefined as the sequence of symbols indicating direction and the rate of change between time-points, and each gene is assigned to a cluster whose members share a similar pattern. We compared the performance of our algorithm with those of the K-means, Self-Organizing Map and Short Time-series Expression Miner methods. Conclusions: Assessments using simulated and real data show that our method outperformed the aforementioned algorithms. Our approach is an appropriate solution for clustering short time-course microarray data with replicates.
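
    The pattern idea in the abstract above can be illustrated with a much simplified sketch. It is not the authors' algorithm: their method relies on an observed statistic computed from the replicates, whereas this sketch simply averages the replicates and applies a fixed tolerance `tol`, which is a hypothetical parameter. Each gene gets one symbol per first-order difference (direction) and one per second-order difference (change in rate), and genes with identical symbol sequences are grouped.

```python
from collections import defaultdict


def sign_symbol(value, tol):
    """Map a difference to '+', '-' or '0' using a tolerance band."""
    if value > tol:
        return "+"
    if value < -tol:
        return "-"
    return "0"


def pattern(profile, tol=0.5):
    """Return (direction symbols, rate-of-change symbols) for one profile."""
    first = [b - a for a, b in zip(profile, profile[1:])]
    second = [b - a for a, b in zip(first, first[1:])]
    return (
        "".join(sign_symbol(d, tol) for d in first),
        "".join(sign_symbol(d, tol) for d in second),
    )


def cluster_by_pattern(replicates_by_gene, tol=0.5):
    """Group genes whose replicate-averaged profiles share a pattern."""
    clusters = defaultdict(list)
    for gene, replicates in replicates_by_gene.items():
        # Average across replicates at each time-point (a simplification).
        mean_profile = [sum(col) / len(col) for col in zip(*replicates)]
        clusters[pattern(mean_profile, tol)].append(gene)
    return dict(clusters)


# Toy data: two replicates per gene, three time-points each.
toy = {
    "g1": [[0, 1, 3], [0, 1, 3]],
    "g2": [[1, 2, 4], [1, 2, 4]],
    "g3": [[3, 1, 0], [3, 1, 0]],
}
groups = cluster_by_pattern(toy)
```

    In this toy run, g1 and g2 rise at an increasing rate and land in the same group, while g3 falls and forms its own group; no number of clusters has to be chosen in advance.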

  9. Performance Evaluation of Hadoop-based Large-scale Network Traffic Analysis Cluster

    Directory of Open Access Journals (Sweden)

    Tao Ran

    2016-01-01

    As Hadoop has gained popularity in the big data era, it has become widely used in various fields. Our self-designed and self-developed large-scale network traffic analysis cluster, built on Hadoop, works well, with off-line applications running on it to analyze massive network traffic data. To evaluate the performance of the analysis cluster scientifically and reasonably, we propose a performance evaluation system. First, we take the execution times of three benchmark applications as the performance benchmark and pick 40 metrics from customized statistical resource data. Then we identify the relationship between the resource data and the execution times using a statistical modeling approach composed of principal component analysis and multiple linear regression. After training the models on historical data, we can predict execution times from current resource data. Finally, we evaluate the performance of the analysis cluster using the validated execution-time predictions. Experimental results show that the execution times predicted by the trained models are within an acceptable error range, and that the performance evaluation results are accurate and reliable.

  10. Cluster synchronization induced by one-node clusters in networks with asymmetric negative couplings

    International Nuclear Information System (INIS)

    Zhang, Jianbao; Ma, Zhongjun; Zhang, Gang

    2013-01-01

    This paper deals with the problem of cluster synchronization in networks with asymmetric negative couplings. By decomposing the coupling matrix into three matrices, and employing Lyapunov function method, sufficient conditions are derived for cluster synchronization. The conditions show that the couplings of multi-node clusters from one-node clusters have beneficial effects on cluster synchronization. Based on the effects of the one-node clusters, an effective and universal control scheme is put forward for the first time. The obtained results may help us better understand the relation between cluster synchronization and cluster structures of the networks. The validity of the control scheme is confirmed through two numerical simulations, in a network with no cluster structure and in a scale-free network.

  11. Cluster synchronization induced by one-node clusters in networks with asymmetric negative couplings

    Science.gov (United States)

    Zhang, Jianbao; Ma, Zhongjun; Zhang, Gang

    2013-12-01

    This paper deals with the problem of cluster synchronization in networks with asymmetric negative couplings. By decomposing the coupling matrix into three matrices, and employing Lyapunov function method, sufficient conditions are derived for cluster synchronization. The conditions show that the couplings of multi-node clusters from one-node clusters have beneficial effects on cluster synchronization. Based on the effects of the one-node clusters, an effective and universal control scheme is put forward for the first time. The obtained results may help us better understand the relation between cluster synchronization and cluster structures of the networks. The validity of the control scheme is confirmed through two numerical simulations, in a network with no cluster structure and in a scale-free network.

  12. Performance Evaluation of Incremental K-means Clustering Algorithm

    OpenAIRE

    Chakraborty, Sanjay; Nagwani, N. K.

    2014-01-01

    The incremental K-means clustering algorithm has already been proposed and analysed in paper [Chakraborty and Nagwani, 2011]. It is a very innovative approach which is applicable in periodically incremental environment and dealing with a bulk of updates. In this paper the performance evaluation is done for this incremental K-means clustering algorithm using air pollution database. This paper also describes the comparison on the performance evaluations between existing K-means clustering and i...

  13. A highly accurate positioning and orientation system based on the usage of four-cluster fibre optic gyros

    International Nuclear Information System (INIS)

    Zhang, Xiaoyue; Lin, Zhili; Zhang, Chunxi

    2013-01-01

    A highly accurate positioning and orientation technique based on four-cluster fibre optic gyros (FOGs) is presented. The four-cluster FOG inertial measurement unit (IMU) comprises three low-precision FOGs, one static high-precision FOG and three accelerometers. To realize high-precision positioning and orientation, the static alignment (north-seeking) before vehicle manoeuvre was divided into a low-precision self-alignment phase and a high-precision north-seeking (online calibration) phase. The high-precision FOG measurement information was introduced to obtain high-precision azimuth alignment (north-seeking) result and achieve online calibration of the low-precision three-cluster FOG. The results of semi-physical simulation were presented to validate the availability and utility of the highly accurate positioning and orientation technique based on the four-cluster FOGs. (paper)

  14. Meta-Analyses of Human Cell-Based Cardiac Regeneration Therapies

    DEFF Research Database (Denmark)

    Gyöngyösi, Mariann; Wojakowski, Wojciech; Navarese, Eliano P

    2016-01-01

    In contrast to multiple publication-based meta-analyses involving clinical cardiac regeneration therapy in patients with recent myocardial infarction, a recently published meta-analysis based on individual patient data reported no effect of cell therapy on left ventricular function or clinical...

  15. A WEB-BASED SOLUTION TO VISUALIZE OPERATIONAL MONITORING LINUX CLUSTER FOR THE PROTODUNE DATA QUALITY MONITORING CLUSTER

    CERN Document Server

    Mosesane, Badisa

    2017-01-01

    The Neutrino computing cluster, made of 300 Dell PowerEdge 1950 U1 nodes, plays an integral role in the CERN Neutrino Platform (CENF). It represents an effort to foster fundamental research in the field of neutrino physics by providing a data processing facility. The need for data quality monitoring, coupled with automated system configuration and remote monitoring of the cluster, cannot be overemphasized. To achieve these, a software stack has been chosen to implement automatic propagation of configurations across all the nodes in the cluster. The bulk of this work discusses the automated configuration management system on this cluster, which enables fast online data processing and the Data Quality Monitoring (DQM) process for the Neutrino Platform cluster (npcmp.cern.ch).

  16. Normalized mutual information based PET-MR registration using K-Means clustering and shading correction

    NARCIS (Netherlands)

    Knops, Z.F.; Maintz, J.B.A.; Viergever, M.A.; Pluim, J.P.W.; Gee, J.C.; Maintz, J.B.A.; Vannier, M.W.

    2003-01-01

    A method for the efficient re-binning and shading based correction of intensity distributions of the images prior to normalized mutual information based registration is presented. Our intensity distribution re-binning method is based on the K-means clustering algorithm as opposed to the generally

  17. Comparison and combination of "direct" and fragment based local correlation methods: Cluster in molecules and domain based local pair natural orbital perturbation and coupled cluster theories

    Science.gov (United States)

    Guo, Yang; Becker, Ute; Neese, Frank

    2018-03-01

    Local correlation theories have been developed in two main flavors: (1) "direct" local correlation methods apply local approximation to the canonical equations and (2) fragment based methods reconstruct the correlation energy from a series of smaller calculations on subsystems. The present work serves two purposes. First, we investigate the relative efficiencies of the two approaches using the domain-based local pair natural orbital (DLPNO) approach as the "direct" method and the cluster in molecule (CIM) approach as the fragment based approach. Both approaches are applied in conjunction with second-order many-body perturbation theory (MP2) as well as coupled-cluster theory with single-, double- and perturbative triple excitations [CCSD(T)]. Second, we have investigated the possible merits of combining the two approaches by performing CIM calculations with DLPNO methods serving as the method of choice for performing the subsystem calculations. Our cluster-in-molecule approach is closely related to but slightly deviates from approaches in the literature since we have avoided real space cutoffs. Moreover, the neglected distant pair correlations in the previous CIM approach are considered approximately. Six very large molecules (503-2380 atoms) were studied. At both MP2 and CCSD(T) levels of theory, the CIM and DLPNO methods show similar efficiency. However, DLPNO methods are more accurate for 3-dimensional systems. While we have found only little incentive for the combination of CIM with DLPNO-MP2, the situation is different for CIM-DLPNO-CCSD(T). This combination is attractive because (1) the better parallelization opportunities offered by CIM; (2) the methodology is less memory intensive than the genuine DLPNO-CCSD(T) method and, hence, allows for large calculations on more modest hardware; and (3) the methodology is applicable and efficient in the frequently met cases, where the largest subsystem calculation is too large for the canonical CCSD(T) method.

  18. ESPRIT-Tree: hierarchical clustering analysis of millions of 16S rRNA pyrosequences in quasilinear computational time.

    Science.gov (United States)

    Cai, Yunpeng; Sun, Yijun

    2011-08-01

    Taxonomy-independent analysis plays an essential role in microbial community analysis. Hierarchical clustering is one of the most widely employed approaches to finding operational taxonomic units, the basis for many downstream analyses. Most existing algorithms have quadratic space and computational complexities, and thus can be used only for small or medium-scale problems. We propose a new online learning-based algorithm that simultaneously addresses the space and computational issues of prior work. The basic idea is to partition a sequence space into a set of subspaces using a partition tree constructed using a pseudometric, then recursively refine a clustering structure in these subspaces. The technique relies on new methods for fast closest-pair searching and efficient dynamic insertion and deletion of tree nodes. To avoid exhaustive computation of pairwise distances between clusters, we represent each cluster of sequences as a probabilistic sequence, and define a set of operations to align these probabilistic sequences and compute genetic distances between them. We present analyses of space and computational complexity, and demonstrate the effectiveness of our new algorithm using a human gut microbiota data set with over one million sequences. The new algorithm exhibits a quasilinear time and space complexity comparable to greedy heuristic clustering algorithms, while achieving a similar accuracy to the standard hierarchical clustering algorithm.

  19. Analyser-based phase contrast image reconstruction using geometrical optics

    International Nuclear Information System (INIS)

    Kitchen, M J; Pavlov, K M; Siu, K K W; Menk, R H; Tromba, G; Lewis, R A

    2007-01-01

    Analyser-based phase contrast imaging can provide radiographs of exceptional contrast at high resolution (<100 μm), whilst quantitative phase and attenuation information can be extracted using just two images when the approximations of geometrical optics are satisfied. Analytical phase retrieval can be performed by fitting the analyser rocking curve with a symmetric Pearson type VII function. The Pearson VII function provided at least a 10% better fit to experimentally measured rocking curves than linear or Gaussian functions. A test phantom, a hollow nylon cylinder, was imaged at 20 keV using a Si(1 1 1) analyser at the ELETTRA synchrotron radiation facility. Our phase retrieval method yielded a more accurate object reconstruction than methods based on a linear fit to the rocking curve. Where reconstructions failed to map expected values, calculations of the Takagi number permitted distinction between the violation of the geometrical optics conditions and the failure of curve fitting procedures. The need for synchronized object/detector translation stages was removed by using a large, divergent beam and imaging the object in segments. Our image acquisition and reconstruction procedure enables quantitative phase retrieval for systems with a divergent source and accounts for imperfections in the analyser.
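
    For readers unfamiliar with the symmetric Pearson type VII profile mentioned above, the sketch below evaluates one commonly used parametrization, in which the width parameter is the half-width at half-maximum. The abstract does not state which parametrization the authors fitted, so the form and the argument names (`amplitude`, `center`, `hwhm`, `m`) are assumptions made for illustration.

```python
def pearson_vii(x, amplitude, center, hwhm, m):
    """Symmetric Pearson type VII profile (assumed HWHM parametrization).

    f(x) = amplitude * [1 + ((x - center) / hwhm)^2 * (2^(1/m) - 1)]^(-m)

    With m = 1 this is a Lorentzian; as m grows it approaches a Gaussian.
    """
    scaled = (x - center) / hwhm
    return amplitude * (1.0 + scaled ** 2 * (2.0 ** (1.0 / m) - 1.0)) ** (-m)


# The peak value is the amplitude, and the profile drops to half of it
# exactly one half-width away from the center, on either side.
peak = pearson_vii(0.0, amplitude=2.0, center=0.0, hwhm=1.5, m=3.0)
half_right = pearson_vii(1.5, amplitude=2.0, center=0.0, hwhm=1.5, m=3.0)
half_left = pearson_vii(-1.5, amplitude=2.0, center=0.0, hwhm=1.5, m=3.0)
```

    The extra shape parameter m is what lets a single function interpolate between Lorentzian-like and Gaussian-like tails, which is a plausible reason such a profile can fit a rocking curve better than a pure Gaussian.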

  20. Estimation of cluster stability using the theory of electron density functional

    International Nuclear Information System (INIS)

    Borisov, Yu.A.

    1985-01-01

    Prospects of using simple versions of the electron density functional for studying the energy characteristics of cluster compounds were discussed. The following types of cluster compounds were considered: clusters of Cs, Be, B, Sr, Cd, Sc, In, V, Tl, I elements as an intermediate form between a molecule and a solid body, metalloorganic Mo, W, Tc, Re, Rn clusters, and elementoorganic compounds of the nido-cluster type. The problem of changes in the binding energy of homoatomic clusters depending on their size and three-dimensional structure was analysed.

  1. Capabilities of R Package mixAK for Clustering Based on Multivariate Continuous and Discrete Longitudinal Data

    Directory of Open Access Journals (Sweden)

    Arnošt Komárek

    2014-09-01

    R package mixAK originally implemented routines primarily for Bayesian estimation of finite normal mixture models for possibly interval-censored data. The functionality of the package was considerably enhanced by implementing methods for Bayesian estimation of mixtures of multivariate generalized linear mixed models proposed in Komárek and Komárková (2013). Among other things, this allows for a cluster analysis (classification) based on multivariate continuous and discrete longitudinal data that arise whenever multiple outcomes of a different nature are recorded in a longitudinal study. This package also allows for a data-driven selection of a number of clusters as methods for selecting a number of mixture components were implemented. A model and clustering methodology for multivariate continuous and discrete longitudinal data is overviewed. Further, a step-by-step cluster analysis based jointly on three longitudinal variables of different types (continuous, count, dichotomous) is given, which provides a user manual for using the package for similar problems.

  2. Form gene clustering method about pan-ethnic-group products based on emotional semantic

    Science.gov (United States)

    Chen, Dengkai; Ding, Jingjing; Gao, Minzhuo; Ma, Danping; Liu, Donghui

    2016-09-01

    The use of pan-ethnic-group products form knowledge primarily depends on a designer's subjective experience without user participation. The majority of studies primarily focus on the detection of the perceptual demands of consumers from the target product category. A pan-ethnic-group products form gene clustering method based on emotional semantic is constructed. Consumers' perceptual images of the pan-ethnic-group products are obtained by means of product form gene extraction and coding and computer aided product form clustering technology. A case of form gene clustering about the typical pan-ethnic-group products is investigated which indicates that the method is feasible. This paper opens up a new direction for the future development of product form design which improves the agility of product design process in the era of Industry 4.0.

  3. Model-based Clustering of High-Dimensional Data in Astrophysics

    Science.gov (United States)

    Bouveyron, C.

    2016-05-01

    The nature of data in Astrophysics has changed, as in other scientific fields, in the past decades due to the increase in measurement capabilities. As a consequence, data are nowadays frequently of high dimensionality and available in bulk or as streams. Model-based techniques for clustering are popular tools which are renowned for their probabilistic foundations and their flexibility. However, classical model-based techniques show a disappointing behavior in high-dimensional spaces, which is mainly due to their dramatic over-parametrization. The recent developments in model-based classification overcome these drawbacks and allow high-dimensional data to be classified efficiently, even in the "small n / large p" situation. This work presents a comprehensive review of these recent approaches, including regularization-based techniques, parsimonious modeling, subspace classification methods and classification methods based on variable selection. The use of these model-based methods is also illustrated on real-world classification problems in Astrophysics using R packages.

  4. Feature selection for anomaly–based network intrusion detection using cluster validity indices

    CSIR Research Space (South Africa)

    Naidoo, T

    2015-09-01

    Full Text Available for Anomaly–Based Network Intrusion Detection Using Cluster Validity Indices Tyrone Naidoo_, Jules–Raymond Tapamoy, Andre McDonald_ Modelling and Digital Science, Council for Scientific and Industrial Research, South Africa 1tnaidoo2@csir.co.za 3...

  5. Clustering of Pan- and Core-genome of Lactobacillus provides Novel Evolutionary Insights for Differentiation.

    Science.gov (United States)

    Inglin, Raffael C; Meile, Leo; Stevens, Marc J A

    2018-04-24

    Bacterial taxonomy aims to classify bacteria based on true evolutionary events and relies on a polyphasic approach that includes phenotypic, genotypic and chemotaxonomic analyses. Until now, complete genomes are largely ignored in taxonomy. The genus Lactobacillus consists of 173 species and many genomes are available to study taxonomy and evolutionary events. We analyzed and clustered 98 completely sequenced genomes of the genus Lactobacillus and 234 draft genomes of 5 different Lactobacillus species, i.e. L. reuteri, L. delbrueckii, L. plantarum, L. rhamnosus and L. helveticus. The core-genome of the genus Lactobacillus contains 266 genes and the pan-genome 20'800 genes. Clustering of the Lactobacillus pan- and core-genome resulted in two highly similar trees. This shows that evolutionary history is traceable in the core-genome and that clustering of the core-genome is sufficient to explore relationships. Clustering of core- and pan-genomes at species' level resulted in similar trees as well. Detailed analyses of the core-genomes showed that the functional class "genetic information processing" is conserved in the core-genome but that "signaling and cellular processes" is not. The latter class encodes functions that are involved in environmental interactions. Evolution of lactobacilli seems therefore directed by the environment. The type species L. delbrueckii was analyzed in detail and its pan-genome based tree contained two major clades whose members contained different genes yet identical functions. In addition, evidence for horizontal gene transfer between strains of L. delbrueckii, L. plantarum, and L. rhamnosus, and between species of the genus Lactobacillus is presented. Our data provide evidence for evolution of some lactobacilli according to a parapatric-like model for species differentiation. Core-genome trees are useful to detect evolutionary relationships in lactobacilli and might be useful in taxonomic analyses. 
Lactobacillus' evolution is directed

  6. Quantum annealing for combinatorial clustering

    Science.gov (United States)

    Kumar, Vaibhaw; Bass, Gideon; Tomlin, Casey; Dulny, Joseph

    2018-02-01

    Clustering is a powerful machine learning technique that groups "similar" data points based on their characteristics. Many clustering algorithms work by approximating the minimization of an objective function, namely the sum of within-the-cluster distances between points. The straightforward approach involves examining all the possible assignments of points to each of the clusters. This approach guarantees the solution will be a global minimum; however, the number of possible assignments scales quickly with the number of data points and becomes computationally intractable even for very small datasets. In order to circumvent this issue, cost function minima are found using popular local search-based heuristic approaches such as k-means and hierarchical clustering. Due to their greedy nature, such techniques do not guarantee that a global minimum will be found and can lead to sub-optimal clustering assignments. Other classes of global search-based techniques, such as simulated annealing, tabu search, and genetic algorithms, may offer better quality results but can be too time-consuming to implement. In this work, we describe how quantum annealing can be used to carry out clustering. We map the clustering objective to a quadratic binary optimization problem and discuss two clustering algorithms which are then implemented on commercially available quantum annealing hardware, as well as on a purely classical solver "qbsolv." The first algorithm assigns N data points to K clusters, and the second one can be used to perform binary clustering in a hierarchical manner. We present our results in the form of benchmarks against well-known k-means clustering and discuss the advantages and disadvantages of the proposed techniques.
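
    The mapping of a clustering objective onto a quadratic binary problem can be illustrated with a small sketch. The one-hot encoding, the penalty weight, and the function names below are assumptions made for illustration and are not taken from the paper; the exhaustive search merely stands in for an annealer or for qbsolv, and it also shows why enumeration stops being practical as the number of points grows.

```python
from itertools import product


def clustering_energy(bits, dist, n_clusters, penalty):
    """Quadratic binary objective for a one-hot cluster encoding.

    bits[i * n_clusters + k] == 1 means point i is placed in cluster k.
    (Hypothetical encoding chosen for this sketch.)
    """
    n_points = len(dist)

    def x(i, k):
        return bits[i * n_clusters + k]

    energy = 0.0
    # Sum of pairwise distances between points that share a cluster.
    for k in range(n_clusters):
        for i in range(n_points):
            for j in range(i + 1, n_points):
                energy += dist[i][j] * x(i, k) * x(j, k)
    # Quadratic penalty: each point must sit in exactly one cluster.
    for i in range(n_points):
        assigned = sum(x(i, k) for k in range(n_clusters))
        energy += penalty * (assigned - 1) ** 2
    return energy


def exhaustive_minimum(dist, n_clusters, penalty):
    """Enumerate every bit string; feasible only for tiny inputs."""
    n_bits = len(dist) * n_clusters
    best = min(
        product((0, 1), repeat=n_bits),
        key=lambda bits: clustering_energy(bits, dist, n_clusters, penalty),
    )
    # Decode the one-hot rows into cluster labels.
    return [
        best[i * n_clusters:(i + 1) * n_clusters].index(1)
        for i in range(len(dist))
    ]


# Four points on a line forming two obvious groups.
points = [0.0, 1.0, 10.0, 11.0]
dist = [[abs(a - b) for b in points] for a in points]
labels = exhaustive_minimum(dist, n_clusters=2, penalty=100.0)
```

    With 4 points and 2 clusters the search already visits 2^8 = 256 bit strings, and each extra point doubles that count once per cluster. The penalty must exceed the largest distance saving gained by leaving a point unassigned, which is why a generous value is used here.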

  7. Improvement of perinatal and newborn care in rural Pakistan through community-based strategies: a cluster-randomised effectiveness trial.

    Science.gov (United States)

    Bhutta, Zulfiqar A; Soofi, Sajid; Cousens, Simon; Mohammad, Shah; Memon, Zahid A; Ali, Imran; Feroze, Asher; Raza, Farrukh; Khan, Amanullah; Wall, Steve; Martines, Jose

    2011-01-29

    Newborn deaths account for 57% of deaths in children younger than 5 years in Pakistan. Although a large programme of trained lady health workers (LHWs) exists, the effectiveness of this training on newborn outcomes has not been studied. We aimed to evaluate the effectiveness of a community-based intervention package, principally delivered through LHWs working with traditional birth attendants and community health committees, for reduction of perinatal and neonatal mortality in a rural district of Pakistan. We undertook a cluster randomised trial between February, 2006, and March, 2008, in Hala and Matiari subdistricts, Pakistan. Catchment areas of primary care facilities and all affiliated LHWs were used to define clusters, which were allocated to intervention and control groups by restricted, stratified randomisation. The intervention package delivered by LHWs through group sessions consisted of promotion of antenatal care and maternal health education, use of clean delivery kits, facility births, immediate newborn care, identification of danger signs, and promotion of careseeking; control clusters received routine care. Independent data collectors undertook quarterly household surveillance to capture data for births, deaths, and household practices related to maternal and newborn care. Data collectors were masked to cluster allocation; those analysing data were not. The primary outcome was perinatal and all-cause neonatal mortality. Analysis was by intention to treat. This trial is registered, ISRCTN16247511. 16 clusters were assigned to intervention (23,353 households, 12,391 total births) and control groups (23,768 households, 11,443 total births). LHWs in the intervention clusters were able to undertake 4428 (63%) of 7084 planned group sessions, but were only able to visit 2943 neonates (24%) of a total 12,028 livebirths in their catchment villages. Stillbirths were reduced in intervention clusters (39·1 stillbirths per 1000 total births) compared with

  8. Prediction of operon-like gene clusters in the Arabidopsis thaliana genome based on co-expression analysis of neighboring genes.

    Science.gov (United States)

    Wada, Masayoshi; Takahashi, Hiroki; Altaf-Ul-Amin, Md; Nakamura, Kensuke; Hirai, Masami Y; Ohta, Daisaku; Kanaya, Shigehiko

    2012-07-15

    Operon-like arrangements of genes occur in eukaryotes ranging from yeasts and filamentous fungi to nematodes, plants, and mammals. In plants, several examples of operon-like gene clusters involved in metabolic pathways have recently been characterized, e.g. the cyclic hydroxamic acid pathways in maize, the avenacin biosynthesis gene clusters in oat, the thalianol pathway in Arabidopsis thaliana, and the diterpenoid momilactone cluster in rice. Such operon-like gene clusters are defined by their co-regulation or neighboring positions within the immediate vicinity of chromosomal regions. A comprehensive analysis of the expression of neighboring genes is therefore a crucial step in revealing the complete set of operon-like gene clusters within a genome. Genome-wide prediction of operon-like gene clusters should contribute to functional annotation efforts and also provide novel insight into evolutionary aspects of acquiring certain biological functions. We predicted co-expressed gene clusters by comparing the Pearson correlation coefficient of neighboring genes and randomly selected gene pairs, based on a statistical method that takes the false discovery rate (FDR) into consideration, for 1469 microarray gene expression datasets of A. thaliana. We estimated that A. thaliana contains 100 operon-like gene clusters in total. We predicted 34 statistically significant gene clusters consisting of 3 to 22 genes each, based on a stringent FDR threshold of 0.1. Functional relationships among genes in individual clusters were estimated by sequence similarity and functional annotation of genes. Duplicated gene pairs (determined based on BLAST with a cutoff of E... Operon-like clusters tend to include genes encoding bio-machinery associated with ribosomes, the ubiquitin/proteasome system, secondary metabolic pathways, lipid and fatty-acid metabolism, and the lipid transfer system. Copyright © 2012 Elsevier B.V. All rights reserved.
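
    The comparison of neighboring-gene correlations against a background of other gene pairs can be sketched as follows. This is a minimal illustration under stated assumptions: the expression profiles are made up, the function names are hypothetical, all non-adjacent pairs stand in for the randomly selected pairs, and no FDR correction is applied, so it is not the authors' statistical procedure.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def neighbor_scores(profiles):
    """For each adjacent gene pair, return (r, empirical p-value).

    The background distribution is built from all non-adjacent pairs
    (an assumption standing in for randomly selected gene pairs).
    """
    n = len(profiles)
    background = [
        pearson(profiles[i], profiles[j])
        for i in range(n)
        for j in range(i + 2, n)
    ]
    scores = []
    for i in range(n - 1):
        r = pearson(profiles[i], profiles[i + 1])
        # Fraction of background pairs at least as correlated as this pair.
        p = sum(1 for v in background if v >= r) / len(background)
        scores.append((r, p))
    return scores


# Toy expression profiles in chromosomal order (illustrative values only).
profiles = [
    [1, 2, 3, 4, 5],
    [2, 4, 6, 8, 10],
    [5, 1, 4, 2, 3],
    [2, 5, 1, 4, 3],
]
scores = neighbor_scores(profiles)
```

    In this toy input only the first adjacent pair is strongly co-expressed; a genome-scale analysis would additionally need a multiple-testing correction such as an FDR threshold before calling clusters.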

  9. K2: A NEW METHOD FOR THE DETECTION OF GALAXY CLUSTERS BASED ON CANADA-FRANCE-HAWAII TELESCOPE LEGACY SURVEY MULTICOLOR IMAGES

    International Nuclear Information System (INIS)

    Thanjavur, Karun; Willis, Jon; Crampton, David

    2009-01-01

    We have developed a new method, K2, optimized for the detection of galaxy clusters in multicolor images. Based on the Red Sequence approach, K2 detects clusters using simultaneous enhancements in both colors and position. The detection significance is robustly determined through extensive Monte Carlo simulations and through comparison with available cluster catalogs based on two different optical methods, and also on X-ray data. K2 also provides quantitative estimates of the candidate clusters' richness and photometric redshifts. Initially, K2 was applied to the two color (gri) 161 deg² images of the Canada-France-Hawaii Telescope Legacy Survey Wide (CFHTLS-W) data. Our simulations show that the false detection rate for these data, at our selected threshold, is only ∼1%, and that the cluster catalogs are ∼80% complete up to a redshift of z = 0.6 for Fornax-like and richer clusters and to z ∼ 0.3 for poorer clusters. Based on the g-, r-, and i-band photometric catalogs of the Terapix T05 release, 35 clusters/deg² are detected, with 1-2 Fornax-like or richer clusters every 2 deg². Catalogs containing data for 6144 galaxy clusters have been prepared, of which 239 are rich clusters. These clusters, especially the latter, are being searched for gravitational lenses, one of our chief motivations for cluster detection in CFHTLS. The K2 method can be easily extended to use additional color information and thus improve overall cluster detection to higher redshifts. The complete set of K2 cluster catalogs, along with the supplementary catalogs for the member galaxies, is available on request from the authors.

  10. Clustering gene expression data based on predicted differential effects of GV interaction.

    Science.gov (United States)

    Pan, Hai-Yan; Zhu, Jun; Han, Dan-Fu

    2005-02-01

    Microarray analysis has become a popular technology in biological and medical research. However, systematic and stochastic variability in microarray data is expected and unavoidable, so the raw measurements carry inherent "noise" within microarray experiments. Currently, logarithmic ratios are usually analyzed directly by various clustering methods, which may bias the interpretation when identifying groups of genes or samples. In this paper, a statistical method based on mixed model approaches is proposed for microarray data cluster analysis. The underlying rationale of this method is to partition the observed total gene expression level into variations caused by different factors using an ANOVA model, and to predict the differential effects of GV (gene by variety) interaction using the adjusted unbiased prediction (AUP) method. The predicted GV interaction effects can then be used as the inputs of cluster analysis. We illustrate the application of our method with a gene expression dataset and demonstrate the utility of our approach using an external validation.

  11. Management of cluster headache

    DEFF Research Database (Denmark)

    Tfelt-Hansen, Peer C; Jensen, Rigmor H

    2012-01-01

    The prevalence of cluster headache is 0.1% and cluster headache is often not diagnosed or misdiagnosed as migraine or sinusitis. In cluster headache there is often a considerable diagnostic delay - an average of 7 years in a population-based survey. Cluster headache is characterized by very severe or severe orbital or periorbital pain with a duration of 15-180 minutes. The cluster headache attacks are accompanied by characteristic associated unilateral symptoms such as tearing, nasal congestion and/or rhinorrhoea, eyelid oedema, miosis and/or ptosis. In addition, there is a sense of restlessness and agitation. Patients may have up to eight attacks per day. Episodic cluster headache (ECH) occurs in clusters of weeks to months duration, whereas chronic cluster headache (CCH) attacks occur for more than 1 year without remissions. Management of cluster headache is divided into acute attack treatment...

  12. Mesoscale structure of a morning sector ionospheric shear flow region determined by conjugate Cluster II and MIRACLE ground-based observations

    Directory of Open Access Journals (Sweden)

    O. Amm

    We analyse a conjunction event of the Cluster II spacecraft with the MIRACLE ground-based instrument network in northern Fennoscandia on 6 February 2001, between 23:00 and 00:00 UT. Shortly after the spacecraft were located at perigee, the Cluster II satellites’ magnetic footpoints move northwards over Scandinavia and Svalbard, almost perfectly aligned with the central chain of the IMAGE magnetometer network, and cross a morning sector ionospheric shear zone during this passage. In this study we focus on the mesoscale structure of the ionosphere. Ionospheric conductances, true horizontal currents, and field-aligned currents (FAC) are calculated from the ground-based measurements of the IMAGE magnetometers and the STARE coherent scatter radar, using the 1-D method of characteristics. An excellent agreement between these results and the FAC observed by Cluster II is reached after averaging the Cluster measurements to mesoscales, as well as between the location of the convection reversal boundary (CRB) as observed by STARE and by the Cluster II EFW instrument. A sheet of downward FAC is observed in the vicinity of the CRB, which is mainly caused by the positive divergence of the electric field there. This FAC sheet is detached by 0.5°–2° of latitude from a more equatorward downward FAC sheet at the poleward flank of the westward electrojet. This latter FAC sheet, as well as the upward FAC at the equatorward flank of the jet, are mainly caused by meridional gradients in the ionospheric conductances, which reach up to 25 S in the electrojet region, but only ~ 5 S poleward of it, with a minimum at the CRB. Particle measurements show that the major part of the downward FAC is carried by upward flowing electrons, and only a small part by downward flowing ions. The open-closed field line boundary is found to be located 3°–4° poleward of the CRB, implying significant errors if the latter is used as a proxy of the former.

  13. Mesoscale structure of a morning sector ionospheric shear flow region determined by conjugate Cluster II and MIRACLE ground-based observations

    Directory of Open Access Journals (Sweden)

    O. Amm

    2003-08-01

    We analyse a conjunction event of the Cluster II spacecraft with the MIRACLE ground-based instrument network in northern Fennoscandia on 6 February 2001, between 23:00 and 00:00 UT. Shortly after the spacecraft were located at perigee, the Cluster II satellites’ magnetic footpoints move northwards over Scandinavia and Svalbard, almost perfectly aligned with the central chain of the IMAGE magnetometer network, and cross a morning sector ionospheric shear zone during this passage. In this study we focus on the mesoscale structure of the ionosphere. Ionospheric conductances, true horizontal currents, and field-aligned currents (FAC) are calculated from the ground-based measurements of the IMAGE magnetometers and the STARE coherent scatter radar, using the 1-D method of characteristics. An excellent agreement between these results and the FAC observed by Cluster II is reached after averaging the Cluster measurements to mesoscales, as well as between the location of the convection reversal boundary (CRB) as observed by STARE and by the Cluster II EFW instrument. A sheet of downward FAC is observed in the vicinity of the CRB, which is mainly caused by the positive divergence of the electric field there. This FAC sheet is detached by 0.5°–2° of latitude from a more equatorward downward FAC sheet at the poleward flank of the westward electrojet. This latter FAC sheet, as well as the upward FAC at the equatorward flank of the jet, are mainly caused by meridional gradients in the ionospheric conductances, which reach up to 25 S in the electrojet region, but only ~ 5 S poleward of it, with a minimum at the CRB. Particle measurements show that the major part of the downward FAC is carried by upward flowing electrons, and only a small part by downward flowing ions. The open-closed field line boundary is found to be located 3°–4° poleward of the CRB, implying significant errors if the latter is used as a proxy of the former.

  14. A Trajectory Regression Clustering Technique Combining a Novel Fuzzy C-Means Clustering Algorithm with the Least Squares Method

    Directory of Open Access Journals (Sweden)

    Xiangbing Zhou

    2018-04-01

    Rapidly growing GPS (Global Positioning System) trajectories hide much valuable information, such as city road planning, urban travel demand, and population migration. In order to mine the hidden information and to capture better clustering results, a trajectory regression clustering method (an unsupervised trajectory clustering method) is proposed to reduce local information loss of the trajectory and to avoid getting stuck in the local optimum. Using this method, we first define our new concept of trajectory clustering and construct a novel partitioning (angle-based partitioning) method of line segments; second, the Lagrange-based method and Hausdorff-based K-means++ are integrated in fuzzy C-means (FCM) clustering, which are used to maintain the stability and the robustness of the clustering process; finally, a least squares regression model is employed to achieve regression clustering of the trajectory. In our experiment, the performance and effectiveness of our method are validated against real-world taxi GPS data. When comparing our clustering algorithm with the partition-based clustering algorithms (K-means, K-median, and FCM), our experimental results demonstrate that the presented method is more effective and generates a more reasonable trajectory.
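
    For readers unfamiliar with fuzzy C-means, the two textbook update rules (memberships, then centers) can be sketched as below. This is the standard FCM formulation on 1-D points, assumed here for illustration; it does not include the Lagrange-based or Hausdorff-based K-means++ components described in the abstract, and the function names are hypothetical.

```python
def fcm_memberships(distances, m=2.0):
    """Membership of one point in each cluster.

    distances[k] is the distance from the point to center k;
    m > 1 is the fuzzifier. The returned memberships sum to 1.
    """
    # A point sitting exactly on a center belongs fully to it.
    zeros = [k for k, d in enumerate(distances) if d == 0.0]
    if zeros:
        return [1.0 / len(zeros) if k in zeros else 0.0
                for k in range(len(distances))]
    exponent = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_k / d_j) ** exponent for d_j in distances)
        for d_k in distances
    ]


def fcm_centers(points, memberships, m=2.0):
    """Centers as means of the points weighted by membership ** m."""
    n_centers = len(memberships[0])
    centers = []
    for k in range(n_centers):
        weights = [u[k] ** m for u in memberships]
        weighted_sum = sum(w * p for w, p in zip(weights, points))
        centers.append(weighted_sum / sum(weights))
    return centers


# One update round on two 1-D points with two initial centers.
points = [1.0, 3.0]
centers = [0.0, 4.0]
memberships = [
    fcm_memberships([abs(p - c) for c in centers]) for p in points
]
new_centers = fcm_centers(points, memberships)
```

    Unlike K-means, every point contributes to every center, with a weight that falls off with distance; the fuzzifier m controls how soft the assignment is.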

  15. Research on retailer data clustering algorithm based on Spark

    Science.gov (United States)

    Huang, Qiuman; Zhou, Feng

    2017-03-01

    Big data analysis is currently a hot topic in the IT field. Spark is a high-reliability, high-performance distributed parallel computing framework for big data sets. The k-means algorithm is one of the classical partitioning methods for clustering. In this paper, we study the k-means clustering algorithm on Spark. First, the principle of the algorithm is analyzed; then a clustering analysis of supermarket customers is carried out experimentally to identify different shopping patterns. This paper also proposes the parallelization of the k-means algorithm on the distributed computing framework of Spark, and gives the concrete design and implementation schemes. The two-year sales data of a supermarket are used to validate the proposed clustering algorithm and to segment customers; the clustering results are then analyzed to help enterprises adopt different marketing strategies for different customer groups and improve sales performance.
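
    The k-means iteration itself can be sketched on a single machine as follows. This is plain Python rather than Spark code, with made-up customer features and hypothetical function names; it is meant only to show the assignment and update steps, of which the per-point assignment is the naturally data-parallel part that a distributed implementation would presumably spread across workers.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def nearest(point, centroids):
    """Index of the centroid closest to the point."""
    return min(
        range(len(centroids)),
        key=lambda k: squared_distance(point, centroids[k]),
    )


def kmeans(points, centroids, max_iter=100):
    """Lloyd's algorithm with caller-supplied initial centroids."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: independent per point.
        labels = [nearest(p, centroids) for p in points]
        # Update step: each centroid becomes the mean of its members.
        updated = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                updated.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                updated.append(old)  # keep an empty cluster's centroid
        if updated == centroids:
            break
        centroids = updated
    return centroids, labels


# Illustrative customers as (monthly spend, visits) pairs, arbitrary units.
customers = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, labels = kmeans(customers, [(0, 0), (10, 10)])
```

    Initial centroids are passed in explicitly so the run is deterministic; practical implementations usually choose them randomly or with a seeding scheme such as k-means++.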

  16. Direct Reconstruction of CT-based Attenuation Correction Images for PET with Cluster-Based Penalties

    Science.gov (United States)

    Kim, Soo Mee; Alessio, Adam M.; De Man, Bruno; Asma, Evren; Kinahan, Paul E.

    2015-01-01

    Extremely low-dose CT acquisitions for the purpose of PET attenuation correction will have a high level of noise and biasing artifacts due to factors such as photon starvation. This work explores a priori knowledge appropriate for CT iterative image reconstruction for PET attenuation correction. We investigate the maximum a posteriori (MAP) framework with cluster-based, multinomial priors for the direct reconstruction of the PET attenuation map. The objective function for direct iterative attenuation map reconstruction was modeled as a Poisson log-likelihood with prior terms consisting of quadratic (Q) and mixture (M) distributions. The attenuation map is assumed to have values in 4 clusters: air+background, lung, soft tissue, and bone. Under this assumption, the MP was a mixture probability density function consisting of one exponential and three Gaussian distributions. The relative proportion of each cluster was jointly estimated during each voxel update of the direct iterative coordinate descent (dICD) method. Noise-free data were generated from the NCAT phantom and Poisson noise was added. Reconstruction with FBP (ramp filter) was performed on the noise-free (ground truth) and noisy data. For the noisy data, dICD reconstruction was performed with the combination of different prior strength parameters (β and γ) of Q- and M-penalties. The combined quadratic and mixture penalties reduce the RMSE by 18.7% compared to post-smoothed iterative reconstruction and only 0.7% compared to quadratic alone. For direct PET attenuation map reconstruction from ultra-low dose CT acquisitions, the combination of quadratic and mixture priors offers regularization of both variance and bias and is a potential method to derive attenuation maps with negligible patient dose. However, the small improvement in quantitative accuracy relative to the substantial increase in algorithm complexity does not currently justify the use of mixture-based PET attenuation priors for reconstruction of CT

  17. Electron impact ionization of large krypton clusters

    Institute of Scientific and Technical Information of China (English)

    Li Shao-Hui; Li Ru-Xin; Ni Guo-Quan; Xu Zhi-Zhan

    2004-01-01

    We show that the detection of ionization of very large van der Waals clusters in a pulsed jet or a beam can be realized by using a fast ion gauge. Rapid positive feedback electron impact ionization and fragmentation processes, which are initially ignited by electron impact ionization of the krypton clusters with the electron current of the ion gauge, result in the appearance of a progressional oscillation-like ion spectrum, or just of a single fast event under critical conditions. Each line in the spectrum represents a correlated explosion or avalanche ionization of the clusters. The phenomena have been analysed qualitatively along with a Rayleigh scattering experiment of the corresponding cluster jet.

  18. Simultaneous gains tuning in boiler/turbine PID-based controller clusters using iterative feedback tuning methodology.

    Science.gov (United States)

    Zhang, Shu; Taft, Cyrus W; Bentsman, Joseph; Hussey, Aaron; Petrus, Bryan

    2012-09-01

    Tuning a complex multi-loop PID based control system requires considerable experience. In today's power industry the number of available qualified tuners is dwindling and there is a great need for better tuning tools to maintain and improve the performance of complex multivariable processes. Multi-loop PID tuning is the procedure for the online tuning of a cluster of PID controllers operating in a closed loop with a multivariable process. This paper presents the first application of the simultaneous tuning technique to the multi-input-multi-output (MIMO) PID based nonlinear controller in the power plant control context, with the closed-loop system consisting of a MIMO nonlinear boiler/turbine model and a nonlinear cluster of six PID-type controllers. Although simplified, the dynamics and cross-coupling of the process and the PID cluster are similar to those used in a real power plant. The particular technique selected, iterative feedback tuning (IFT), utilizes the linearized version of the PID cluster for signal conditioning, but the data collection and tuning is carried out on the full nonlinear closed-loop system. Based on the figure of merit for the control system performance, the IFT is shown to deliver performance favorably comparable to that attained through the empirical tuning carried out by an experienced control engineer. Copyright © 2012 ISA. Published by Elsevier Ltd. All rights reserved.
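
    To make the notions of a PID loop and a figure of merit concrete, here is a minimal single-loop sketch. It is not iterative feedback tuning and not the boiler/turbine model from the paper: the first-order plant, the gains, and the integrated squared error used as the cost are all assumptions chosen for illustration.

```python
class PID:
    """Discrete PID controller with a fixed sample time dt."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = None

    def step(self, error):
        self.integral += error * self.dt
        # No derivative action on the first sample (avoids a startup kick).
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return (self.kp * error
                + self.ki * self.integral
                + self.kd * derivative)


def simulate(kp, ki, kd, setpoint=1.0, tau=1.0, dt=0.01, steps=5000):
    """Closed loop around a hypothetical plant tau * dy/dt = -y + u.

    Returns the final output and the integrated squared error,
    used here as an assumed figure of merit.
    """
    pid = PID(kp, ki, kd, dt)
    y = 0.0
    cost = 0.0
    for _ in range(steps):
        error = setpoint - y
        u = pid.step(error)
        y += dt * (u - y) / tau  # explicit Euler step of the plant
        cost += error ** 2 * dt
    return y, cost


y_tuned, cost_tuned = simulate(kp=2.0, ki=1.0, kd=0.0)
y_slow, cost_slow = simulate(kp=0.5, ki=0.2, kd=0.0)
```

    Comparing the two costs shows how a scalar figure of merit ranks gain settings; a tuning method such as IFT would adjust the gains of several coupled loops at once to reduce a cost of this kind, using closed-loop experiments instead of a known plant model.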

  19. An Efficient Data Compression Model Based on Spatial Clustering and Principal Component Analysis in Wireless Sensor Networks

    Directory of Open Access Journals (Sweden)

    Yihang Yin

    2015-08-01

    Wireless sensor networks (WSNs) have been widely used to monitor the environment, and sensors in WSNs are usually power constrained. Because inner-node communication consumes most of the power, efficient data compression schemes are needed to reduce the data transmission to prolong the lifetime of WSNs. In this paper, we propose an efficient data compression model to aggregate data, which is based on spatial clustering and principal component analysis (PCA). First, sensors with a strong temporal-spatial correlation are grouped into one cluster for further processing with a novel similarity measure metric. Next, sensor data in one cluster are aggregated in the cluster head sensor node, and an efficient adaptive strategy is proposed for the selection of the cluster head to conserve energy. Finally, the proposed model applies principal component analysis with an error bound guarantee to compress the data and retain the definite variance at the same time. Computer simulations show that the proposed model can greatly reduce communication and obtain a lower mean square error than other PCA-based algorithms.

  20. An Efficient Data Compression Model Based on Spatial Clustering and Principal Component Analysis in Wireless Sensor Networks.

    Science.gov (United States)

    Yin, Yihang; Liu, Fengzheng; Zhou, Xiang; Li, Quanzhong

    2015-08-07

    Wireless sensor networks (WSNs) have been widely used to monitor the environment, and sensors in WSNs are usually power constrained. Because inner-node communication consumes most of the power, efficient data compression schemes are needed to reduce the data transmission to prolong the lifetime of WSNs. In this paper, we propose an efficient data compression model to aggregate data, which is based on spatial clustering and principal component analysis (PCA). First, sensors with a strong temporal-spatial correlation are grouped into one cluster for further processing with a novel similarity measure metric. Next, sensor data in one cluster are aggregated in the cluster head sensor node, and an efficient adaptive strategy is proposed for the selection of the cluster head to conserve energy. Finally, the proposed model applies principal component analysis with an error bound guarantee to compress the data and retain the definite variance at the same time. Computer simulations show that the proposed model can greatly reduce communication and obtain a lower mean square error than other PCA-based algorithms.