WorldWideScience

Sample records for temporal clustering analysis

  1. Hierarchical Aligned Cluster Analysis for Temporal Clustering of Human Motion.

    Science.gov (United States)

    Zhou, Feng; De la Torre, Fernando; Hodgins, Jessica K

    2013-03-01

    Temporal segmentation of human motion into plausible motion primitives is central to understanding and building computational models of human motion. Several issues contribute to the challenge of discovering motion primitives: the exponential nature of all possible movement combinations, the variability in the temporal scale of human actions, and the complexity of representing articulated motion. We pose the problem of learning motion primitives as one of temporal clustering, and derive an unsupervised hierarchical bottom-up framework called hierarchical aligned cluster analysis (HACA). HACA finds a partition of a given multidimensional time series into m disjoint segments such that each segment belongs to one of k clusters. HACA combines kernel k-means with the generalized dynamic time alignment kernel to cluster time series data. Moreover, it provides a natural framework to find a low-dimensional embedding for time series. HACA is efficiently optimized with a coordinate descent strategy and dynamic programming. Experimental results on motion capture and video data demonstrate the effectiveness of HACA for segmenting complex motions and as a visualization tool. We also compare the performance of HACA to state-of-the-art algorithms for temporal clustering on data of a honey bee dance. The HACA code is available online.
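
    The abstract above relies on a dynamic time alignment kernel to compare segments of different lengths. As a rough illustration of the alignment idea only, the sketch below computes the classic dynamic time warping (DTW) cost between two one-dimensional sequences by dynamic programming. It is not the HACA code or its generalized kernel; the function name `dtw_distance` and the absolute-difference local cost are assumptions made for this example.

```python
from math import inf


def dtw_distance(a, b):
    """Classic DTW cost between two numeric sequences.

    Illustrative only: uses |a_i - b_j| as the local cost and allows
    the three standard steps (match, insertion, deletion).
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] repeated against b[j-1]
                cost[i][j - 1],      # b[j-1] repeated against a[i-1]
                cost[i - 1][j - 1],  # one-to-one match
            )
    return cost[n][m]
```

    For example, `dtw_distance([1, 2, 3], [1, 2, 2, 3])` is zero because the second sequence is only a time-stretched copy of the first.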

  2. Application of Geostatistical Methods and Machine Learning for spatio-temporal Earthquake Cluster Analysis

    Science.gov (United States)

    Schaefer, A. M.; Daniell, J. E.; Wenzel, F.

    2014-12-01

    Earthquake clustering tends to be an increasingly important part of general earthquake research especially in terms of seismic hazard assessment and earthquake forecasting and prediction approaches. The distinct identification and definition of foreshocks, aftershocks, mainshocks and secondary mainshocks is taken into account using a point based spatio-temporal clustering algorithm originating from the field of classic machine learning. This can be further applied for declustering purposes to separate background seismicity from triggered seismicity. The results are interpreted and processed to assemble 3D-(x,y,t) earthquake clustering maps which are based on smoothed seismicity records in space and time. In addition, multi-dimensional Gaussian functions are used to capture clustering parameters for spatial distribution and dominant orientations. Clusters are further processed using methodologies originating from geostatistics, which have been mostly applied and developed in mining projects during the last decades. A 2.5D variogram analysis is applied to identify spatio-temporal homogeneity in terms of earthquake density and energy output. The results are mitigated using Kriging to provide an accurate mapping solution for clustering features. As a case study, seismic data of New Zealand and the United States is used, covering events since the 1950s, from which an earthquake cluster catalogue is assembled for most of the major events, including a detailed analysis of the Landers and Christchurch sequences.

  3. A Three-Step Spatial-Temporal Clustering Method for Human Activity Pattern Analysis

    Science.gov (United States)

    Huang, W.; Li, S.; Xu, S.

    2016-06-01

    How people move in cities and what they do in various locations at different times form human activity patterns. Human activity patterns play a key role in urban planning, traffic forecasting, public health and safety, emergency response, friend recommendation, and so on. Therefore, scholars from different fields, such as social science, geography, transportation, physics and computer science, have made great efforts in modelling and analysing human activity patterns or human mobility patterns. One of the essential tasks in such studies is to find the locations or places where individuals stay to perform some kind of activity before further activity pattern analysis. In the era of Big Data, the emergence of social media along with wearable devices enables human activity data to be collected more easily and efficiently. Furthermore, the dimension of the accessible human activity data has been extended from two or three (space or space-time) to four dimensions (space, time and semantics). More specifically, not only the location and time at which people stay are collected, but also what people "say" in a location at a given time can be obtained. The characteristics of these datasets shed new light on the analysis of human mobility, and new methodologies should accordingly be developed to handle them. Traditional methods such as neural networks, statistics and clustering have been applied to study human activity patterns using geosocial media data. Among them, clustering methods have been widely used to analyse spatiotemporal patterns. However, to the best of our knowledge, few clustering algorithms have been specifically developed for handling datasets that contain spatial, temporal and semantic aspects all together. In this work, we propose a three-step human activity clustering method based on space, time and semantics to fill this gap. One year of Twitter data, posted in Toronto, Canada, is used to test the clustering-based method. The results show that the

  4. A THREE-STEP SPATIAL-TEMPORAL-SEMANTIC CLUSTERING METHOD FOR HUMAN ACTIVITY PATTERN ANALYSIS

    Directory of Open Access Journals (Sweden)

    W. Huang

    2016-06-01

    Full Text Available How people move in cities and what they do in various locations at different times form human activity patterns. Human activity patterns play a key role in urban planning, traffic forecasting, public health and safety, emergency response, friend recommendation, and so on. Therefore, scholars from different fields, such as social science, geography, transportation, physics and computer science, have made great efforts in modelling and analysing human activity patterns or human mobility patterns. One of the essential tasks in such studies is to find the locations or places where individuals stay to perform some kind of activity before further activity pattern analysis. In the era of Big Data, the emergence of social media along with wearable devices enables human activity data to be collected more easily and efficiently. Furthermore, the dimension of the accessible human activity data has been extended from two or three (space or space-time) to four dimensions (space, time and semantics). More specifically, not only the location and time at which people stay are collected, but also what people “say” in a location at a given time can be obtained. The characteristics of these datasets shed new light on the analysis of human mobility, and new methodologies should accordingly be developed to handle them. Traditional methods such as neural networks, statistics and clustering have been applied to study human activity patterns using geosocial media data. Among them, clustering methods have been widely used to analyse spatiotemporal patterns. However, to the best of our knowledge, few clustering algorithms have been specifically developed for handling datasets that contain spatial, temporal and semantic aspects all together. In this work, we propose a three-step human activity clustering method based on space, time and semantics to fill this gap. One year of Twitter data, posted in Toronto, Canada, is used to test the clustering-based method. The

  5. Spatio-temporal clustering analysis and its determinants of hand, foot and mouth disease in Hunan, China, 2009-2015.

    Science.gov (United States)

    Wu, Xinrui; Hu, Shixiong; Kwaku, Abuaku Benjamin; Li, Qi; Luo, Kaiwei; Zhou, Ying; Tan, Hongzhuan

    2017-09-25

    Hand, foot and mouth disease (HFMD) is one of the most frequently reported infectious diseases, with several outbreaks across the world. This study aimed at describing epidemiological characteristics, investigating spatio-temporal clustering changes, and identifying determinant factors in different clustering areas of HFMD. Descriptive statistics were used to evaluate the epidemic characteristics of HFMD from 2009 to 2015. Spatial autocorrelation and spatio-temporal cluster analysis were used to explore the spatial-temporal patterns. An autologistic regression model was employed to explore determinants of HFMD clustering. The incidence rates of HFMD ranged from 54.31/10 million to 318.06/10 million between 2009 and 2015 in Hunan. Cases were mainly prevalent in children aged 5 years and younger, with an average male-to-female sex ratio of 1.66, and two epidemic periods in each year. Clustering areas gathered in the northern regions in 2009 and in the central regions from 2010 to 2012. They moved to central-southern regions in 2013 and 2014 and central-western regions in 2015. The significant risk factors of HFMD clusters were rainfall (OR = 2.187), temperature (OR = 4.329) and humidity (OR = 2.070). The protective factor was wind speed (OR = 0.258). The HFMD incidence from 2009 to 2015 in Hunan showed a new spatiotemporal clustering tendency, with the clustering areas shifting toward the south and west. Meteorological factors showed a strong association with HFMD clustering, which may assist in predicting future spatial-temporal clusters.

  6. Spatial-temporal heterogeneity of land subsidence evolution in Beijing based on InSAR and cluster analysis

    Science.gov (United States)

    Ke, Y.; Li, Y.; Gong, H.; Pan, Y.; Zhu, L.; Chen, B.

    2015-12-01

    Land subsidence is a common natural hazard occurring in extensive areas of the world. In Beijing, the capital city of China, there has been serious land subsidence due to overexploitation of groundwater during recent decades. Five major subsidence funnels have formed. Across the Beijing plain area, the ground is sinking at a rate of 30-100 mm/year. Uneven subsidence leads to ground fissures and building destruction, and has caused great economic and property losses. To better characterize and understand regional land subsidence evolution, it is critical to monitor the time-series dynamics of subsidence and capture the spatial-temporal heterogeneity of the subsidence evolution. The interferometric SAR (InSAR) technique, which provides high spatial resolution and a wide range of observation, has been successfully used to monitor regional ground deformation. The objective of this study is to derive time-series regional land subsidence dynamics in Beijing and, based on these, to analyze and assess the spatial-temporal heterogeneity of the evolution using cluster analysis. First, ENVISAT ASAR datasets (2003-2009, 28 scenes, track number 218) covering the Beijing plain area during 2003-2010 were utilized to obtain time-series subsidence rates using the Persistent Scatterer InSAR (PS-InSAR) technique provided in the SARProz software. Second, time-series subsidence characteristics of the PS points were analyzed and the PS points were clustered using the Self-Organizing Feature Maps (SOM) algorithm, considering environmental factors such as groundwater level and lithologic character. This study demonstrates that, based on InSAR measurements and the SOM algorithm, the spatial-temporal heterogeneity of land subsidence evolution can be captured. Each cluster shows a unique spatial-temporal evolution pattern. The results of this study will facilitate further land subsidence modeling and prediction at the regional spatial scale.

  7. Clustering analysis

    International Nuclear Information System (INIS)

    Romli

    1997-01-01

    Cluster analysis is the name of a group of multivariate techniques whose principal purpose is to distinguish similar entities from the characteristics they possess. Several algorithms can be used for this analysis. This topic therefore discusses the algorithms, such as similarity measures and hierarchical clustering, which includes the single linkage, complete linkage and average linkage methods. The non-hierarchical clustering method popularly known as the K-means method is also discussed. Finally, this paper describes the advantages and disadvantages of each method.

  8. Cluster analysis

    CERN Document Server

    Everitt, Brian S; Leese, Morven; Stahl, Daniel

    2011-01-01

    Cluster analysis comprises a range of methods for classifying multivariate data into subgroups. By organizing multivariate data into such subgroups, clustering can help reveal the characteristics of any structure or patterns present. These techniques have proven useful in a wide range of areas such as medicine, psychology, market research and bioinformatics. This fifth edition of the highly successful Cluster Analysis includes coverage of the latest developments in the field and a new chapter dealing with finite mixture models for structured data. Real life examples are used throughout to demons

  9. Spatial-Temporal Clustering of Tornadoes

    Science.gov (United States)

    Malamud, Bruce D.; Turcotte, Donald L.; Brooks, Harold E.

    2017-04-01

    The standard measure of the intensity of a tornado is the Enhanced Fujita scale, which is based qualitatively on the damage caused by a tornado. An alternative measure of tornado intensity is the tornado path length, L. Here we examine the spatial-temporal clustering of severe tornadoes, which we define as having path lengths L ≥ 10 km. Of particular concern are tornado outbreaks, when a large number of severe tornadoes occur in a day in a restricted region. We apply a spatial-temporal clustering analysis developed for earthquakes. We take all pairs of severe tornadoes in observed and modelled outbreaks, and for each pair plot the spatial lag (distance between touchdown points) against the temporal lag (time between touchdown points). We apply our spatial-temporal lag methodology to the intense tornado outbreaks in the central United States on 26 and 27 April 2011, which resulted in over 300 fatalities and produced 109 severe (L ≥ 10 km) tornadoes. The patterns of spatial-temporal lag correlations that we obtain for the 2 days are strikingly different. On 26 April 2011, there were 45 severe tornadoes and our clustering analysis is dominated by a complex sequence of linear features. We associate the linear patterns with the tornadoes generated in either a single cell thunderstorm or a closely spaced cluster of single cell thunderstorms moving at a near-constant velocity. Our study of a derecho tornado outbreak of six severe tornadoes on 4 April 2011 along with modelled outbreak scenarios confirms this association. On 27 April 2011, there were 64 severe tornadoes and our clustering analysis is predominantly random with virtually no embedded linear patterns. We associate this pattern with a large number of interacting supercell thunderstorms generating tornadoes randomly in space and time. In order to better understand these associations, we also applied our approach to the Great Plains tornado outbreak of 3 May 1999. Careful studies by others have associated
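
    The pairwise lag construction described in this abstract can be sketched in a few lines. The example below is an illustration under stated assumptions, not the authors' code: touchdown points are given as planar coordinates in kilometres (so the spatial lag is a plain Euclidean distance), times are in hours, and the function name `pairwise_lags` is hypothetical.

```python
from itertools import combinations
from math import hypot


def pairwise_lags(events):
    """Return (spatial_lag_km, temporal_lag_h) for every pair of events.

    `events` is a list of (x_km, y_km, t_hours) tuples. Planar
    coordinates are an assumption of this sketch; real touchdown
    points would need a geodesic distance.
    """
    lags = []
    for (x1, y1, t1), (x2, y2, t2) in combinations(events, 2):
        spatial = hypot(x2 - x1, y2 - y1)
        temporal = abs(t2 - t1)
        lags.append((spatial, temporal))
    return lags


# Three touchdowns from a source moving at a constant 5 km/h:
# every pair falls on the same straight line in lag space.
track = [(0, 0, 0), (3, 4, 1), (6, 8, 2)]
print(pairwise_lags(track))  # [(5.0, 1), (10.0, 2), (5.0, 1)]
```

    Plotting the spatial lag against the temporal lag for such a track yields a linear feature whose slope is the propagation speed, whereas events scattered randomly in space and time show no such alignment.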

  10. Seasonality and temporal clustering of Kawasaki syndrome.

    Science.gov (United States)

    Burns, Jane C; Cayan, Daniel R; Tong, Garrick; Bainto, Emelia V; Turner, Christena L; Shike, Hiroko; Kawasaki, Tomisaku; Nakamura, Yosikazu; Yashiro, Mayumi; Yanagawa, Hiroshi

    2005-03-01

    The distribution of a syndrome in space and time may suggest clues to its etiology. The cause of Kawasaki syndrome, a systemic vasculitis of infants and children, is unknown, but an infectious etiology is suspected. Seasonality and clustering of Kawasaki syndrome cases were studied in Japanese children with Kawasaki syndrome reported in nationwide surveys in Japan. Excluding the years that contained the 3 major nationwide epidemics, 84,829 cases during a 14-year period (1987-2000) were analyzed. To assess seasonality, we calculated mean monthly incidence during the study period for eastern and western Japan and for each of the 47 prefectures. To assess clustering, we compared the number of cases per day (daily incidence) with a simulated distribution (Monte Carlo analysis). Marked spatial and temporal patterns were noted in both the seasonality and deviations from the average number of Kawasaki syndrome cases in Japan. Seasonality was bimodal with peaks in January and June/July and a nadir in October. This pattern was consistent throughout Japan and during the entire 14-year period. Some years produced very high or low numbers of cases, but the overall variability was consistent throughout the entire country. Temporal clustering of Kawasaki syndrome cases was detected with nationwide outbreaks. Kawasaki syndrome has a pronounced seasonality in Japan that is consistent throughout the length of the Japanese archipelago. Temporal clustering of cases combined with marked seasonality suggests an environmental trigger for this clinical syndrome.
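
    Comparing daily incidence with a simulated distribution can be illustrated with a small Monte Carlo test. The sketch below is a simplified stand-in and not the procedure used in the study: it assumes a null model in which every case falls on a uniformly random day, and it uses the maximum daily count as the test statistic. The function name and parameters are hypothetical.

```python
import random


def max_daily_count_pvalue(daily_counts, n_sim=999, seed=0):
    """Monte Carlo p-value for the largest daily case count.

    Null model (an assumption of this sketch): each case is assigned
    independently to a uniformly random day, with no seasonality.
    """
    rng = random.Random(seed)
    n_days = len(daily_counts)
    total = sum(daily_counts)
    observed = max(daily_counts)
    exceed = 0
    for _ in range(n_sim):
        simulated = [0] * n_days
        for _ in range(total):
            simulated[rng.randrange(n_days)] += 1
        if max(simulated) >= observed:
            exceed += 1
    # Add-one correction so the estimate is never exactly zero.
    return (exceed + 1) / (n_sim + 1)
```

    A small p-value indicates that the observed peak is larger than the peaks produced by chance under the assumed null model. A seasonal baseline, as reported in the abstract, would have to be built into the simulation before such a test could separate clustering from seasonality.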

  11. Cluster analysis applied to the spatial and temporal variability of monthly rainfall in Mato Grosso do Sul State, Brazil

    Science.gov (United States)

    Teodoro, Paulo Eduardo; de Oliveira-Júnior, José Francisco; da Cunha, Elias Rodrigues; Correa, Caio Cezar Guedes; Torres, Francisco Eduardo; Bacani, Vitor Matheus; Gois, Givanildo; Ribeiro, Larissa Pereira

    2016-04-01

    The State of Mato Grosso do Sul (MS), located in the Brazilian Midwest, is devoid of climatological studies, mainly in the characterization of the rainfall regime and of the meteorological systems that produce and inhibit rain. This state has different soil and climatic characteristics distributed among three biomes: Cerrado, Atlantic Forest and Pantanal. This study aimed to apply cluster analysis using Ward's algorithm and identify the meteorological systems that affect the rainfall regime in the biomes. The rainfall data of 32 stations (sites) in MS State were obtained from the Agência Nacional de Águas (ANA) database, collected from 1954 to 2013. For each of the 384 monthly rainfall time series, the average was calculated and Ward's algorithm was applied to identify the spatial and temporal variability of rainfall. Bartlett's test revealed homogeneous variance at all sites only in January. The run test showed no increasing or decreasing trend in monthly rainfall. Cluster analysis identified five homogeneous rainfall regions in MS State, followed by three seasons (rainy, transitional and dry). The rainy season occurs during the months of November, December, January, February and March. The transitional season covers the months of April and May, September and October. The dry season occurs in June, July and August. The groups G1, G4 and G5 are influenced by the South Atlantic Subtropical Anticyclone (SASA), Chaco's Low (CL), Bolivia's High (BH), Low Levels Jet (LLJ), South Atlantic Convergence Zone (SACZ) and Madden-Julian Oscillation (MJO). Group G2 is influenced by the Upper Tropospheric Cyclonic Vortex (UTCV) and Front Systems (FS). Group G3 is affected by UTCV, FS and SACZ. The interaction of the meteorological systems operating in each biome, together with altitude, causes the spatial and temporal diversity of rainfall in MS State.

  12. [Temporal-spatial scan clustering analysis on hand-foot-mouth disease in Zhejiang province, 2008-2013].

    Science.gov (United States)

    Cai, Jian; Chen, Enfu; Gu, Hua; Chai, Chengliang; Fu, Guiming; Wang, Xiaoxiao

    2014-06-01

    To understand the temporal-spatial distribution of hand-foot-mouth disease in Zhejiang province, from May 2008 to June 2013. The number of cases and incidence data of hand-foot-mouth disease from May 2008 to June 2013 for all the counties (cities, districts) in Zhejiang province were collected from the China Information System for Disease Control and Prevention, totalling 511 643 cases. The temporal distribution of hand-foot-mouth disease was described, and the incidence maps were drawn using Epimap software. Temporal-spatial clustering was analyzed with SaTScan 9.0.1 software. The log likelihood ratio (LLR) was used to assess the clustering. The year-county (city, district)-specific relative risks (RR) of hand-foot-mouth disease were calculated. RR contour maps were drawn with Arcview GIS 3.3. In Zhejiang province, from May 2008 to June 2013, the highest incidence rate was 270.81/100 000 (147 943/54 629 996) (in 2012) and the lowest incidence rate was 135.32/100 000 (69 285/51 199 987) (in 2009). The incidence in the eastern coastal areas (217.77/100 000 (286 300/131 468 746)), including Ningbo, Taizhou and Wenzhou, was higher than in the western mountain areas (168.11/100 000 (98 016/58 304 266)), including Quzhou, Lishui and Jinhua. The epidemic curve showed two peaks, during April to July (101.15/100 000 (320 144/316 497 516)) and during October to November (23.30/100 000 (61 088/262 148 114)). Temporal-spatial scan showed 10 temporal-spatial aggregation areas; the strongest one was in Wenzhou city, south-east Zhejiang province, from July 2009 to June 2011 (RR = 2.38, LLR = 10 650.75, P hand-foot-mouth disease in Zhejiang province, 2008-2013, with the peak during April to July. Temporal-spatial clustering was observed; the disease showed a distinct regional distribution, and eastern coastal cluster areas and mid-west cluster areas were found.
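
    The incidence rates quoted in this abstract follow directly from the case counts and populations given in parentheses. The short check below recomputes three of them as cases divided by population, scaled to 100 000; the helper name `incidence_per_100k` is introduced here only for illustration.

```python
def incidence_per_100k(cases, population):
    """Incidence rate expressed per 100 000 population."""
    return cases / population * 100_000


# Figures copied from the abstract: (cases, population)
highest_2012 = incidence_per_100k(147_943, 54_629_996)
lowest_2009 = incidence_per_100k(69_285, 51_199_987)
eastern_coast = incidence_per_100k(286_300, 131_468_746)

print(round(highest_2012, 2))   # 270.81
print(round(lowest_2009, 2))    # 135.32
print(round(eastern_coast, 2))  # 217.77
```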

  13. Ananke: temporal clustering reveals ecological dynamics of microbial communities

    Directory of Open Access Journals (Sweden)

    Michael W. Hall

    2017-09-01

    Full Text Available Taxonomic markers such as the 16S ribosomal RNA gene are widely used in microbial community analysis. A common first step in marker-gene analysis is grouping genes into clusters to reduce data sets to a more manageable size and potentially mitigate the effects of sequencing error. Instead of clustering based on sequence identity, marker-gene data sets collected over time can be clustered based on temporal correlation to reveal ecologically meaningful associations. We present Ananke, a free and open-source algorithm and software package that complements existing sequence-identity-based clustering approaches by clustering marker-gene data based on time-series profiles and provides interactive visualization of clusters, including highlighting of internal OTU inconsistencies. Ananke is able to cluster distinct temporal patterns from simulations of multiple ecological patterns, such as periodic seasonal dynamics and organism appearances/disappearances. We apply our algorithm to two longitudinal marker gene data sets: faecal communities from the human gut of an individual sampled over one year, and communities from a freshwater lake sampled over eleven years. Within the gut, the segregation of the bacterial community around a food-poisoning event was immediately clear. In the freshwater lake, we found that high sequence identity between marker genes does not guarantee similar temporal dynamics, and Ananke time-series clusters revealed patterns obscured by clustering based on sequence identity or taxonomy. Ananke is free and open-source software available at https://github.com/beiko-lab/ananke.

  14. Cluster Oriented Spatio Temporal Multidimensional Data Visualization of Earthquakes in Indonesia

    Directory of Open Access Journals (Sweden)

    Mohammad Nur Shodiq

    2016-03-01

    Full Text Available Spatio-temporal data clustering is a challenging task. The results of clustering are used to investigate seismic parameters, which describe the characteristics of earthquake behavior. One effective technique for studying multidimensional spatio-temporal data is visualization, but visualizing multidimensional data is a complicated problem because the analysis involves both observed data clusters and seismic parameters. In this paper, we propose a visualization system, called IES (Indonesia Earthquake System), for cluster analysis, spatio-temporal analysis, and visualization of the multidimensional data of seismic parameters. We perform the cluster analysis using automatic clustering, which consists of finding the optimal number of clusters and Hierarchical K-means clustering. We explore the visual clusters and multidimensional data in a low-dimensional space visualization. We experimented with observed data consisting of seismic data around the Indonesian archipelago from 2004 to 2014. Keywords: Clustering, visualization, multidimensional data, seismic parameters.

  15. Temporal variation of aftershocks by means of multifractal characterization of their inter-event time and cluster analysis

    Science.gov (United States)

    Figueroa-Soto, A.; Zuñiga, R.; Marquez-Ramirez, V.; Monterrubio-Velasco, M.

    2017-12-01

    The inter-event time characteristics of seismic aftershock sequences can provide important information to discern stages in the aftershock generation process. In order to investigate whether separate dynamic stages can be identified, (1) we analyzed aftershock series following selected earthquake mainshocks that took place in similar tectonic regimes, selecting two well-defined aftershock sequences from New Zealand and one aftershock sequence from Mexico; (2) we analyzed the fractal behavior of the logarithm of inter-event times (also called waiting times) of aftershocks by means of Hölder's exponent; and (3) we analyzed their magnitude and spatial location based on a methodology proposed by Zaliapin and Ben-Zion [2011], which accounts for the clustering properties of the sequence. In general, more than two coherent process stages can be identified following the main rupture, evidencing a type of "cascade" process which precludes implying a single generalized power law even though the temporal rate and average fractal character appear to be unique (as in a single Omori's p value). We found that aftershock processes indeed show multi-fractal characteristics, which may be related to different stages in the process of diffusion, as seen in the temporal-spatial distribution of aftershocks. Our method provides a way of defining the onset of the return to seismic background activity and the end of the main aftershock sequence.

  16. Cluster analysis for applications

    CERN Document Server

    Anderberg, Michael R

    1973-01-01

    Cluster Analysis for Applications deals with methods and various applications of cluster analysis. Topics covered range from variables and scales to measures of association among variables and among data units. Conceptual problems in cluster analysis are discussed, along with hierarchical and non-hierarchical clustering methods. The necessary elements of data analysis, statistics, cluster analysis, and computer implementation are integrated vertically to cover the complete path from raw data to a finished analysis. Comprised of 10 chapters, this book begins with an introduction to the subject o

  17. Spatial-temporal cluster analysis of mortality from road traffic injuries using geographic information systems in West of Iran during 2009-2014.

    Science.gov (United States)

    Zangeneh, Alireza; Najafi, Farid; Karimi, Saeed; Saeidi, Shahram; Izadi, Neda

    2018-02-08

    Road traffic injuries (RTIs) are considered one of the most important health problems endangering people's lives. The examination of the geographical distribution of RTIs could help policymakers in better planning to reduce RTIs. This study, therefore, aimed to determine the spatial-temporal clustering of mortality from RTIs in the West of Iran. Deaths from RTIs registered in the Forensic Medicine Organization of Kermanshah province over a period of six years (2009-2014) were used. The mortality trend was investigated using negative binomial regression. ArcGIS (version 10.3) was used to investigate the spatial distribution of RTIs. The median age of the 3231 people who died in RTIs was 37 years (IQR = 31), and 78.4% were male. The 6-year average mortality rate from RTIs was 27.8/100,000 deaths, and the average rate had a declining trend. The dispersion of RTIs showed that most deaths occurred in the Kermanshah, Islamabad, Bisotun, and Harsin road axes, respectively. The mean center of all deaths from RTIs occurred in Kermanshah province, in the central area of Kermanshah district. The spatial trend of such deaths has moved to the northeast-southwest, and such deaths were geographically centralized. Results of Moran's I with respect to cluster analysis also indicated positive spatial autocorrelations. The results showed that the mortality rate from RTIs, despite the decline in recent years, is still high when compared with other countries. The clustering of accidents raises the concern that road infrastructure in certain locations may also be a factor. Regarding the results related to the temporal analysis, it is suggested that the enforcement of traffic rules be stricter at rush hours. Copyright © 2018 Elsevier Ltd and Faculty of Forensic and Legal Medicine. All rights reserved.
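
    Moran's I, which the abstract reports as indicating positive spatial autocorrelation, can be computed directly from its standard definition. The sketch below is a generic illustration rather than the ArcGIS workflow used in the study; the function name `morans_i`, the toy values and the chain-shaped neighbour matrix are all assumptions made for the example.

```python
def morans_i(values, weights):
    """Global Moran's I for `values` under a spatial weight matrix.

    `weights[i][j]` is the weight between locations i and j
    (zero on the diagonal). Uses the standard formula
    I = (n / W) * sum_ij w_ij * d_i * d_j / sum_i d_i**2,
    where d_i is the deviation from the mean and W the sum of weights.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(sum(row) for row in weights)
    num = sum(
        weights[i][j] * dev[i] * dev[j]
        for i in range(n)
        for j in range(n)
    )
    den = sum(d * d for d in dev)
    return (n / w_sum) * num / den


# Four locations along a road; each is a neighbour of the next one.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered = morans_i([1, 1, 5, 5], chain)    # similar values adjacent
alternating = morans_i([1, 5, 1, 5], chain)  # dissimilar values adjacent
```

    With these toy inputs the clustered layout gives a positive value (1/3) and the alternating layout gives -1, which matches the usual reading of the statistic: positive values mean that neighbouring locations tend to have similar values.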

  18. A temporal analysis of the spatial clustering of food outlets around schools in Christchurch, New Zealand, 1966 to 2006.

    Science.gov (United States)

    Day, Peter L; Pearce, Jamie R; Pearson, Amber L

    2015-01-01

    To explore changes in urban food environments near schools, as potential contributors to the rising prevalence of overweight and obesity among children. Addresses of food premises and schools in 1966, 1976, 1986, 1996 and 2006 were geo-coded. For each year, the number and proportion of outlets by category (supermarket/grocery; convenience; fast-food outlet) within 800 m of schools were calculated. The degree of spatial clustering of outlets was assessed using a bivariate K-function analysis. Food outlet categories, school level and school social deprivation quintiles were compared. Christchurch, New Zealand. All schools and food outlets at 10-year snapshots from 1966 to 2006. Between 1966 and 2006, the median number of supermarkets/grocery stores within 800 m of schools decreased from 5 to 1, convenience stores decreased from 2 to 1, and fast-food outlets increased from 1 to 4. The ratio of fast-food outlets to total outlets increased from 0·10 to 0·67. The clustering of fast-food outlets was greatest within 800 m of schools and around the most socially deprived schools. Over the 40-year study period, school food environments in Christchurch can be characterized by increased densities of fast-food outlets within walking distance of schools, especially around the most deprived schools. Since the 1960s, there have been substantial changes to the food environments around schools which may increasingly facilitate away-from-home food consumption for children and provide easily accessible, cheap energy-dense foods, a recognized contributor to the rise in prevalence of overweight and obesity among young people.

  19. Mortality in Danish Swine herds: Spatio-temporal clusters and risk factors

    DEFF Research Database (Denmark)

    Lopes Antunes, Ana Carolina; Ersbøll, Annette Kjær; Bihrmann, Kristine

    2017-01-01

    -temporal analysis included data description for spatial, temporal, and spatio-temporal cluster analysis for three age groups: weaners (up to 30 kg), sows and finishers. Logistic regression models were used to assess the potential factors associated with finisher and weaner herds being included within multiple...

  20. Marketing research cluster analysis

    Directory of Open Access Journals (Sweden)

    Marić Nebojša

    2002-01-01

    Full Text Available One area of applications of cluster analysis in marketing is identification of groups of cities and towns with similar demographic profiles. This paper considers main aspects of cluster analysis by an example of clustering 12 cities with the use of Minitab software.

  1. Spatio-temporal clustering of wildfires in Portugal

    Science.gov (United States)

    Costa, R.; Pereira, M. G.; Caramelo, L.; Vega Orozco, C.; Kanevski, M.

    2012-04-01

    Several studies have shown that wildfires in Portugal present high temporal as well as high spatial variability (Pereira et al., 2005, 2011). The identification and characterization of spatio-temporal clusters contributes to a comprehensive characterization of the fire regime and helps improve the efficiency of fire prevention and combat activities. The main goals of this study are: (i) to detect the spatio-temporal clusters of burned area; and (ii) to characterize these clusters along with the role of human and environmental factors. The data were supplied by the National Forest Authority (AFN, 2011) and comprise: (a) the Portuguese Rural Fire Database, PRFD (Pereira et al., 2011), for the 1980-2007 period; and (b) the national mapping of burned areas between 1990 and 2009. In this work, in order to complement the more common cluster analysis algorithms, an alternative approach based on scan statistics and on the permutation model was used. This statistical method allows the detection of local excess events and tests whether such an excess can reasonably have occurred by chance. Results obtained for different simulations performed for different spatial and temporal windows are presented, compared and interpreted. The influence of several fire factors (climate, vegetation type, etc.) is also assessed. Pereira, M.G., Trigo, R.M., DaCamara, C.C., Pereira, J.M.C., Leite, S.M., 2005: "Synoptic patterns associated with large summer forest fires in Portugal". Agricultural and Forest Meteorology, 129, 11-25. Pereira, M. G., Malamud, B. D., Trigo, R. M., and Alves, P. I.: The history and characteristics of the 1980-2005 Portuguese rural fire database, Nat. Hazards Earth Syst. Sci., 11, 3343-3358, doi:10.5194/nhess-11-3343-2011, 2011. AFN, 2011: Autoridade Florestal Nacional (National Forest Authority). Available at http://www.afn.min-agricultura.pt/portal.

  2. CLEAN: CLustering Enrichment ANalysis

    Science.gov (United States)

    Freudenberg, Johannes M; Joshi, Vineet K; Hu, Zhen; Medvedovic, Mario

    2009-01-01

    Background Integration of biological knowledge encoded in various lists of functionally related genes has become one of the most important aspects of analyzing genome-wide functional genomics data. In the context of cluster analysis, functional coherence of clusters established through such analyses have been used to identify biologically meaningful clusters, compare clustering algorithms and identify biological pathways associated with the biological process under investigation. Results We developed a computational framework for analytically and visually integrating knowledge-based functional categories with the cluster analysis of genomics data. The framework is based on the simple, conceptually appealing, and biologically interpretable gene-specific functional coherence score (CLEAN score). The score is derived by correlating the clustering structure as a whole with functional categories of interest. We directly demonstrate that integrating biological knowledge in this way improves the reproducibility of conclusions derived from cluster analysis. The CLEAN score differentiates between the levels of functional coherence for genes within the same cluster based on their membership in enriched functional categories. We show that this aspect results in higher reproducibility across independent datasets and produces more informative genes for distinguishing different sample types than the scores based on the traditional cluster-wide analysis. We also demonstrate the utility of the CLEAN framework in comparing clusterings produced by different algorithms. CLEAN was implemented as an add-on R package and can be downloaded at . The package integrates routines for calculating gene specific functional coherence scores and the open source interactive Java-based viewer Functional TreeView (FTreeView). Conclusion Our results indicate that using the gene-specific functional coherence score improves the reproducibility of the conclusions made about clusters of co

  3. The SMART CLUSTER METHOD - adaptive earthquake cluster analysis and declustering

    Science.gov (United States)

    Schaefer, Andreas; Daniell, James; Wenzel, Friedemann

    2016-04-01

    Earthquake declustering is an essential part of almost any statistical analysis of spatial and temporal properties of seismic activity, with usual applications comprising probabilistic seismic hazard assessments (PSHAs) and earthquake prediction methods. The nature of earthquake clusters and subsequent declustering of earthquake catalogues plays a crucial role in determining the magnitude-dependent earthquake return period and its respective spatial variation. Various methods have been developed by other researchers to address this issue. These range in complexity from rather simple statistical window methods to complex epidemic models. This study introduces the smart cluster method (SCM), a new methodology to identify earthquake clusters, which uses an adaptive point process for spatio-temporal identification. Here, an adaptive search algorithm for data point clusters is adopted. It uses the earthquake density in the spatio-temporal neighbourhood of each event to adjust the search properties. The identified clusters are subsequently analysed to determine directional anisotropy, focussing on a strong correlation along the rupture plane, and the search space is adjusted with respect to directional properties. In the case of rapid subsequent ruptures like the 1992 Landers sequence or the 2010/2011 Darfield-Christchurch events, an adaptive classification procedure is applied to disassemble subsequent ruptures which may have been grouped into an individual cluster using near-field searches, support vector machines and temporal splitting. The steering parameters of the search behaviour are linked to local earthquake properties like magnitude of completeness, earthquake density and Gutenberg-Richter parameters. The method is capable of identifying and classifying earthquake clusters in space and time. It is tested and validated using earthquake data from California and New Zealand. As a result of the cluster identification process, each event in

  4. Multilevel functional clustering analysis.

    Science.gov (United States)

    Serban, Nicoleta; Jiang, Huijing

    2012-09-01

    In this article, we investigate clustering methods for multilevel functional data, which consist of repeated random functions observed for a large number of units (e.g., genes) at multiple subunits (e.g., bacteria types). To describe the within- and between variability induced by the hierarchical structure in the data, we take a multilevel functional principal component analysis (MFPCA) approach. We develop and compare a hard clustering method applied to the scores derived from the MFPCA and a soft clustering method using an MFPCA decomposition. In a simulation study, we assess the estimation accuracy of the clustering membership and the cluster patterns under a series of settings: small versus moderate number of time points; various noise levels; and varying number of subunits per unit. We demonstrate the applicability of the clustering analysis to a real data set consisting of expression profiles from genes activated by immunity system cells. Prevalent response patterns are identified by clustering the expression profiles using our multilevel clustering analysis. © 2012, The International Biometric Society.

  5. Time-clustering investigation of fire temporal fluctuations in Portugal

    Directory of Open Access Journals (Sweden)

    L. Telesca

    2010-04-01

    Full Text Available Temporal clustering structures were identified and quantified in fire sequences recorded from 1980 to 2005 in Continental Portugal, by using the Allan Factor statistics, a statistical tool suited to reveal clustering behaviour in point processes. The obtained results show the presence of daily and annual periodicities, superimposed onto a scaling behaviour, which features the sequence of wildfires as a fractal time process with a rather high degree of time-clusterization of the events.
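
    The Allan Factor used in the record above is commonly defined, for counting windows of length T, as AF(T) = <(N_{k+1} - N_k)^2> / (2 <N_k>), where N_k is the number of events in the k-th window. The sketch below is a minimal illustration of that definition, not the authors' code; the function name, arguments and the choice of non-overlapping windows are assumptions made for the example.

```python
def allan_factor(event_times, window, t_start, t_end):
    """Allan Factor for non-overlapping counting windows of length `window`.

    Hypothetical helper: counts events per window over [t_start, t_end),
    then divides the mean squared difference of successive counts by
    twice the mean count.
    """
    n_windows = int((t_end - t_start) // window)
    if n_windows < 2:
        raise ValueError("need at least two counting windows")

    # Histogram of event counts per window.
    counts = [0] * n_windows
    for t in event_times:
        k = int((t - t_start) // window)
        if 0 <= k < n_windows:
            counts[k] += 1

    mean_count = sum(counts) / n_windows
    if mean_count == 0:
        raise ValueError("no events inside the observation period")

    # Squared differences between successive window counts.
    sq_diffs = [(counts[k + 1] - counts[k]) ** 2 for k in range(n_windows - 1)]
    return (sum(sq_diffs) / len(sq_diffs)) / (2.0 * mean_count)


# Perfectly regular events: one event per unit window.
regular = [k + 0.5 for k in range(10)]
# Bursty events: ten events in every second unit window.
bursty = [2 * j + 0.1 * i for j in range(5) for i in range(10)]
```

    For a homogeneous Poisson process the Allan Factor stays close to 1 at all window lengths, so values that grow with T, as a power law in the fractal case, indicate time-clustering. Evaluating the function over a range of window lengths gives the curve from which a scaling exponent could be estimated.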

  6. Developing a Spatial-Temporal Contextual and Semantic Trajectory Clustering Framework

    OpenAIRE

    Portugal, Ivens; Alencar, Paulo; Cowan, Donald

    2017-01-01

    This paper reports on ongoing research investigating more expressive approaches to spatial-temporal trajectory clustering. Spatial-temporal data is increasingly becoming universal as a result of widespread use of GPS and mobile devices, which makes mining and predictive analyses based on trajectories a critical activity in many domains. Trajectory analysis methods based on clustering techniques often rely heavily on a similarity definition to properly provide insights. However, although traje...

  7. Interaction analysis through fuzzy temporal logic

    NARCIS (Netherlands)

    Ijsselmuiden, Joris; Dornheim, Johannes

    2015-01-01

    Interaction analysis is defined as the generation of semantic descriptions from machine perception. This can be achieved through a combination of fuzzy metric temporal logic (FMTL) and situation graph trees (SGTs). We extended the FMTL/SGT framework with modules for clustering and parameter

  8. Comprehensive cluster analysis with Transitivity Clustering.

    Science.gov (United States)

    Wittkop, Tobias; Emig, Dorothea; Truss, Anke; Albrecht, Mario; Böcker, Sebastian; Baumbach, Jan

    2011-03-01

    Transitivity Clustering is a method for the partitioning of biological data into groups of similar objects, such as genes, for instance. It provides integrated access to various functions addressing each step of a typical cluster analysis. To facilitate this, Transitivity Clustering is accessible online and offers three user-friendly interfaces: a powerful stand-alone version, a web interface, and a collection of Cytoscape plug-ins. In this paper, we describe three major workflows: (i) protein (super)family detection with Cytoscape, (ii) protein homology detection with incomplete gold standards and (iii) clustering of gene expression data. This protocol guides the user through the most important features of Transitivity Clustering and takes ∼1 h to complete.

  9. Spatio-temporal cluster analysis of the incidence of Campylobacter cases and patients with general diarrhea in a Danish county, 1995–2004

    Directory of Open Access Journals (Sweden)

    Simonsen Jacob

    2009-02-01

    Full Text Available Abstract Campylobacter infections are the main cause of bacterial gastroenteritis in Denmark. While primarily foodborne, Campylobacter infections are also to some degree acquired through other sources which may include contact with animals or the environment, locally contaminated drinking water and more. We analyzed Campylobacter cases for clustering in space and time for the large Danish island of Funen in the period 1995–2003, under the assumption that infections caused by 'environmental' factors may show persistent clustering while foodborne infections will occur randomly in space. Input data were geo-coded datasets of the addresses of laboratory-confirmed Campylobacter cases and of the background population of Funen County. The dataset had a spatial extent of 4,900 km2. Data were aggregated into units of analysis (so-called features) of 5 km by 5 km times 1 year, and the Campylobacter incidence was calculated. We used a modified form of local Moran's I to test if features with similar incidence rates occurred next to each other in space and time, and compared the observed clusters with simulated clusters. Because clusters may be caused by a high tendency among local GPs to submit stool samples, we also analyzed a dataset of all submitted stool samples for comparison. The results showed a significant persisting clustering of Campylobacter incidence rates in the Western part of Funen. Results were visualized using the Netlogo software. The underlying causes of the observed clustering are not known and will require further examination, but may be partially explained by an increased rate of stool sample submissions by physicians in the area. We hope, by this approach, to have developed a tool which will allow for analyses of geographical clusters which may in turn form a basis for further epidemiological examinations to cast light on the sources of infection.
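
    The record above relies on a modified form of local Moran's I that is not specified in the abstract. As a rough illustration only, the sketch below computes the standard (unmodified) local Moran's I on a grid of space-time features, I_i = (z_i / m2) * sum_j w_ij z_j, with row-standardised weights. The neighbourhood rule (cells sharing a face in x, y or t) and all function names are assumptions made for this example.

```python
def grid_neighbours(nx, ny, nt):
    """Neighbour lists for an nx-by-ny-by-nt grid of space-time cells.

    Assumption for this sketch: two cells are neighbours when they share
    a face in x, y or t. Cells are indexed as (t * ny + y) * nx + x.
    """
    def index(x, y, t):
        return (t * ny + y) * nx + x

    offsets = ((-1, 0, 0), (1, 0, 0), (0, -1, 0), (0, 1, 0), (0, 0, -1), (0, 0, 1))
    neighbours = [[] for _ in range(nx * ny * nt)]
    for t in range(nt):
        for y in range(ny):
            for x in range(nx):
                for dx, dy, dt in offsets:
                    x2, y2, t2 = x + dx, y + dy, t + dt
                    if 0 <= x2 < nx and 0 <= y2 < ny and 0 <= t2 < nt:
                        neighbours[index(x, y, t)].append(index(x2, y2, t2))
    return neighbours


def local_morans_i(values, neighbours):
    """Standard local Moran's I with row-standardised weights."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]          # deviations from the mean
    m2 = sum(d * d for d in z) / n          # second moment of the deviations
    if m2 == 0:
        raise ValueError("all values are identical")

    result = []
    for i in range(n):
        nb = neighbours[i]
        # Spatial lag: average deviation over the neighbours of cell i.
        lag = sum(z[j] for j in nb) / len(nb) if nb else 0.0
        result.append(z[i] * lag / m2)
    return result
```

    Positive values mark features whose incidence resembles that of their neighbours (high next to high, or low next to low). Judging significance would still require a reference distribution, for example from simulated data as the abstract describes.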

  10. Relation chain based clustering analysis

    Science.gov (United States)

    Zhang, Cheng-ning; Zhao, Ming-yang; Luo, Hai-bo

    2011-08-01

    Clustering analysis is currently one of the well-developed branches of data mining technology, which aims to find hidden structures in a multidimensional space called the feature or pattern space. A datum in the space usually takes a vector form, and the elements of the vector represent several specifically selected features. These features are often chosen for their relevance to the problem at hand. Generally, clustering analysis falls into two divisions: one based on agglomerative clustering methods, and the other based on divisive clustering methods. The former refers to a bottom-up process which regards each datum as a singleton cluster, while the latter refers to a top-down process which regards the entire data set as a single cluster. According to the collected literature, divisive clustering currently dominates in both application and research. Although some famous divisive clustering methods have been designed and well developed, clustering problems are still far from being solved. The k-means algorithm is the original divisive clustering method; it requires some important values to be assigned initially, such as the number of clusters and the initial positions of the cluster prototypes, and these choices may not be reasonable in certain cases. Beyond the initialization problem, the k-means algorithm may also fall into a local optimum, clusters in a rigid way, and is not suitable for non-Gaussian distributions. One can see that the search for a good or natural clustering result in fact originates from one's understanding of the concept of clustering. Thus, confusion or misunderstanding about the definition of clustering often leads to unsatisfactory clustering results. One should consider the definition deeply and seriously. This paper demonstrates the nature of clustering, gives a way of understanding clustering, discusses the methodology of designing a clustering algorithm, and proposes a new clustering method based on relation chains among 2D patterns. In

  11. A Web-Based Interactive Platform for Co-Clustering Spatio-Temporal Data

    Science.gov (United States)

    Wu, X.; Poorthuis, A.; Zurita-Milla, R.; Kraak, M.-J.

    2017-09-01

    Since current studies on clustering analysis mainly focus on exploring spatial or temporal patterns separately, a co-clustering algorithm is utilized in this study to enable the concurrent analysis of spatio-temporal patterns. To allow users to adopt and adapt the algorithm for their own analysis, it is integrated within the server side of an interactive web-based platform. The client side of the platform, running within any modern browser, is a graphical user interface (GUI) with multiple linked visualizations that facilitates the understanding, exploration and interpretation of the raw dataset and co-clustering results. Users can also upload their own datasets and adjust clustering parameters within the platform. To illustrate the use of this platform, an annual temperature dataset from 28 weather stations over 20 years in the Netherlands is used. After the dataset is loaded, it is visualized in a set of linked visualizations: a geographical map, a timeline and a heatmap. This aids the user in understanding the nature of their dataset and the appropriate selection of co-clustering parameters. Once the dataset is processed by the co-clustering algorithm, the results are visualized in the small multiples, a heatmap and a timeline to provide various views for better understanding and also further interpretation. Since the visualization and analysis are integrated in a seamless platform, the user can explore different sets of co-clustering parameters and instantly view the results in order to do iterative, exploratory data analysis. As such, this interactive web-based platform allows users to analyze spatio-temporal data using the co-clustering method and also helps the understanding of the results using multiple linked visualizations.

  12. Remodularization Analysis Using Semantic Clustering

    OpenAIRE

    Santos, Gustavo; Tulio Valente, Marco; Anquetil, Nicolas

    2014-01-01

    International audience; In this paper, we report an experience on using and adapting Semantic Clustering to evaluate software remodularizations. Semantic Clustering is an approach that relies on information retrieval and clustering techniques to extract sets of similar classes in a system, according to their vocabularies. We adapted Semantic Clustering to support remodularization analysis. We evaluate our adaptation using six real-world remodularizations of four software systems. We report th...

  13. Temporal clustering of tropical cyclones and its ecosystem impacts.

    Science.gov (United States)

    Mumby, Peter J; Vitolo, Renato; Stephenson, David B

    2011-10-25

    Tropical cyclones have massive economic, social, and ecological impacts, and models of their occurrence influence many planning activities from setting insurance premiums to conservation planning. Most impact models allow for geographically varying cyclone rates but assume that individual storm events occur randomly with constant rate in time. This study analyzes the statistical properties of Atlantic tropical cyclones and shows that local cyclone counts vary in time, with periods of elevated activity followed by relative quiescence. Such temporal clustering is particularly strong in the Caribbean Sea, along the coasts of Belize, Honduras, Costa Rica, Jamaica, the southwest of Haiti, and in the main hurricane development region in the North Atlantic between Africa and the Caribbean. Failing to recognize this natural nonstationarity in cyclone rates can give inaccurate impact predictions. We demonstrate this by exploring cyclone impacts on coral reefs. For a given cyclone rate, we find that clustered events have a less detrimental impact than independent random events. Predictions using a standard random hurricane model were overly pessimistic, predicting reef degradation more than a decade earlier than that expected under clustered disturbance. The presence of clustering allows coral reefs more time to recover to healthier states, but the impacts of clustering will vary from one ecosystem to another.

  14. Integrative cluster analysis in bioinformatics

    CERN Document Server

    Abu-Jamous, Basel; Nandi, Asoke K

    2015-01-01

    Clustering techniques are increasingly being put to use in the analysis of high-throughput biological datasets. Novel computational techniques to analyse high throughput data in the form of sequences, gene and protein expressions, pathways, and images are becoming vital for understanding diseases and future drug discovery. This book details the complete pathway of cluster analysis, from the basics of molecular biology to the generation of biological knowledge. The book also presents the latest clustering methods and clustering validation, thereby offering the reader a comprehensive review o

  15. The smart cluster method. Adaptive earthquake cluster identification and analysis in strong seismic regions

    Science.gov (United States)

    Schaefer, Andreas M.; Daniell, James E.; Wenzel, Friedemann

    2017-07-01

    Earthquake clustering is an essential part of almost any statistical analysis of spatial and temporal properties of seismic activity. The nature of earthquake clusters and subsequent declustering of earthquake catalogues plays a crucial role in determining the magnitude-dependent earthquake return period and its respective spatial variation for probabilistic seismic hazard assessment. This study introduces the Smart Cluster Method (SCM), a new methodology to identify earthquake clusters, which uses an adaptive point process for spatio-temporal cluster identification. It utilises the magnitude-dependent spatio-temporal earthquake density to adjust the search properties, subsequently analyses the identified clusters to determine directional variation and adjusts its search space with respect to directional properties. In the case of rapid subsequent ruptures like the 1992 Landers sequence or the 2010-2011 Darfield-Christchurch sequence, a reclassification procedure is applied to disassemble subsequent ruptures using near-field searches, nearest neighbour classification and temporal splitting. The method is capable of identifying and classifying earthquake clusters in space and time. It has been tested and validated using earthquake data from California and New Zealand. A total of more than 1500 clusters have been found in both regions since 1980 with M_min = 2.0. Utilising the knowledge of cluster classification, the method has been adjusted to provide an earthquake declustering algorithm, which has been compared to existing methods. Its performance is comparable to established methodologies. The analysis of earthquake clustering statistics led to various new and updated correlation functions, e.g. for ratios between mainshock and strongest aftershock and general aftershock activity metrics.

  16. Cluster analysis of activity-time series in motor learning

    DEFF Research Database (Denmark)

    Balslev, Daniela; Nielsen, Finn Å; Futiger, Sally A

    2002-01-01

    Neuroimaging studies of learning focus on brain areas where the activity changes as a function of time. To circumvent the difficult problem of model selection, we used a data-driven analytic tool, cluster analysis, which extracts representative temporal and spatial patterns from the voxel......-time series. The optimal number of clusters was chosen using a cross-validated likelihood method, which highlights the clustering pattern that generalizes best over the subjects. Data were acquired with PET at different time points during practice of a visuomotor task. The results from cluster analysis show...

  17. Spatial and Temporal Clustering in a Simple Earthquake Asperity Model

    Science.gov (United States)

    Tiampo, K. F.; Kazemian, J.; Dominguez, R.; Klein, W.

    2016-12-01

    Natural earthquake fault systems are highly heterogeneous in space, the result of inhomogeneities that are a function of the variety of materials of different strengths. However, despite their inhomogeneous nature, real faults are often modeled as spatially homogeneous systems. Here we present a simple earthquake fault model based on the Olami-Feder-Christensen (OFC) and Rundle-Jackson-Brown (RJB) cellular automata models with long-range interactions that incorporates asperities, or stronger sites, into the lattice (Rundle and Jackson, 1977; Olami et al., 1992). These asperity cells are significantly stronger than the surrounding lattice sites but eventually rupture when the applied stress reaches their higher threshold stress. The introduction of these spatial heterogeneities results in spatial and temporal clustering in the model similar to that seen in natural fault systems. We observe sequences of activity that begin with a gradually accelerating number of larger events, or foreshocks, prior to a large event, followed by a tail of decreasing activity, or aftershocks. These recurrent large events occur at regular intervals and the characteristic time between events and their magnitude are a function of the stress dissipation parameter. The relative length of the foreshock to aftershock sequence depends on the amount of stress dissipation in the system. This work provides further evidence that the spatial and temporal patterns observed in natural seismicity are strongly influenced by the underlying physical properties and are not solely the result of a simple cascade mechanism. We find that the scaling depends not only on the amount of damage, but also on the spatial distribution of that damage (Dominguez et al., 2011; Kazemian et al., 2014). Here we compare the modeled sequences to those of natural earthquake sequences from California and around the world in order to investigate the interplay between cascade dynamics and spatial structure.
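
    To make the model class in the record above more concrete, the sketch below runs a small Olami-Feder-Christensen style cellular automaton in which a fraction of cells (asperities) have a higher failure threshold. It is a simplified illustration rather than the authors' model: it uses nearest-neighbour stress transfer on an open-boundary lattice instead of long-range interactions, and the parameter names and default values (alpha, asperity_fraction, asperity_threshold) are assumptions.

```python
import random


def run_ofc(size, n_events, alpha=0.1, asperity_fraction=0.05,
            asperity_threshold=5.0, seed=0):
    """Simplified OFC-type automaton with asperity cells.

    Returns (event_sizes, stress, threshold). `alpha` is the fraction of
    stress dissipated when a cell fails; the rest is shared equally among
    the four nearest neighbours (stress leaving the lattice is lost).
    """
    rng = random.Random(seed)
    n = size * size
    # Ordinary cells fail at 1.0; asperities fail at a higher threshold.
    threshold = [asperity_threshold if rng.random() < asperity_fraction else 1.0
                 for _ in range(n)]
    stress = [rng.random() * threshold[i] for i in range(n)]
    event_sizes = []

    for _ in range(n_events):
        # Uniform loading until the cell closest to failure reaches threshold.
        first = min(range(n), key=lambda i: threshold[i] - stress[i])
        gap = threshold[first] - stress[first]
        stress = [s + gap for s in stress]
        stress[first] = threshold[first]    # guard against rounding error
        unstable = [i for i in range(n) if stress[i] >= threshold[i]]

        failures = 0
        while unstable:
            i = unstable.pop()
            if stress[i] < threshold[i]:
                continue                    # already relaxed earlier
            share = (1.0 - alpha) * stress[i] / 4.0
            stress[i] = 0.0
            failures += 1
            r, c = divmod(i, size)
            for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                if 0 <= rr < size and 0 <= cc < size:
                    j = rr * size + cc
                    stress[j] += share
                    if stress[j] >= threshold[j]:
                        unstable.append(j)
        event_sizes.append(failures)

    return event_sizes, stress, threshold
```

    Calling `run_ofc(20, 1000)` yields a sequence of event sizes whose temporal ordering around the largest events could be inspected for foreshock-like and aftershock-like activity. Whether such patterns appear in this nearest-neighbour simplification is not something the sketch guarantees.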

  18. Clustering Vehicle Temporal and Spatial Travel Behavior Using License Plate Recognition Data

    Directory of Open Access Journals (Sweden)

    Huiyu Chen

    2017-01-01

    Full Text Available Understanding vehicle travel patterns can support the planning and design of better services. In addition, vehicle clustering can improve management efficiency through more targeted access to groups of interest and facilitate planning by more specific survey design. This paper clustered 854,712 vehicles in a week using the K-means clustering algorithm based on license plate recognition (LPR) data obtained in Shenzhen, China. Firstly, several travel characteristics related to temporal and spatial variability and activity patterns are used to identify homogeneous clusters. Then, the Davies-Bouldin index (DBI) and Silhouette Coefficient (SC) are applied to determine the optimal number of groups and, consequently, six groups are identified on weekdays and three groups on weekends, including commuting vehicles and some other occasional leisure travel vehicles. Moreover, a detailed analysis of the characteristics of each group in terms of spatial travel patterns and temporal changes is presented. This study highlights the possibility of applying LPR data for discovering the underlying factors in vehicle travel patterns and examining the characteristics of some groups specifically.
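
    The two validity indices named in the record above can be computed directly from a set of feature vectors and their cluster labels. The sketch below is a plain-Python illustration using Euclidean distance; it is not the paper's code, and the function names and toy data are assumptions. A lower Davies-Bouldin index and a higher mean silhouette both indicate more compact, better separated clusters.

```python
from math import dist


def davies_bouldin(points, labels):
    """Davies-Bouldin index: mean over clusters of the worst-case ratio
    (scatter_a + scatter_b) / distance(centroid_a, centroid_b)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")

    centroids = {lab: tuple(sum(c) / len(m) for c in zip(*m))
                 for lab, m in clusters.items()}
    # Scatter: mean distance of cluster members to their centroid.
    scatter = {lab: sum(dist(p, centroids[lab]) for p in m) / len(m)
               for lab, m in clusters.items()}

    total = 0.0
    for a in clusters:
        total += max((scatter[a] + scatter[b]) / dist(centroids[a], centroids[b])
                     for b in clusters if b != a)
    return total / len(clusters)


def mean_silhouette(points, labels):
    """Mean silhouette coefficient (b - a) / max(a, b) over all points."""
    scores = []
    for i, (p, lab) in enumerate(zip(points, labels)):
        by_cluster = {}
        for j, (q, other) in enumerate(zip(points, labels)):
            if i != j:
                by_cluster.setdefault(other, []).append(dist(p, q))
        own = by_cluster.pop(lab, None)
        if not own or not by_cluster:
            scores.append(0.0)              # singleton or single-cluster case
            continue
        a = sum(own) / len(own)             # mean distance within own cluster
        b = min(sum(d) / len(d) for d in by_cluster.values())  # nearest other
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


# Toy data: two tight pairs far apart, with a sensible and a poor labelling.
toy_points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good_labels = [0, 0, 1, 1]
poor_labels = [0, 1, 0, 1]
```

    To choose the number of groups, one would run k-means for each candidate k, score each resulting labelling with both functions, and prefer the k with a low Davies-Bouldin value and a high silhouette value.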

  19. Clustering of cases of type 1 diabetes in high socioeconomic communes in Santiago de Chile: spatio-temporal and geographical analysis.

    Science.gov (United States)

    Torres-Avilés, Francisco; Carrasco, Elena; Icaza, Gloria; Pérez-Bravo, Francisco

    2010-09-01

    The objective of this study was to describe spatial and space-time patterns of type 1 diabetes in children less than 15 years old, diagnosed between 2000 and 2005 with residence in the Metropolitan Region of Chile. Knox and Mantel tests were used to detect space-time interaction between cases. An ecological Bayesian model adjusted by socioeconomic factor and year was proposed to estimate the incidence by communes. Initially, there was no space-time interaction between cases, but there was evidence of a clustering effect in urban areas of the region. The incidence rate for the overall study period was estimated at 6.18/100,000 (95% CI: 5.69-6.70), with a significant annual trend of 8.2% (P diabetes. Our findings support the hypothesis of an aetiological role of environmental factors in the onset of type 1 diabetes.
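
    The Knox test mentioned in the record above counts pairs of cases that are close in both space and time, and compares that count with what would be expected if onset times were unrelated to locations. The sketch below is a minimal Monte Carlo version under stated assumptions: Euclidean distance on planar coordinates, user-chosen critical distances, and a permutation of the times to build the reference distribution. The function names and cut-off arguments are illustrative and do not come from the paper.

```python
import random
from itertools import combinations
from math import dist


def knox_statistic(locations, times, space_cut, time_cut):
    """Number of case pairs within `space_cut` in space and `time_cut` in time."""
    return sum(
        1
        for i, j in combinations(range(len(times)), 2)
        if dist(locations[i], locations[j]) <= space_cut
        and abs(times[i] - times[j]) <= time_cut
    )


def knox_test(locations, times, space_cut, time_cut, n_perm=999, seed=0):
    """Observed Knox statistic and a one-sided Monte Carlo p-value.

    The reference distribution is built by shuffling the times across the
    fixed locations, which breaks any space-time interaction.
    """
    rng = random.Random(seed)
    observed = knox_statistic(locations, times, space_cut, time_cut)
    shuffled = list(times)
    at_least_as_large = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if knox_statistic(locations, shuffled, space_cut, time_cut) >= observed:
            at_least_as_large += 1
    p_value = (at_least_as_large + 1) / (n_perm + 1)
    return observed, p_value


# Toy data: two spatial groups whose cases also occur close together in time.
toy_locations = [(0, 0), (0, 1), (1, 0), (50, 50), (50, 51), (51, 50)]
toy_times = [1, 2, 3, 100, 101, 102]
```

    The pairwise count grows quadratically with the number of cases, so this form suits small case series. The result also depends on the chosen cut-offs, which is one reason the Knox test is often reported alongside a Mantel test, as in the record above.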

  20. Spatio-temporal analysis of Salmonella surveillance data in Thailand

    DEFF Research Database (Denmark)

    Coutinho Calado Domingues, Ana Rita; Vieira, Antonio; Hendriksen, Rene S.

    2014-01-01

    isolates from Thailand was analysed. Data was grouped into human and non-human categories and the analysis was performed for the top five occurring serovars for each year of the study period. A total 91 human and 39 non-human significant spatio-temporal clusters were observed, accounting for 11% and 16...

  1. Parallel Multivariate Spatio-Temporal Clustering of Large Ecological Datasets on Hybrid Supercomputers

    Energy Technology Data Exchange (ETDEWEB)

    Sreepathi, Sarat [ORNL; Kumar, Jitendra [ORNL; Mills, Richard T. [Argonne National Laboratory; Hoffman, Forrest M. [ORNL; Sripathi, Vamsi [Intel Corporation; Hargrove, William Walter [United States Department of Agriculture (USDA), United States Forest Service (USFS)

    2017-09-01

    A proliferation of data from vast networks of remote sensing platforms (satellites, unmanned aircraft systems (UAS), airborne etc.), observational facilities (meteorological, eddy covariance etc.), state-of-the-art sensors, and simulation models offers unprecedented opportunities for scientific discovery. Unsupervised classification is a widely applied data mining approach to derive insights from such data. However, classification of very large data sets is a complex computational problem that requires efficient numerical algorithms and implementations on high performance computing (HPC) platforms. Additionally, increasing power, space, cooling and efficiency requirements have led to the deployment of hybrid supercomputing platforms with complex architectures and memory hierarchies like the Titan system at Oak Ridge National Laboratory. The advent of such accelerated computing architectures offers new challenges and opportunities for big data analytics in general and specifically, large scale cluster analysis in our case. Although there is an existing body of work on parallel cluster analysis, those approaches do not fully meet the needs imposed by the nature and size of our large data sets. Moreover, they have scaling limitations and are mostly limited to traditional distributed memory computing platforms. We present a parallel Multivariate Spatio-Temporal Clustering (MSTC) technique based on k-means cluster analysis that can target hybrid supercomputers like Titan. We developed a hybrid MPI, CUDA and OpenACC implementation that can utilize both CPU and GPU resources on computational nodes. We describe performance results on Titan that demonstrate the scalability and efficacy of our approach in processing large ecological data sets.

  2. Big Data GPU-Driven Parallel Processing Spatial and Spatio-Temporal Clustering Algorithms

    Science.gov (United States)

    Konstantaras, Antonios; Skounakis, Emmanouil; Kilty, James-Alexander; Frantzeskakis, Theofanis; Maravelakis, Emmanuel

    2016-04-01

    Advances in graphics processing units' technology towards encompassing parallel architectures [1], comprised of thousands of cores and multiples of parallel threads, provide the foundation in terms of hardware for the rapid processing of various parallel applications regarding seismic big data analysis. Seismic data are normally stored as collections of vectors in massive matrices, growing rapidly in size as wider areas are covered, denser recording networks are being established and decades of data are being compiled together [2]. Yet, many processes regarding seismic data analysis are performed on each seismic event independently or as distinct tiles [3] of specific grouped seismic events within a much larger data set. Such processes, being independent of one another, can be performed in parallel, narrowing down processing times drastically [1,3]. This research work presents the development and implementation of three parallel processing algorithms using Cuda C [4] for the investigation of potentially distinct seismic regions [5,6] present in the vicinity of the southern Hellenic seismic arc. The algorithms, programmed and executed in parallel comparatively, are: fuzzy k-means clustering with expert knowledge [7] in assigning overall clusters' number; density-based clustering [8]; and a self-developed spatio-temporal clustering algorithm encompassing expert [9] and empirical knowledge [10] for the specific area under investigation. Indexing terms: GPU parallel programming, Cuda C, heterogeneous processing, distinct seismic regions, parallel clustering algorithms, spatio-temporal clustering References [1] Kirk, D. and Hwu, W.: 'Programming massively parallel processors - A hands-on approach', 2nd Edition, Morgan Kaufmann Publishers, 2013 [2] Konstantaras, A., Valianatos, F., Varley, M.R. and Makris, J.P.: 'Soft-Computing Modelling of Seismicity in the Southern Hellenic Arc', Geoscience and Remote Sensing Letters, vol. 5 (3), pp. 323-327, 2008 [3] Papadakis, S. and

  3. Time series analysis of temporal networks

    Science.gov (United States)

    Sikdar, Sandipan; Ganguly, Niloy; Mukherjee, Animesh

    2016-01-01

    A common but important feature of all real-world networks is that they are temporal in nature, i.e., the network structure changes over time. Due to this dynamic nature, it becomes difficult to propose suitable growth models that can explain the various important characteristic properties of these networks. In fact, in many application-oriented studies, knowing only these properties is sufficient. For instance, if one wishes to launch a targeted attack on a network, this can be done even without knowledge of the full network structure; an estimate of some of the properties is sufficient to launch the attack. In this paper, we show that even if the network structure at a future time point is not available, one can still estimate its properties. We propose a novel method to map a temporal network to a set of time series instances, analyze them and, using a standard time series forecast model, predict the properties of the temporal network at a later time instance. To this end, we consider eight properties, such as number of active nodes, average degree and clustering coefficient, and apply our prediction framework to them. We mainly focus on the temporal network of human face-to-face contacts and observe that it represents a stochastic process with memory that can be modeled as Auto-Regressive-Integrated-Moving-Average (ARIMA). We use cross-validation techniques to find the percentage accuracy of our predictions. An important observation is that the frequency domain properties of the time series obtained from spectrogram analysis can be used to refine the prediction framework by identifying beforehand the cases where the prediction error is likely to be high. This leads to an improvement of 7.96% (for error level ≤20%) in prediction accuracy on average across all datasets. As an application, we show how such a prediction scheme can be used to launch targeted attacks on temporal networks. Contribution to the Topical Issue
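
    The mapping from a temporal network to per-property time series, followed by a one-step forecast, might be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the snapshot format (one list of undirected, non-repeated edges per time step) and all function names are hypothetical, only two of the eight properties are computed, and a least-squares AR(1) fit stands in for the ARIMA model named in the abstract.

```python
def snapshot_properties(edges):
    """Return (active_nodes, average_degree) for one snapshot of (u, v) edges."""
    nodes = set()
    for u, v in edges:
        nodes.add(u)
        nodes.add(v)
    n = len(nodes)
    # Each undirected edge contributes 2 to the total degree.
    avg_degree = 2 * len(edges) / n if n else 0.0
    return n, avg_degree


def to_time_series(snapshots):
    """Map a temporal network (list of snapshots) to one series per property."""
    props = [snapshot_properties(s) for s in snapshots]
    return {
        "active_nodes": [p[0] for p in props],
        "average_degree": [p[1] for p in props],
    }


def ar1_forecast(series):
    """One-step forecast from x[t] = a + b * x[t-1], fitted by least squares.

    Assumption: AR(1) is a simplified stand-in for ARIMA.
    Requires at least two observations.
    """
    x, y = series[:-1], series[1:]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:
        return float(series[-1])  # no variation to fit: repeat the last value
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    a = my - b * mx
    return a + b * series[-1]


# Hypothetical toy network that gains one edge per time step.
snapshots = [[(1, 2)], [(1, 2), (2, 3)], [(1, 2), (2, 3), (3, 4)]]
series = to_time_series(snapshots)
next_active = ar1_forecast(series["active_nodes"])
```

    A full ARIMA fit would additionally handle differencing and moving-average terms; the sketch only shows how property series arise from snapshots and feed a forecaster.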

  4. Detecting spatial-temporal clusters of HFMD from 2007 to 2011 in Shandong Province, China.

    Directory of Open Access Journals (Sweden)

    Yunxia Liu

    BACKGROUND: Hand, foot, and mouth disease (HFMD) has caused major public health concerns worldwide, and has become one of the leading causes of child deaths. China is the most serious epidemic area, with a total of 3,419,149 reported cases from 2008 to 2010 alone, and its different geographic areas might have different spatial epidemiology characteristics at different spatial-temporal scale levels. We conducted spatial and spatial-temporal epidemiological analysis of HFMD at county level in Shandong Province, China. METHODS: Based on the China National Disease Surveillance Reporting and Management System, the spatial-temporal database of HFMD from 2007 to 2011 was built. The global autocorrelation statistic (Moran's I) was first used to detect the spatial autocorrelation of HFMD cases in each year. Purely spatial scan statistics combined with the space-time scan statistic were used to detect epidemic clusters. RESULTS: The annual average incidence rate was 93.70 per 100,000 in Shandong Province. Most HFMD cases (93.94%) were aged 0-5 years, with an average male-to-female sex ratio of 1.71, and the seasonal incidence peak was between April and July. The dominant pathogens were EV71 (47.35%) and CoxA16 (26.59%). HFMD had positive spatial autocorrelation at the medium spatial scale level (county level), with Moran's I ranging from 0.31 to 0.62 (P<0.001). Seven spatial-temporal clusters were detected from 2007 to 2011 across the whole of Shandong, with EV71 or CoxA16 as the dominant pathogen for most hotspot areas. CONCLUSIONS: The spatial-temporal clusters of HFMD wandered around the whole of Shandong Province during 2007 to 2011, with EV71 or CoxA16 as the dominant pathogen. These findings suggest that a real-time spatial-temporal surveillance system should be established to identify high-incidence regions and conduct timely prevention of HFMD.
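
    For readers unfamiliar with the global Moran's I statistic used in the METHODS above, a minimal sketch follows. It uses the standard definition I = (n / W) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2; the function name, the toy incidence values and the chain-shaped adjacency matrix are illustrative assumptions, not data from the study.

```python
def morans_i(values, weights):
    """Global Moran's I for values x_i and a spatial weight matrix w_ij.

    weights[i][j] > 0 marks units i and j as neighbours (diagonal is 0).
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(sum(row) for row in weights)
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / w_sum) * (num / den)


# Hypothetical example: four counties in a row, each adjacent to the next.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
smooth_gradient = morans_i([1, 2, 3, 4], chain)   # positive autocorrelation
checkerboard = morans_i([1, 0, 1, 0], chain)      # negative autocorrelation
```

    Values near +1 indicate that similar incidence rates sit next to each other, values near -1 indicate alternation, and values near the null expectation of -1/(n-1) indicate no spatial structure. Significance (the reported P<0.001) is usually assessed by permutation or a normal approximation, which the sketch omits.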

  5. Temporal Clustering and Sequencing in Short-Term Memory and Episodic Memory

    Science.gov (United States)

    Farrell, Simon

    2012-01-01

    A model of short-term memory and episodic memory is presented, with the core assumptions that (a) people parse their continuous experience into episodic clusters and (b) items are clustered together in memory as episodes by binding information within an episode to a common temporal context. Along with the additional assumption that information…

  6. Temporal Delineation and Quantification of Short Term Clustered Mining Seismicity

    Science.gov (United States)

    Woodward, Kyle; Wesseloo, Johan; Potvin, Yves

    2017-07-01

    The assessment of the temporal characteristics of seismicity is fundamental to understanding and quantifying the seismic hazard associated with mining, the effectiveness of strategies and tactics used to manage seismic hazard, and the relationship between seismicity and changes to the mining environment. This article aims to improve the accuracy and precision with which the temporal dimension of seismic responses can be quantified and delineated. We present a review and discussion of the occurrence of time-dependent mining seismicity, with a specific focus on temporal modelling and the modified Omori law (MOL). This forms the basis for the development of a simple weighted metric that allows for the consistent temporal delineation and quantification of a seismic response. The optimisation of this metric allows for the selection of the most appropriate modelling interval given the temporal attributes of time-dependent mining seismicity. We evaluate the performance of the weighted metric for the modelling of a synthetic seismic dataset. This assessment shows that seismic responses can be quantified and delineated by the MOL, with reasonable accuracy and precision, when the modelling is optimised by evaluating the weighted MLE metric. Furthermore, this assessment highlights that decreased weighted MLE metric performance can be expected if there is a lack of contrast between the temporal characteristics of events associated with different processes.
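
    The modified Omori law referred to above is commonly written as n(t) = K / (t + c)^p, the event rate at time t after the triggering event. The sketch below evaluates that rate, its integral over a modelling interval, and the standard point-process log-likelihood that a maximum likelihood fit would maximise. The function names are hypothetical, and the weighted MLE metric developed in the article is not reproduced here.

```python
import math


def mol_rate(t, K, c, p):
    """Modified Omori law event rate n(t) = K / (t + c)**p."""
    return K / (t + c) ** p


def mol_expected_count(t_start, t_end, K, c, p):
    """Expected number of events in [t_start, t_end] (integral of the rate)."""
    if abs(p - 1.0) < 1e-12:
        return K * math.log((t_end + c) / (t_start + c))
    return K * ((t_start + c) ** (1 - p) - (t_end + c) ** (1 - p)) / (p - 1)


def mol_log_likelihood(event_times, t_end, K, c, p):
    """Log-likelihood of event times observed on [0, t_end] under the MOL.

    Standard inhomogeneous Poisson form: sum of log rates at the events
    minus the expected count over the whole interval.
    """
    log_rates = sum(math.log(mol_rate(t, K, c, p)) for t in event_times)
    return log_rates - mol_expected_count(0.0, t_end, K, c, p)
```

    Comparing `mol_log_likelihood` across candidate modelling intervals or parameter sets is one simple way to see how an interval choice affects the fit, which is the kind of decision the article's metric is designed to make consistently.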

  7. Clustering Vehicle Temporal and Spatial Travel Behavior Using License Plate Recognition Data

    OpenAIRE

    Huiyu Chen; Chao Yang; Xiangdong Xu

    2017-01-01

    Understanding the travel patterns of vehicles can support the planning and design of better services. In addition, vehicle clustering can improve management efficiency through more targeted access to groups of interest and facilitate planning by more specific survey design. This paper clustered 854,712 vehicles in a week using the K-means clustering algorithm based on license plate recognition (LPR) data obtained in Shenzhen, China. Firstly, several travel characteristics related to temporal and spati...

  8. Stability analysis in K-means clustering.

    Science.gov (United States)

    Steinley, Douglas

    2008-11-01

    This paper develops a new procedure, called stability analysis, for K-means clustering. Instead of ignoring local optima and only considering the best solution found, this procedure takes advantage of additional information from a K-means cluster analysis. The information from the locally optimal solutions is collected in an object by object co-occurrence matrix. The co-occurrence matrix is clustered and subsequently reordered by a steepest ascent quadratic assignment procedure to aid visual interpretation of the multidimensional cluster structure. Subsequently, measures are developed to determine the overall structure of a data set, the number of clusters and the multidimensional relationships between the clusters.
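
    The co-occurrence step described above can be illustrated with a short sketch: run K-means from several random starts and record, for every pair of objects, the fraction of runs in which they share a cluster. The plain Lloyd-style K-means, the function names and the toy data are assumptions for illustration; the steepest ascent quadratic assignment reordering from the abstract is not included.

```python
import random


def kmeans_labels(points, k, rng, iters=100):
    """One K-means run (Lloyd's algorithm) from randomly sampled start points."""
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(iters):
        new_labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable: a local optimum was reached
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = tuple(sum(col) / len(members)
                                     for col in zip(*members))
    return labels


def cooccurrence(points, k, runs=20, seed=0):
    """Object-by-object matrix: share of runs in which i and j co-cluster."""
    rng = random.Random(seed)
    n = len(points)
    counts = [[0] * n for _ in range(n)]
    for _ in range(runs):
        labels = kmeans_labels(points, k, rng)
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / runs for c in row] for row in counts]


# Hypothetical data: two well-separated groups of three points each.
data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
matrix = cooccurrence(data, k=2)
```

    Entries close to 1 or 0 mark pairs whose relationship is stable across local optima, while intermediate values flag objects whose cluster membership depends on the starting configuration.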

  9. Spatial-temporal distribution of dengue and climate characteristics for two clusters in Sri Lanka from 2012 to 2016.

    Science.gov (United States)

    Sun, Wei; Xue, Ling; Xie, Xiaoxue

    2017-10-10

    Dengue is a vector-borne disease causing high morbidity and mortality in tropical and subtropical countries. Urbanization, globalization, and lack of effective mosquito control have led to a dramatic increase in the frequency and magnitude of dengue epidemics in the past 40 years. The virus and the mosquito vectors keep expanding geographically in the tropical regions of the world. Using hot spot analysis and the spatial-temporal clustering method, we investigated the spatial-temporal distribution of dengue in Sri Lanka from 2012 to 2016 to identify spatial-temporal clusters and elucidate the association of climatic factors with dengue incidence. We detected two important spatial-temporal clusters in Sri Lanka. Dengue incidences were predicted by combining historical dengue incidence data with climate data, and hot and cold spots were forecasted using the predicted dengue incidences to identify areas at high risk. Targeting the hot spots during outbreaks instead of all the regions can save resources and time for public health authorities. Our study helps to better understand how climatic factors impact the spatial and temporal spread of dengue virus. Hot spot prediction helps public health authorities forecast future high-risk areas and direct control measures to minimize costs in health, time, and economy.

  10. The use of the temporal scan statistic to detect methicillin-resistant Staphylococcus aureus clusters in a community hospital.

    Science.gov (United States)

    Faires, Meredith C; Pearl, David L; Ciccotelli, William A; Berke, Olaf; Reid-Smith, Richard J; Weese, J Scott

    2014-07-08

    In healthcare facilities, conventional surveillance techniques using rule-based guidelines may result in under- or over-reporting of methicillin-resistant Staphylococcus aureus (MRSA) outbreaks, as these guidelines are generally unvalidated. The objectives of this study were to investigate the utility of the temporal scan statistic for detecting MRSA clusters, validate clusters using molecular techniques and hospital records, and determine significant differences in the rate of MRSA cases using regression models. Patients admitted to a community hospital between August 2006 and February 2011, and identified with MRSA>48 hours following hospital admission, were included in this study. Between March 2010 and February 2011, MRSA specimens were obtained for spa typing. MRSA clusters were investigated using a retrospective temporal scan statistic. Tests were conducted on a monthly scale and significant clusters were compared to MRSA outbreaks identified by hospital personnel. Associations between the rate of MRSA cases and the variables year, month, and season were investigated using a negative binomial regression model. During the study period, 735 MRSA cases were identified and 167 MRSA isolates were spa typed. Nine different spa types were identified with spa type 2/t002 (88.6%) the most prevalent. The temporal scan statistic identified significant MRSA clusters at the hospital (n=2), service (n=16), and ward (n=10) levels (P ≤ 0.05). Seven clusters were concordant with nine MRSA outbreaks identified by hospital staff. For the remaining clusters, seven events may have been equivalent to true outbreaks and six clusters demonstrated possible transmission events. The regression analysis indicated years 2009-2011, compared to 2006, and months March and April, compared to January, were associated with an increase in the rate of MRSA cases (P ≤ 0.05). The application of the temporal scan statistic identified several MRSA clusters that were not detected by hospital
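
    As a rough illustration of how a retrospective temporal scan statistic works, the sketch below slides windows of varying length over monthly case counts and scores each with the Poisson log-likelihood ratio. It assumes a uniform baseline (equal expected cases per month), which the study may not have used; the function names and counts are hypothetical, and the Monte Carlo replication normally used to obtain P values is omitted.

```python
import math


def poisson_llr(c, e, total):
    """Poisson log-likelihood ratio for c observed vs e expected cases.

    Only windows with an excess of cases (c > e) are scored.
    """
    if e <= 0 or c <= e:
        return 0.0
    llr = c * math.log(c / e)
    if total > c:
        llr += (total - c) * math.log((total - c) / (total - e))
    return llr


def temporal_scan(counts, max_len):
    """Return (llr, start, end) of the most likely cluster; end is exclusive."""
    total = sum(counts)
    n = len(counts)
    best = (0.0, None, None)
    for start in range(n):
        for length in range(1, max_len + 1):
            end = start + length
            if end > n:
                break
            c = sum(counts[start:end])
            e = total * length / n  # assumption: uniform baseline over time
            llr = poisson_llr(c, e, total)
            if llr > best[0]:
                best = (llr, start, end)
    return best


# Hypothetical monthly case counts with a two-month excess.
monthly_cases = [1, 1, 1, 9, 8, 1, 1, 1, 1, 1, 1, 1]
llr, start, end = temporal_scan(monthly_cases, max_len=3)
```

    In practice the maximum ratio is compared against the maxima obtained from many random redistributions of the cases over time, and a window is reported as a cluster only if it ranks among the most extreme.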

  11. Tanzania: A Hierarchical Cluster Analysis Approach | Ngaruko ...

    African Journals Online (AJOL)

    Using survey data from Kibondo district, west Tanzania, we use hierarchical cluster analysis to classify borrower farmers according to their borrowing behaviour into four distinctive clusters. The appreciation of the existence of heterogeneous farmer clusters is vital in forging credit delivery policies that are not only ...

  12. Cluster analysis of activity-time series in motor learning

    DEFF Research Database (Denmark)

    Balslev, Daniela; Nielsen, Finn Årup; Frutiger, Sally A.

    2002-01-01

    Neuroimaging studies of learning focus on brain areas where the activity changes as a function of time. To circumvent the difficult problem of model selection, we used a data-driven analytic tool, cluster analysis, which extracts representative temporal and spatial patterns from the voxel......-time series. The optimal number of clusters was chosen using a cross-validated likelihood method, which highlights the clustering pattern that generalizes best over the subjects. Data were acquired with PET at different time points during practice of a visuomotor task. The results from cluster analysis show...... practice-related activity in a fronto-parieto-cerebellar network, in agreement with previous studies of motor learning. These voxels were separated from a group of voxels showing an unspecific time-effect and another group of voxels, whose activation was an artifact from smoothing. Hum. Brain Mapping 15...

  13. Cluster analysis in phenotyping a Portuguese population.

    Science.gov (United States)

    Loureiro, C C; Sa-Couto, P; Todo-Bom, A; Bousquet, J

    2015-09-03

    Unbiased cluster analysis using clinical parameters has identified asthma phenotypes. Adding inflammatory biomarkers to this analysis provided a better insight into the disease mechanisms. This approach has not yet been applied to asthmatic Portuguese patients. To identify phenotypes of asthma using cluster analysis in a Portuguese asthmatic population treated in secondary medical care. Consecutive patients with asthma were recruited from the outpatient clinic. Patients were optimally treated according to GINA guidelines and enrolled in the study. Procedures were performed according to a standard evaluation of asthma. Phenotypes were identified by cluster analysis using Ward's clustering method. Of the 72 patients enrolled, 57 had full data and were included for cluster analysis. Distribution was set in 5 clusters described as follows: cluster (C) 1, early onset mild allergic asthma; C2, moderate allergic asthma, with long evolution, female prevalence and mixed inflammation; C3, allergic brittle asthma in young females with early disease onset and no evidence of inflammation; C4, severe asthma in obese females with late disease onset, highly symptomatic despite low Th2 inflammation; C5, severe asthma with chronic airflow obstruction, late disease onset and eosinophilic inflammation. In our study population, the identified clusters were mainly coincident with other larger-scale cluster analysis. Variables such as age at disease onset, obesity, lung function, FeNO (Th2 biomarker) and disease severity were important for cluster distinction. Copyright © 2015. Published by Elsevier España, S.L.U.

  14. Cluster-based exposure variation analysis

    Science.gov (United States)

    2013-01-01

    Background Static posture, repetitive movements and lack of physical variation are known risk factors for work-related musculoskeletal disorders, and thus need to be properly assessed in occupational studies. The aims of this study were (i) to investigate the effectiveness of a conventional exposure variation analysis (EVA) in discriminating exposure time lines and (ii) to compare it with a new cluster-based method for analysis of exposure variation. Methods For this purpose, we simulated a repeated cyclic exposure varying within each cycle between “low” and “high” exposure levels in a “near” or “far” range, and with “low” or “high” velocities (exposure change rates). The duration of each cycle was also manipulated by selecting a “small” or “large” standard deviation of the cycle time. These parameters reflected three dimensions of exposure variation, i.e. range, frequency and temporal similarity. Each simulation trace included two realizations of 100 concatenated cycles with either low (ρ = 0.1), medium (ρ = 0.5) or high (ρ = 0.9) correlation between the realizations. These traces were analyzed by conventional EVA, and a novel cluster-based EVA (C-EVA). Principal component analysis (PCA) was applied on the marginal distributions of 1) the EVA of each of the realizations (univariate approach), 2) a combination of the EVA of both realizations (multivariate approach) and 3) C-EVA. The least number of principal components describing more than 90% of variability in each case was selected and the projection of marginal distributions along the selected principal component was calculated. A linear classifier was then applied to these projections to discriminate between the simulated exposure patterns, and the accuracy of classified realizations was determined. Results C-EVA classified exposures more correctly than univariate and multivariate EVA approaches; classification accuracy was 49%, 47% and 52% for EVA (univariate

  15. Temporal clustering of tropical cyclones on the Great Barrier Reef and its ecological importance

    Science.gov (United States)

    Wolff, Nicholas H.; Wong, Aaron; Vitolo, Renato; Stolberg, Kristin; Anthony, Kenneth R. N.; Mumby, Peter J.

    2016-06-01

    Tropical cyclones have been a major cause of reef coral decline during recent decades, including on the Great Barrier Reef (GBR). While cyclones are a natural element of the disturbance regime of coral reefs, the role of temporal clustering has previously been overlooked. Here, we examine the consequences of different types of cyclone temporal distributions (clustered, stochastic or regular) on reef ecosystems. We subdivided the GBR into 14 adjoining regions, each spanning roughly 300 km, and quantified both the rate and clustering of cyclones using dispersion statistics. To interpret the consequences of such cyclone variability for coral reef health, we used a model of observed coral population dynamics. Results showed that clustering occurs on the margins of the cyclone belt, being strongest in the southern reefs and the far northern GBR, which also has the lowest cyclone rate. In the central GBR, where rates were greatest, cyclones had a relatively regular temporal pattern. Modelled dynamics of the dominant coral genus, Acropora, suggest that the long-term average cover might be more than 13 % greater (in absolute cover units) under a clustered cyclone regime compared to stochastic or regular regimes. Thus, not only does cyclone clustering vary significantly along the GBR but such clustering is predicted to have a marked, and management-relevant, impact on the status of coral populations. Additionally, we use our regional clustering and rate results to sample from a library of over 7000 synthetic cyclone tracks for the GBR. This allowed us to provide robust reef-scale maps of annual cyclone frequency and cyclone impacts on Acropora. We conclude that assessments of coral reef vulnerability need to account for both spatial and temporal cyclone distributions.
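
    The abstract does not say which dispersion statistic was used. One common choice for event counts is the variance-to-mean ratio minus one, which is zero for a Poisson (stochastic) process, positive for clustered events and negative for regular ones. The sketch below applies that assumed statistic to yearly cyclone counts; the function names, the tolerance and the counts are illustrative only.

```python
from statistics import mean, variance


def dispersion(counts):
    """Variance-to-mean ratio minus one for a series of event counts.

    Assumption: this is one standard dispersion statistic, not necessarily
    the one used in the study. Needs at least two counts and a nonzero mean.
    """
    mu = mean(counts)
    if mu == 0:
        raise ValueError("dispersion is undefined when no events occur")
    return variance(counts) / mu - 1.0


def classify_regime(counts, tol=0.1):
    """Label a count series as clustered, regular or stochastic.

    tol is a hypothetical tolerance; a real analysis would test significance.
    """
    d = dispersion(counts)
    if d > tol:
        return "clustered"
    if d < -tol:
        return "regular"
    return "stochastic"


# Hypothetical yearly cyclone counts for two regions with the same mean rate.
bursty_region = [0, 0, 4, 0, 0, 4, 0, 0]
steady_region = [1, 1, 1, 1, 1, 1, 1, 1]
labels = (classify_regime(bursty_region), classify_regime(steady_region))
```

    Both toy regions average one cyclone per year, which mirrors the abstract's point that rate and clustering are separate properties of a disturbance regime.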

  16. Hanseniaspora uvarum from Winemaking Environments Show Spatial and Temporal Genetic Clustering

    Science.gov (United States)

    Albertin, Warren; Setati, Mathabatha E.; Miot-Sertier, Cécile; Mostert, Talitha T.; Colonna-Ceccaldi, Benoit; Coulon, Joana; Girard, Patrick; Moine, Virginie; Pillet, Myriam; Salin, Franck; Bely, Marina; Divol, Benoit; Masneuf-Pomarede, Isabelle

    2016-01-01

    Hanseniaspora uvarum is one of the most abundant yeast species found on grapes and in grape must, at least before the onset of alcoholic fermentation (AF) which is usually performed by Saccharomyces species. The aim of this study was to characterize the genetic and phenotypic variability within the H. uvarum species. One hundred and fifteen strains isolated from winemaking environments in different geographical origins were analyzed using 11 microsatellite markers and a subset of 47 strains were analyzed by AFLP. H. uvarum isolates clustered mainly on the basis of their geographical localization as revealed by microsatellites. In addition, a strong clustering based on year of isolation was evidenced, indicating that the genetic diversity of H. uvarum isolates was related to both spatial and temporal variations. Conversely, clustering analysis based on AFLP data provided a different picture with groups showing no particular characteristics, but provided higher strain discrimination. This result indicated that AFLP approaches are inadequate to establish the genetic relationship between individuals, but allowed good strain discrimination. At the phenotypic level, several extracellular enzymatic activities of enological relevance (pectinase, chitinase, protease, β-glucosidase) were measured but showed low diversity. The impact of environmental factors of enological interest (temperature, anaerobia, and copper addition) on growth was also assessed and showed poor variation. Altogether, this work provided both new analytical tool (microsatellites) and new insights into the genetic and phenotypic diversity of H. uvarum, a yeast species that has previously been identified as a potential candidate for co-inoculation in grape must, but whose intraspecific variability had never been fully assessed. PMID:26834719

  17. Hanseniaspora uvarum from winemaking environments show spatial and temporal genetic clustering

    Directory of Open Access Journals (Sweden)

    Warren Albertin

    2016-01-01

    Hanseniaspora uvarum is one of the most abundant yeast species found on grapes and in grape must, at least before the onset of alcoholic fermentation, which is usually performed by Saccharomyces species. The aim of this study was to characterise the genetic and phenotypic variability within the H. uvarum species. One hundred and fifteen strains isolated from winemaking environments in different geographical origins were analysed using 11 microsatellite markers and a subset of 47 strains were analysed by AFLP. H. uvarum isolates clustered mainly on the basis of their geographical localisation as revealed by microsatellites. In addition, a strong clustering based on year of isolation was evidenced, indicating that the genetic diversity of Hanseniaspora uvarum isolates was related to both spatial and temporal variations. Conversely, clustering analysis based on AFLP data provided a different picture with groups showing no particular characteristics, but provided higher strain discrimination. This result indicated that AFLP approaches are inadequate to establish the genetic relationship between individuals, but allowed good strain discrimination. At the phenotypic level, several extracellular enzymatic activities of enological relevance (pectinase, chitinase, protease, β-glucosidase) were measured but showed low diversity. The impact of environmental factors of enological interest (temperature, anaerobia and copper addition) on growth was also assessed and showed poor variation. Altogether, this work provided both a new analytical tool (microsatellites) and new insights into the genetic and phenotypic diversity of H. uvarum, a yeast species that has previously been identified as a potential candidate for co-inoculation in grape must, but whose intraspecific variability had never been fully assessed.

  18. The Home Care Crew Scheduling Problem: Preference-based visit clustering and temporal dependencies

    DEFF Research Database (Denmark)

    Rasmussen, Matias Sevel; Justesen, Tor Fog; Dohn, Anders Høeg

    2012-01-01

    branch-and-price solution algorithm, as this method has previously given solid results for classical vehicle routing problems. Temporal dependencies are modelled as generalised precedence constraints and enforced through the branching. We introduce a novel visit clustering approach based on the soft...

  19. Multi-temporal clustering of continental floods and associated atmospheric circulations

    Science.gov (United States)

    Liu, Jianyu; Zhang, Yongqiang

    2017-12-01

    Investigating clustering of floods has important social, economic and ecological implications. This study examines the clustering of Australian floods at different temporal scales and its possible physical mechanisms. Flood series with different severities are obtained by peaks-over-threshold (POT) sampling in four flood thresholds. At intra-annual scale, Cox regression and monthly frequency methods are used to examine whether and when the flood clustering exists, respectively. At inter-annual scale, dispersion indices with four-time variation windows are applied to investigate the inter-annual flood clustering and its variation. Furthermore, the Kernel occurrence rate estimate and bootstrap resampling methods are used to identify flood-rich/flood-poor periods. Finally, seasonal variation of horizontal wind at 850 hPa and vertical wind velocity at 500 hPa are used to investigate the possible mechanisms causing the temporal flood clustering. Our results show that: (1) flood occurrences exhibit clustering at intra-annual scale, which are regulated by climate indices representing the impacts of the Pacific and Indian Oceans; (2) the flood-rich months occur from January to March over northern Australia, and from July to September over southwestern and southeastern Australia; (3) stronger inter-annual clustering takes place across southern Australia than northern Australia; and (4) Australian floods are characterised by regional flood-rich and flood-poor periods, with 1987-1992 identified as the flood-rich period across southern Australia, but the flood-poor period across northern Australia, and 2001-2006 being the flood-poor period across most regions of Australia. The intra-annual and inter-annual clustering and temporal variation of flood occurrences are in accordance with the variation of atmospheric circulation. These results provide relevant information for flood management under the influence of climate variability, and, therefore, are helpful for developing

  20. Nonnegative Matrix Factorization-Based Spatial-Temporal Clustering for Multiple Sensor Data Streams

    Directory of Open Access Journals (Sweden)

    Di-Hua Sun

    2014-01-01

    Cyber physical systems have grown exponentially and have attracted a lot of attention over the last few years. Retrieving and mining useful information from massive amounts of sensor data streams with spatial, temporal, and other multidimensional information has become an active research area. Moreover, recent research has shown that clusters of streams change with a comprehensive spatial-temporal viewpoint in real applications. In this paper, we propose a spatial-temporal clustering algorithm (STClu) based on nonnegative matrix trifactorization that utilizes time-series observational data streams and geospatial relationships for clustering multiple sensor data streams. Instead of directly clustering multiple data streams periodically, STClu incorporates the spatial relationship between two sensors in proximity and takes historical information into consideration. Furthermore, we develop an iterative updating optimization algorithm for STClu. The effectiveness and efficiency of the STClu algorithm are both demonstrated in experiments on real and synthetic data sets. The results show that the proposed STClu algorithm outperforms existing methods for clustering sensor data streams.

  1. Spatial clustering in the spatio-temporal dynamics of endemic cholera

    Directory of Open Access Journals (Sweden)

    Emch Michael

    2010-03-01

    Background: The spatio-temporal patterns of infectious diseases that are environmentally driven reflect the combined effects of transmission dynamics and environmental heterogeneity. They contain important information on different routes of transmission, including the role of environmental reservoirs. Consideration of the spatial component in infectious disease dynamics has led to insights on the propagation of fronts at the level of counties in rabies in the US, and the metapopulation behavior at the level of cities in childhood diseases such as measles in the UK, both at relatively coarse scales. As epidemiological data on individual infections become available, spatio-temporal patterns can be examined at higher resolutions. Methods: The extensive spatio-temporal data set for cholera in Matlab, Bangladesh, maps the individual location of cases from 1983 to 2003. This unique record allows us to examine the spatial structure of cholera outbreaks, to address the role of primary transmission, occurring from an aquatic reservoir to the human host, and that of secondary transmission, involving a feedback between current and past levels of infection. We use Ripley's K and L indices and bootstrapping methods to evaluate the occurrence of spatial clustering in the cases during outbreaks using different temporal windows. The spatial location of cases was also compared with the spatial location of water sources. Results: Spatial clustering of cholera cases was detected at different temporal and spatial scales. Cases relative to water sources also exhibit spatial clustering. Conclusions: The clustering of cases supports an important role of secondary transmission in the dynamics of cholera epidemics in Matlab, Bangladesh. The spatial clustering of cases relative to water sources, and its timing, suggests an effective role of water reservoirs during the onset of cholera outbreaks. Once primary transmission has initiated an outbreak, secondary
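
    Ripley's K and L indices mentioned in the methods can be sketched in a few lines. The estimator below is the basic form K(r) = A / (n (n - 1)) * (number of ordered pairs within distance r), with L(r) = sqrt(K(r) / pi); it applies no edge correction, and the function names, study area and case coordinates are hypothetical. The bootstrap envelopes used in the study are not reproduced.

```python
import math


def ripley_k(points, r, area):
    """Naive Ripley's K estimate at distance r (no edge correction)."""
    n = len(points)
    close_pairs = sum(
        1
        for i in range(n)
        for j in range(n)
        if i != j and math.dist(points[i], points[j]) <= r
    )
    return area * close_pairs / (n * (n - 1))


def ripley_l(points, r, area):
    """L(r) = sqrt(K(r) / pi); under complete spatial randomness L(r) is about r."""
    return math.sqrt(ripley_k(points, r, area) / math.pi)


# Hypothetical case locations: two tight pairs in a 10 x 10 study area.
cases = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
l_value = ripley_l(cases, r=1.5, area=100.0)
clustered_at_r = l_value > 1.5
```

    Values of L(r) above r suggest more neighbours within distance r than a random pattern would produce; repeating the calculation for cases falling in different temporal windows gives the space-time view described in the abstract.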

  2. Cluster analysis of rural, urban, and curbside atmospheric particle size data.

    Science.gov (United States)

    Beddows, David C S; Dall'Osto, Manuel; Harrison, Roy M

    2009-07-01

    Particle size is a key determinant of the hazard posed by airborne particles. Continuous multivariate particle size data have been collected using aerosol particle size spectrometers sited at four locations within the UK: Harwell (Oxfordshire); Regents Park (London); British Telecom Tower (London); and Marylebone Road (London). These data have been analyzed using k-means cluster analysis, deduced to be the preferred cluster analysis technique, selected from four partitional cluster packages, namely the following: Fuzzy; k-means; k-median; and Model-Based clustering. Using cluster validation indices, k-means clustering was shown to produce clusters with the smallest size, furthest separation, and importantly the highest degree of similarity between the elements within each partition. Using k-means clustering, the complexity of the data set is reduced, allowing characterization of the data according to the temporal and spatial trends of the clusters. At Harwell, the rural background measurement site, the cluster analysis showed that the spectra may be differentiated by their modal diameters and average temporal trends showing either high counts during the day-time or night-time hours. Likewise for the urban sites, the cluster analysis differentiated the spectra into a small number of size distributions according to their modal diameter, the location of the measurement site, and time of day. The responsible aerosol emission, formation, and dynamic processes can be inferred according to the cluster characteristics and correlation to concurrently measured meteorological, gas phase, and particle phase measurements.

  3. [Cluster analysis and its application].

    Science.gov (United States)

    Půlpán, Zdenĕk

    2002-01-01

    The study exploits knowledge-oriented and context-based modification of well-known algorithms of (fuzzy) clustering. Fuzzy sets are inherently suited to coping with linguistic domain knowledge as well. We seek to obtain, from rich and diverse data and knowledge, new information about the environment that is being explored.

  4. Spatial and Temporal Assessment on Drug Addiction Using Multivariate Analysis and GIS

    International Nuclear Information System (INIS)

    Mohd Ekhwan Toriman; Siti Nor Fazillah Abdullah; Izwan Arif Azizan; Mohd Khairul Amri Kamarudin; Roslan Umar; Nasir Mohamad

    2015-01-01

    There is a need to manage and display drug addiction phenomena and trends at both spatial and temporal scales. A spatial and temporal assessment of drug addiction in Terengganu was undertaken to identify which districts fall in the same cluster, to identify the hot spot areas of this problem, and to analyse the trend of drug addiction. The data used were a topographic map of Terengganu and the number of drug-addicted persons in Terengganu by district over 10 years (2004-2013). The number of drug-addicted persons by district was mapped using a Geographic Information System and analysed using multivariate analysis; cluster analysis was applied to the database in order to validate the correlation between data in the same cluster. The cluster analysis of the number of drug-addicted persons by district generated three clusters: Besut and Kuala Terengganu in cluster 1, named moderate drug addicted person (MDA); Dungun, Marang, Setiu and Hulu Terengganu in cluster 2, named lower drug addicted person (LDA); and Kemaman in cluster 3, named high drug addicted person (HDA). This analysis indicates that cluster 3, Kemaman, is a hot spot area. These results are useful for stakeholders monitoring and managing this problem, especially in the hot spot area, which needs particular emphasis. (author)

  5. Differential Spatio-temporal Multiband Satellite Image Clustering using K-means Optimization With Reinforcement Programming

    Directory of Open Access Journals (Sweden)

    Irene Erlyn Wina Rachmawan

    2015-06-01

    Deforestation is one of the crucial issues in Indonesia, which now has the world's highest deforestation rate. On the other hand, multispectral imagery is a rich source of data for studying spatial and temporal environmental change, such as the extent of deforested areas. This research presents differential image processing methods for detecting changes due to deforestation. Our differential image processing algorithms extract and indicate the affected area automatically. The proposed approach extracts information from multiband satellite images and calculates the deforested area by year using a temporal dataset. However, multiband satellite images are large and therefore difficult to handle for segmentation. K-means clustering is commonly considered a powerful clustering algorithm because of its ability to cluster big data. However, K-means is sensitive to its initially generated centroids, which can lead to poor performance. In this paper we propose a new approach that optimizes K-means clustering using Reinforcement Programming in order to cluster multispectral images. We build a new mechanism for generating initial centroids by implementing exploration and exploitation knowledge from Reinforcement Programming. This optimization leads to better K-means clustering results. We select multispectral images from Landsat 7 over the past ten years in Medawai, Borneo, Indonesia, and apply two segmentation areas consisting of deforested land and forest field. We ran a series of experiments and compared the results of K-means using Reinforcement Programming to optimize the initial centroids against normal K-means without the optimization process. Keywords: Deforestation, Multispectral images, Landsat, automatic clustering, K-means.
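
    The sensitivity of K-means to its initial centroids, which motivates the abstract above, can be illustrated with a minimal sketch. It uses plain Lloyd's iterations on made-up one-dimensional data; it is not the authors' Reinforcement Programming initializer, and the function name `kmeans_1d` and the data values are assumptions chosen for illustration.

```python
def kmeans_1d(points, centroids, iterations=20):
    """Plain Lloyd's algorithm on 1-D data; returns (centroids, SSE)."""
    centroids = list(centroids)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        groups = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: (p - centroids[i]) ** 2)
            groups[nearest].append(p)
        # Update step: move each centroid to the mean of its group
        # (an empty group keeps its previous centroid).
        centroids = [sum(g) / len(g) if g else c for g, c in zip(groups, centroids)]
    # Sum of squared errors: distance of each point to its closest centroid.
    sse = sum(min((p - c) ** 2 for c in centroids) for p in points)
    return centroids, sse


# Toy data: three well-separated groups (illustrative values only).
data = [1.0, 2.0, 3.0, 11.0, 12.0, 13.0, 21.0, 22.0, 23.0]

# One start per group versus all three starts inside the first group.
good_centroids, good_sse = kmeans_1d(data, [2.0, 12.0, 22.0])
bad_centroids, bad_sse = kmeans_1d(data, [1.0, 2.0, 3.0])
```

    With the second initialization the iterations settle on a poorer partition with a much larger sum of squared errors, which is the kind of outcome a better initial-centroid mechanism is meant to avoid.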

  6. Asymmetry of temporal artery diameters during spontaneous attacks of cluster headache

    DEFF Research Database (Denmark)

    Nielsen, Thue H; Tfelt-Hansen, Peer; Iversen, Helle K

    2009-01-01

    BACKGROUND: Cluster headache is characterized by strictly unilateral head pain associated with symptoms of cranial autonomic features. Transcranial Doppler studies showed in most studies a bilateral decreased blood flow velocity in the middle cerebral artery. OBJECTIVE: To investigate whether the...... = .67). CONCLUSIONS: What was observed is most likely a general pain-induced arterial vasoconstriction (confer the decrease in diameter on the pain-free side) with an unchanged superficial temporal artery on the pain side because of some vasodilator influence....

  7. Temporal retinal nerve fibre layer thinning in cluster headache patients detected by optical coherence tomography.

    Science.gov (United States)

    Ewering, Carina; Haşal, Nazmiye; Alten, Florian; Clemens, Christoph R; Eter, Nicole; Oberwahrenbrock, Timm; Kadas, Ella M; Zimmermann, Hanna; Brandt, Alexander U; Osada, Nani; Paul, Friedemann; Marziniak, Martin

    2015-10-01

    The exact pathophysiology of cluster headache (CH) is still not fully clarified. Various studies confirmed changes in ocular blood flow during CH attacks. Furthermore, vasoconstricting medication influences blood supply to the eye. We investigated the retina of CH patients for structural retinal alterations with optical coherence tomography (OCT), and how these changes correlate to headache characteristics, oxygen use and impaired visual function. Spectral domain OCT of 107 CH patients - 67 episodic, 35 chronic, five former chronic sufferers - were compared to OCT from 65 healthy individuals. Visual function tests with Sloan charts and a substantial ophthalmologic examination were engaged. Reduction of temporal and temporal-inferior retinal nerve fibre layer (RNFL) thickness was found in both eyes for CH patients with a predominant thinning on the headache side in the temporal-inferior area. Chronic CH patients revealed thinning of the macula compared to episodic suffers and healthy individuals. Bilateral thinning of temporal RNFL was also found in users of 100% oxygen compared to non-users and healthy controls. Visual function did not differ between patients and controls. Our OCT findings show a systemic effect causing temporal retinal thinning in both eyes of CH patients possibly due to attack-inherent or medication-induced frequent bilateral vessel diameter changes. The temporal retina with its thinly myelinated parvo-cellular axons and its more susceptible vessels for the vasoconstricting influence of oxygen inhalation seems to be predisposed for tissue damage-causing processes related to CH. © International Headache Society 2015.

  8. Indexing, Query Processing, and Clustering of Spatio-Temporal Text Objects

    DEFF Research Database (Denmark)

    Skovsgaard, Anders

    With the increasing mobile use of the web from geo-positioned devices, the Internet is increasingly acquiring a spatial aspect, with still more types of content being geo-tagged. As a result of this development, a wide range of location-aware queries and applications have emerged. The large amoun...... partial results. The results shows excellent indexing and query execution performance on a standard DBMS......) spatio-temporal aggregates, and (iii) spatio-textual region querying without special purpose index structures. First, two novel techniques to perform grouping of spatio-textual objects are presented. In the first technique, top-k groups of objects are returned while taking into account aspects......, the grouping of spatio-textual objects is done without considering query locations, and a clustering approach is proposed that takes into account both the spatial and textual attributes of the objects. The technique expands clusters based on a proposed quality function that enables clusters of arbitrary shape...

  9. Down's syndrome clusters in Germany in close temporal relationship to the Chernobyl accident

    International Nuclear Information System (INIS)

    Grosche, B.; Schoetzau, A.; Burkart, W.

    1997-01-01

    In two independent studies using different approaches and covering West Berlin and Bavaria, respectively, highly significant temporal clusters of Down's syndrome were found. Both sharp increases occurred in areas receiving relatively low Chernobyl fallout and concomitant radiation exposures. Only for the Berlin cluster was fallout present at the time of the affected meioses, whereas the Nuremberg cluster preceded the radioactive contamination by one month. Hypotheses on possible causal relationships are compared. Radiation from the Chernobyl accident is also an unlikely factor because the associated cumulative dose was so low in comparison with natural background. Given the lack of understanding of what causes Down's syndrome, other than factors associated with increased maternal age, additional research into environmental and infectious risk factors is warranted. (author)

  10. ASteCA: Automated Stellar Cluster Analysis

    Science.gov (United States)

    Perren, G. I.; Vázquez, R. A.; Piatti, A. E.

    2015-04-01

    We present the Automated Stellar Cluster Analysis package (ASteCA), a suite of tools designed to fully automate the standard tests applied on stellar clusters to determine their basic parameters. The set of functions included in the code make use of positional and photometric data to obtain precise and objective values for a given cluster's center coordinates, radius, luminosity function and integrated color magnitude, as well as characterizing through a statistical estimator its probability of being a true physical cluster rather than a random overdensity of field stars. ASteCA incorporates a Bayesian field star decontamination algorithm capable of assigning membership probabilities using photometric data alone. An isochrone fitting process based on the generation of synthetic clusters from theoretical isochrones and selection of the best fit through a genetic algorithm is also present, which allows ASteCA to provide accurate estimates for a cluster's metallicity, age, extinction and distance values along with its uncertainties. To validate the code we applied it on a large set of over 400 synthetic MASSCLEAN clusters with varying degrees of field star contamination as well as a smaller set of 20 observed Milky Way open clusters (Berkeley 7, Bochum 11, Czernik 26, Czernik 30, Haffner 11, Haffner 19, NGC 133, NGC 2236, NGC 2264, NGC 2324, NGC 2421, NGC 2627, NGC 6231, NGC 6383, NGC 6705, Ruprecht 1, Tombaugh 1, Trumpler 1, Trumpler 5 and Trumpler 14) studied in the literature. The results show that ASteCA is able to recover cluster parameters with an acceptable precision even for those clusters affected by substantial field star contamination. ASteCA is written in Python and is made available as an open source code which can be downloaded ready to be used from its official site.

  11. Cluster analysis of pharmacists' work attitudes.

    Science.gov (United States)

    Nakagomi, Keiichi; Hayashi, Yukikazu; Komiyama, Takako

    2017-12-01

    Few studies in Japan use clustering to examine the work attitudes of pharmacists. This study conducts an exploratory analysis to classify those attitudes based on previous studies to help staff pharmacists and their management to understand their mutually beneficial requirements. Survey data collected in previous studies from 1 228 community pharmacists and 419 hospital pharmacists were analyzed using Quantification Theory 3 and clustering. Among community pharmacists, two clusters, namely 30- to 34-year-old married males and married males aged over 35 years, reported the highest job satisfaction, intending to remain in their jobs for 5 years or more or until retirement. Conversely, one cluster of 35- to 39-year-old single females reported the lowest job satisfaction and intended to remain for less than 5 years or were undecided. Among hospital pharmacists, one cluster of 22- to 25-year-old single males reported the highest job satisfaction and intended to remain for more than 5 years. Conversely, one cluster of 30- to 34-year-old married males reported the lowest job satisfaction and an undetermined intended period of employment. This study used clustering to explore how pharmacists of different ages, marital statuses, and experience felt regarding their work. Job satisfaction and human relationships are significant in considering future work plans of practicing pharmacists. Pharmacy staff, supervisors, and managers of community or hospital pharmacies must recognize features of pharmacists' work attitudes in order to offer high-quality service to patients.

  12. Spatio-temporal clustering of hand, foot, and mouth disease at the county level in Guangxi, China.

    Science.gov (United States)

    Xie, Yi-hong; Chongsuvivatwong, Virasakdi; Tang, Zhenzhu; McNeil, Edward B; Tan, Yi

    2014-01-01

    Amid numerous outbreaks of hand, foot and mouth disease (HFMD) in Asia over the past decade, studies on spatio-temporal clustering are limited. Without this information the distribution of severe cases is assumed to be sporadic. We analyzed surveillance data with onset dates between 1 May 2008 and 31 October 2013 with the aim of documenting the spatio-temporal clustering of HFMD cases and severe cases at the county level. Purely temporal and purely spatial descriptive analyses were done. These were followed by a space-time scan statistic for the whole study period and by year to detect the high risk clusters based on a discrete Poisson model. The annual incidence rate of HFMD in Guangxi increased whereas the severe cases peaked in 2010 and 2012. EV71 and CoxA16 were alternating viruses. Both HFMD cases and severe cases had a seasonal peak in April to July. The spatio-temporal clusters of HFMD cases were mainly detected in the northeastern, central and southwestern regions, among which three clusters were observed in Nanning, Liuzhou and Guilin cities and their neighbouring areas lasting from 1.2 to 2.5 years. The clusters of severe cases were less consistent in location and included around 40-70% of all severe cases in each year. Both HFMD cases and severe cases occur in spatio-temporal clusters. The continuous epidemic in Nanning, Liuzhou and Guilin cities and their neighbouring areas and the clusters of severe cases indicate the need for further intensive surveillance.
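
    For readers unfamiliar with the scan statistic mentioned above, the following is a minimal sketch of the log-likelihood ratio that a discrete Poisson scan evaluates for one candidate space-time window when looking for high-rate clusters. The function name and the example counts are assumptions for illustration, not values from the study; in practice the window with the largest ratio is tested for significance by Monte Carlo replication, which is not shown here.

```python
import math


def poisson_scan_llr(cases_in, expected_in, total_cases):
    """Log-likelihood ratio for one candidate window (high-rate scan).

    expected_in is the number of cases expected inside the window under
    the null hypothesis, scaled so that expectations sum to total_cases.
    """
    cases_out = total_cases - cases_in
    expected_out = total_cases - expected_in
    # Only windows with more cases than expected count as high-risk clusters.
    if cases_in <= expected_in:
        return 0.0
    llr = cases_in * math.log(cases_in / expected_in)
    if cases_out > 0:
        llr += cases_out * math.log(cases_out / expected_out)
    return llr


# Hypothetical window: holds 10% of the population-time at risk
# (so 100 of 1000 cases are expected) but contains 150 observed cases.
excess_llr = poisson_scan_llr(cases_in=150, expected_in=100.0, total_cases=1000)
no_excess_llr = poisson_scan_llr(cases_in=90, expected_in=100.0, total_cases=1000)
```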

  13. Fuzzy clustering analysis of microarray data.

    Science.gov (United States)

    Han, Lixin; Zeng, Xiaoqin; Yan, Hong

    2008-10-01

    Fuzzy clustering is a useful tool for identifying relevant subsets of microarray data. This paper proposes a fuzzy clustering method for microarray data analysis. An advantage of the method is that it uses a combination of fuzzy c-means and principal component analysis to identify the groups of genes that show similar expression patterns. It allows a gene to belong to more than one gene expression pattern with different membership grades. The method is suitable for the analysis of large amounts of noisy microarray data.
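
    The idea of a gene belonging to several patterns with different membership grades can be sketched with the standard fuzzy c-means membership rule. This covers only the membership step, not the paper's combination with principal component analysis; the function name, the fuzzifier value m=2 and the toy coordinates are assumptions.

```python
def fcm_memberships(point, centers, m=2.0):
    """Membership grades of one point in each cluster (fuzzy c-means rule)."""
    # Euclidean distance from the point to every cluster center.
    dists = [sum((x - c) ** 2 for x, c in zip(point, center)) ** 0.5
             for center in centers]
    # A point sitting exactly on a center belongs fully to that cluster.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    # u_i = 1 / sum_k (d_i / d_k)^(2 / (m - 1)); the grades sum to 1.
    return [1.0 / sum((d_i / d_k) ** power for d_k in dists) for d_i in dists]


# Toy example: a point twice as far from the second center as from the first.
grades = fcm_memberships((1.0, 0.0), [(0.0, 0.0), (3.0, 0.0)])
```

    Here the point receives a grade of 0.8 in the nearer cluster and 0.2 in the farther one, rather than a hard assignment to a single cluster.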

  14. Are clusters of dietary patterns and cluster membership stable over time? Results of a longitudinal cluster analysis study.

    Science.gov (United States)

    Walthouwer, Michel Jean Louis; Oenema, Anke; Soetens, Katja; Lechner, Lilian; de Vries, Hein

    2014-11-01

    Developing nutrition education interventions based on clusters of dietary patterns can only be done adequately when it is clear if distinctive clusters of dietary patterns can be derived and reproduced over time, if cluster membership is stable, and if it is predictable which type of people belong to a certain cluster. Hence, this study aimed to: (1) identify clusters of dietary patterns among Dutch adults, (2) test the reproducibility of these clusters and stability of cluster membership over time, and (3) identify sociodemographic predictors of cluster membership and cluster transition. This study had a longitudinal design with online measurements at baseline (N=483) and 6 months follow-up (N=379). Dietary intake was assessed with a validated food frequency questionnaire. A hierarchical cluster analysis was performed, followed by a K-means cluster analysis. Multinomial logistic regression analyses were conducted to identify the sociodemographic predictors of cluster membership and cluster transition. At baseline and follow-up, a comparable three-cluster solution was derived, distinguishing a healthy, moderately healthy, and unhealthy dietary pattern. Male and lower educated participants were significantly more likely to have a less healthy dietary pattern. Further, 251 (66.2%) participants remained in the same cluster, 45 (11.9%) participants changed to an unhealthier cluster, and 83 (21.9%) participants shifted to a healthier cluster. Men and people living alone were significantly more likely to shift toward a less healthy dietary pattern. Distinctive clusters of dietary patterns can be derived. Yet, cluster membership is unstable and only few sociodemographic factors were associated with cluster membership and cluster transition. These findings imply that clusters based on dietary intake may not be suitable as a basis for nutrition education interventions. Copyright © 2014 Elsevier Ltd. All rights reserved.
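
    The cluster-transition percentages reported above can be checked against the follow-up sample size with a few lines of arithmetic. The counts are taken from the abstract; the variable names are illustrative.

```python
followed_up = 379  # participants measured at 6 months follow-up
transitions = {
    "same cluster": 251,
    "unhealthier cluster": 45,
    "healthier cluster": 83,
}

# The three groups should account for every participant at follow-up.
assert sum(transitions.values()) == followed_up

# Percentage of the follow-up sample in each group, to one decimal place.
shares = {name: round(100 * n / followed_up, 1) for name, n in transitions.items()}
```

    The computed shares (66.2%, 11.9% and 21.9%) agree with the figures in the abstract.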

  15. Cluster analysis of activity-time series in motor learning

    DEFF Research Database (Denmark)

    Balslev, Daniela; Nielsen, Finn Å; Futiger, Sally A

    2002-01-01

    Neuroimaging studies of learning focus on brain areas where the activity changes as a function of time. To circumvent the difficult problem of model selection, we used a data-driven analytic tool, cluster analysis, which extracts representative temporal and spatial patterns from the voxel...... practice-related activity in a fronto-parieto-cerebellar network, in agreement with previous studies of motor learning. These voxels were separated from a group of voxels showing an unspecific time-effect and another group of voxels, whose activation was an artifact from smoothing...

  16. Runtime Analysis of Linear Temporal Logic Specifications

    Science.gov (United States)

    Giannakopoulou, Dimitra; Havelund, Klaus

    2001-01-01

    This report presents an approach to checking a running program against its Linear Temporal Logic (LTL) specifications. LTL is a widely used logic for expressing properties of programs viewed as sets of executions. Our approach consists of translating LTL formulae to finite-state automata, which are used as observers of the program behavior. The translation algorithm we propose modifies standard LTL to Büchi automata conversion techniques to generate automata that check finite program traces. The algorithm has been implemented in a tool, which has been integrated with the generic JPaX framework for runtime analysis of Java programs.
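
    To make the notion of checking an LTL property on a finite program trace concrete, here is a minimal sketch that evaluates a formula directly by recursion over the trace. This is a deliberately simpler technique than the automaton translation the report describes, and it is not the JPaX tool; the tuple encoding of formulas, the function name and the event names are all assumptions. The trace is assumed to be a non-empty list of sets of atomic propositions.

```python
def holds(formula, trace, i=0):
    """Evaluate an LTL formula on the finite trace suffix starting at position i."""
    op = formula[0]
    if op == "ap":            # atomic proposition: true if present in the current state
        return formula[1] in trace[i]
    if op == "not":
        return not holds(formula[1], trace, i)
    if op == "and":
        return holds(formula[1], trace, i) and holds(formula[2], trace, i)
    if op == "next":          # strong next: false at the last position of the trace
        return i + 1 < len(trace) and holds(formula[1], trace, i + 1)
    if op == "always":
        return all(holds(formula[1], trace, j) for j in range(i, len(trace)))
    if op == "eventually":
        return any(holds(formula[1], trace, j) for j in range(i, len(trace)))
    if op == "until":         # g must occur, and f must hold at every step before it
        for j in range(i, len(trace)):
            if holds(formula[2], trace, j):
                return True
            if not holds(formula[1], trace, j):
                return False
        return False
    raise ValueError(f"unknown operator: {op}")


# "Every request is eventually followed by a grant", written with not/and.
unanswered = ("and", ("ap", "request"), ("not", ("eventually", ("ap", "grant"))))
responds = ("always", ("not", unanswered))

trace_ok = [{"request"}, set(), {"grant"}]
trace_bad = [{"request"}, set()]
ok_result = holds(responds, trace_ok)
bad_result = holds(responds, trace_bad)
```

    An automaton-based observer, as in the report, reaches the same verdicts while consuming events one at a time instead of storing the whole trace.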

  17. Spatial and temporal changes of vegetation information in the karst peak cluster area, Guilin

    Science.gov (United States)

    Liu, Chao; Wu, Hong

    2014-05-01

    Karst peak clusters are the main type of karst landscape interspersed along the Li River in Guilin. The state of their ecological environment affects the environmental change of Guilin city both directly and indirectly. To study the temporal and spatial characteristics of these impacts, two vegetation indicators, NDVI and TC2, were extracted from Landsat TM data for eight karst peak cluster areas. The results showed that the NDVI and TC2 values of some karst peak clusters changed from high to low and then higher across the years 1986, 1991 and 2006, and that the NDVI and TC2 values of different karst peak clusters differed within the same period. Ranked from high to low, the values are ordered No.5>No.4>No.3>No.8>No.7>No.6>No.2>No. As a result, the ecological environment of Guilin city has undergone uneven change in time and space during the 20 years from 1986 to 2006. These findings can provide a scientific and technical basis for the comprehensive management of the ecological environment of Guilin city in the 21st century.
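
    NDVI, one of the two indicators used above, is the normalized difference between near-infrared and red reflectance. A minimal per-pixel sketch follows; the function name, the reflectance values and the zero-denominator convention are assumptions for illustration, and the TC2 (tasseled cap) component is not covered.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel.

    nir and red are reflectances in the near-infrared and red bands
    (bands 4 and 3 on Landsat TM).
    """
    total = nir + red
    if total == 0:
        return 0.0  # convention chosen here for empty pixels; an assumption
    return (nir - red) / total


# Hypothetical reflectances: dense vegetation versus bare surface.
vegetated = ndvi(nir=0.5, red=0.1)
bare = ndvi(nir=0.2, red=0.2)
```

    Values near 1 indicate dense green vegetation and values near 0 indicate little or none, so a drop in NDVI between acquisition dates signals vegetation loss.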

  18. Spatial, temporal and spatio-temporal clusters of measles incidence at the county level in Guangxi, China during 2004-2014: flexibly shaped scan statistics.

    Science.gov (United States)

    Tang, Xianyan; Geater, Alan; McNeil, Edward; Deng, Qiuyun; Dong, Aihu; Zhong, Ge

    2017-04-04

    Outbreaks of measles re-emerged in Guangxi province during 2013-2014, where measles again became a major public health concern. A better understanding of the patterns of measles cases would help in identifying high-risk areas and periods for optimizing preventive strategies, yet these patterns remain largely unknown. Thus, this study aimed to determine the patterns of measles clusters in space, time and space-time at the county level over the period 2004-2014 in Guangxi. Annual data on measles cases and population sizes for each county were obtained from Guangxi CDC and Guangxi Bureau of Statistics, respectively. Epidemic curves and Kulldorff's temporal scan statistics were used to identify seasonal peaks and high-risk periods. Tango's flexible scan statistics were implemented to determine irregular spatial clusters. Spatio-temporal clusters in elliptical cylinder shapes were detected by Kulldorff's scan statistics. Population attributable risk percent (PAR%) of children aged ≤24 months was used to identify regions with a heavy burden of measles. Seasonal peaks occurred between April and June, and a temporal measles cluster was detected in 2014. Spatial clusters were identified in West, Southwest and North Central Guangxi. Three phases of spatio-temporal clusters with high relative risk were detected: Central Guangxi during 2004-2005, Midwest Guangxi in 2007, and West and Southwest Guangxi during 2013-2014. Regions with high PAR% were mainly clustered in West, Southwest, North and Central Guangxi. Measles incidence in Guangxi trended upward between 2010 and 2014, after a downtrend during 2004-2009. The hotspots shifted from Central to West and Southwest Guangxi, regions overburdened with measles. Thus, intensifying surveillance of timeliness and completeness of routine vaccination and implementing supplementary immunization activities for measles should be prioritized in these regions.

  19. Cluster Analysis of Properties of Temperament

    Directory of Open Access Journals (Sweden)

    A I Krupnov

    2014-12-01

    The paper presents the cluster analysis of various properties of temperament, based on the systematic structure of its main components. On the basis of the received data the qualitative psychological characteristic of the four types of temperament is given.

  20. Cluster analysis for determining distribution center location

    Science.gov (United States)

    Lestari Widaningrum, Dyah; Andika, Aditya; Murphiyanto, Richard Dimas Julian

    2017-12-01

    Determination of distribution facilities is highly important to survive in the high level of competition in today’s business world. Companies can operate multiple distribution centers to mitigate supply chain risk. Thus, new problems arise, namely how many facilities should be provided and where. This study examines a fast-food restaurant brand located in Greater Jakarta. This brand is among the top 5 fast food restaurant chains based on retail sales. There were three stages in this study: compiling spatial data, cluster analysis, and network analysis. Cluster analysis results are used to consider the location of the additional distribution center. Network analysis results show a more efficient process, with a shorter distance in the distribution process.

  1. Spatial-Temporal Clusters and Risk Factors of Hand, Foot, and Mouth Disease at the District Level in Guangdong Province, China

    Science.gov (United States)

    Yu, Shicheng; Gu, Jing; Huang, Cunrui; Xiao, Gexin; Hao, Yuantao

    2013-01-01

    Objective Hand, foot, and mouth disease (HFMD) has posed a great threat to the health of children and become a public health priority in China. This study aims to investigate the epidemiological characteristics, spatial-temporal patterns, and risk factors of HFMD in Guangdong Province, China, and to provide scientific information for public health responses and interventions. Methods HFMD surveillance data from May 2008 to December 2011 were provided by the Chinese Center for Disease Control and Prevention. We firstly conducted a descriptive analysis to evaluate the epidemic characteristics of HFMD. Then, Kulldorff scan statistic based on a discrete Poisson model was used to detect spatial-temporal clusters. Finally, a spatial paneled model was applied to identify the risk factors. Results A total of 641,318 HFMD cases were reported in Guangdong Province during the study period (total population incidence: 17.51 per 10,000). Male incidence was higher than female incidence for all age groups, and approximately 90% of the cases were children years old. Spatial-temporal cluster analysis detected four most likely clusters and several secondary clusters (P<0.001) with the maximum cluster size 50% and 20% respectively during 2008–2011. Monthly average temperature, relative humidity, the proportion of population years, male-to-female ratio, and total sunshine were demonstrated to be the risk factors for HFMD. Conclusion Children years old, especially boys, were more susceptible to HFMD and we should take care of their vulnerability. Provincial capital city Guangzhou and the Pearl River Delta regions had always been the spatial-temporal clusters and future public health planning and resource allocation should be focused on these areas. Furthermore, our findings showed a strong association between HFMD and meteorological factors, which may assist in predicting HFMD incidence. PMID:23437278

  2. Fuzzy clustering analysis of osteosarcoma related genes.

    Science.gov (United States)

    Chen, Kai; Wu, Dajiang; Bai, Yushu; Zhu, Xiaodong; Chen, Ziqiang; Wang, Chuanfeng; Zhao, Yingchuan; Li, Ming

    2014-07-01

    Osteosarcoma is the most common malignant bone-tumor with a peak manifestation during the second and third decade of life. In order to explore the influence of genetic factors on the mechanism of osteosarcoma by analyzing the inter relationship between osteosarcoma and its related genes, and then provide potential genetic references for the prevention, diagnosis and treatment of osteosarcoma, we collected osteosarcoma related gene sequences in Genebank of National Center for Biotechnology Information (NCBI) and local alignment analysis for a pair of sequences was carried out to identify the measurement association among related sequences. Then fuzzy clustering method was used for clustering analysis so as to contact the unknown genes through the consistent osteosarcoma related genes in one class. From the result of fuzzy clustering analysis, we could classify the osteosarcoma related genes into two groups and deduced that the genes clustered into one group had similar function. Based on this knowledge, we found more genes related to the pathogenesis of osteosarcoma and these genes could exert similar function as Runx2, a risk factor confirmed in osteosarcoma, this study may help better understand the genetic mechanism and provide new molecular markers and therapies for osteosarcoma.

  3. Changing cluster composition in cluster randomised controlled trials: design and analysis considerations

    Science.gov (United States)

    2014-01-01

    Background There are many methodological challenges in the conduct and analysis of cluster randomised controlled trials, but one that has received little attention is that of post-randomisation changes to cluster composition. To illustrate this, we focus on the issue of cluster merging, considering the impact on the design, analysis and interpretation of trial outcomes. Methods We explored the effects of merging clusters on study power using standard methods of power calculation. We assessed the potential impacts on study findings of both homogeneous cluster merges (involving clusters randomised to the same arm of a trial) and heterogeneous merges (involving clusters randomised to different arms of a trial) by simulation. To determine the impact on bias and precision of treatment effect estimates, we applied standard methods of analysis to different populations under analysis. Results Cluster merging produced a systematic reduction in study power. This effect depended on the number of merges and was most pronounced when variability in cluster size was at its greatest. Simulations demonstrate that the impact on analysis was minimal when cluster merges were homogeneous, with impact on study power being balanced by a change in observed intracluster correlation coefficient (ICC). We found a decrease in study power when cluster merges were heterogeneous, and the estimate of treatment effect was attenuated. Conclusions Examples of cluster merges found in previously published reports of cluster randomised trials were typically homogeneous rather than heterogeneous. Simulations demonstrated that trial findings in such cases would be unbiased. However, simulations also showed that any heterogeneous cluster merges would introduce bias that would be hard to quantify, as well as having negative impacts on the precision of estimates obtained. Further methodological development is warranted to better determine how to analyse such trials appropriately. Interim recommendations

  4. Changing cluster composition in cluster randomised controlled trials: design and analysis considerations.

    Science.gov (United States)

    Corrigan, Neil; Bankart, Michael J G; Gray, Laura J; Smith, Karen L

    2014-05-24

    There are many methodological challenges in the conduct and analysis of cluster randomised controlled trials, but one that has received little attention is that of post-randomisation changes to cluster composition. To illustrate this, we focus on the issue of cluster merging, considering the impact on the design, analysis and interpretation of trial outcomes. We explored the effects of merging clusters on study power using standard methods of power calculation. We assessed the potential impacts on study findings of both homogeneous cluster merges (involving clusters randomised to the same arm of a trial) and heterogeneous merges (involving clusters randomised to different arms of a trial) by simulation. To determine the impact on bias and precision of treatment effect estimates, we applied standard methods of analysis to different populations under analysis. Cluster merging produced a systematic reduction in study power. This effect depended on the number of merges and was most pronounced when variability in cluster size was at its greatest. Simulations demonstrate that the impact on analysis was minimal when cluster merges were homogeneous, with impact on study power being balanced by a change in observed intracluster correlation coefficient (ICC). We found a decrease in study power when cluster merges were heterogeneous, and the estimate of treatment effect was attenuated. Examples of cluster merges found in previously published reports of cluster randomised trials were typically homogeneous rather than heterogeneous. Simulations demonstrated that trial findings in such cases would be unbiased. However, simulations also showed that any heterogeneous cluster merges would introduce bias that would be hard to quantify, as well as having negative impacts on the precision of estimates obtained. Further methodological development is warranted to better determine how to analyse such trials appropriately. 
    Interim recommendations include avoidance of cluster merges where
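
    The reported loss of power after cluster merging can be illustrated with a common approximation in which each cluster's size is deflated by its own design effect, 1 + (m - 1) x ICC. This is a minimal sketch under stated assumptions, not the authors' power calculation or simulation: the function name, the cluster sizes and the ICC value are made up, and the ICC is held fixed, whereas the abstract notes that the observed ICC can change after homogeneous merges.

```python
def effective_sample_size(cluster_sizes, icc):
    """Sum of each cluster's size deflated by its own design effect."""
    return sum(m / (1 + (m - 1) * icc) for m in cluster_sizes)


icc = 0.05  # hypothetical intracluster correlation coefficient

# Ten clusters of 20 participants each.
before = effective_sample_size([20] * 10, icc)

# The same 200 participants after two clusters merge into one of 40.
after = effective_sample_size([20] * 8 + [40], icc)
```

    The total number of participants is unchanged, yet the effective sample size falls (from about 102.6 to about 95.6 in this toy case), because the merge leaves fewer and more unequal clusters.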

  5. Spatial and temporal clustering of mortality in Digkale HDSS in rural northern South Africa

    Directory of Open Access Journals (Sweden)

    Chifundo Kanjala

    2010-08-01

    Background: Mortality data are frequently presented at the overall population level, possibly obscuring small-scale variations over time and space and between different population sub-groups. Objective: Analysis of mortality data from the Dikgale Health and Demographic Surveillance System, in rural South Africa, over the period 1996–2007, to identify local clustering of mortality among the eight villages in the observed population. Design: Mortality data and person-time of observation were collected annually in an open-cohort population of approximately 8,000 people over 12 years. Poisson regression modelling and space–time clustering analyses were used to identify possible clustering of mortality. Results: Similar patterns of mortality clustering emerged from Poisson regression and space–time clustering analyses after allowing for age and sex. There was no appreciable clustering of mortality among children under 15 years of age nor in adults 50 years and over. For adults aged 15–49 years, there were substantial clustering effects both in time and in space, with mortality increasing during the period observed and particularly so in some locations, which were nearer to local conurbations. Mortality was relatively lower in the vicinity of the local health centre. Conclusions: Although cause-specific mortality data were not available, the rise in mortality in the 15–49-year age group over time and in areas closer to conurbations strongly suggests that the clustering observed was due to the development of HIV/AIDS-related mortality, as seen similarly elsewhere in South Africa. The HIV/AIDS services offered by the local health centre may have contributed to lower relative mortality around that location.

  6. Semi-supervised consensus clustering for gene expression data analysis

    OpenAIRE

    Wang, Yunli; Pan, Youlian

    2014-01-01

    Background Simple clustering methods such as hierarchical clustering and k-means are widely used for gene expression data analysis; but they are unable to deal with noise and high dimensionality associated with the microarray gene expression data. Consensus clustering appears to improve the robustness and quality of clustering results. Incorporating prior knowledge in clustering process (semi-supervised clustering) has been shown to improve the consistency between the data partitioning and do...

  7. Comparison of Three Plot Selection Methods for Estimating Change in Temporally Variable, Spatially Clustered Populations.

    Energy Technology Data Exchange (ETDEWEB)

    Thompson, William L. [Bonneville Power Administration, Portland, OR (US). Environment, Fish and Wildlife

    2001-07-01

    Monitoring population numbers is important for assessing trends and meeting various legislative mandates. However, sampling across time introduces a temporal aspect to survey design in addition to the spatial one. For instance, a sample that is initially representative may lose this attribute if there is a shift in numbers and/or spatial distribution in the underlying population that is not reflected in later sampled plots. Plot selection methods that account for this temporal variability will produce the best trend estimates. Consequently, I used simulation to compare bias and relative precision of estimates of population change among stratified and unstratified sampling designs based on permanent, temporary, and partial replacement plots under varying levels of spatial clustering, density, and temporal shifting of populations. Permanent plots produced more precise estimates of change than temporary plots across all factors. Further, permanent plots performed better than partial replacement plots except for high density (5 and 10 individuals per plot) and 25% - 50% shifts in the population. Stratified designs always produced less precise estimates of population change for all three plot selection methods, and often produced biased change estimates and greatly inflated variance estimates under sampling with partial replacement. Hence, stratification that remains fixed across time should be avoided when monitoring populations that are likely to exhibit large changes in numbers and/or spatial distribution during the study period. Key words: bias; change estimation; monitoring; permanent plots; relative precision; sampling with partial replacement; temporary plots.

  8. Changing cluster composition in cluster randomised controlled trials: design and analysis considerations

    OpenAIRE

    Corrigan, Neil; Bankart, Michael J G; Gray, Laura J; Smith, Karen L

    2014-01-01

    Background There are many methodological challenges in the conduct and analysis of cluster randomised controlled trials, but one that has received little attention is that of post-randomisation changes to cluster composition. To illustrate this, we focus on the issue of cluster merging, considering the impact on the design, analysis and interpretation of trial outcomes. Methods We explored the effects of merging clusters on study power using standard methods of power calculation. We assessed ...

  9. MANNER OF STOCKS SORTING USING CLUSTER ANALYSIS METHODS

    Directory of Open Access Journals (Sweden)

    Jana Halčinová

    2014-06-01

    Full Text Available The aim of the present article is to show the possibility of using the methods of cluster analysis in classification of stocks of finished products. Cluster analysis creates groups (clusters) of finished products according to similarity in demand, i.e. customer requirements for each product. The manner of sorting stocks of finished products by clusters is described in a practical example. The resultant clusters are incorporated into the draft layout of the distribution warehouse.

  10. Advanced analysis of forest fire clustering

    Science.gov (United States)

    Kanevski, Mikhail; Pereira, Mario; Golay, Jean

    2017-04-01

    Analysis of point pattern clustering is an important topic in spatial statistics and for many applications: biodiversity, epidemiology, natural hazards, geomarketing, etc. There are several fundamental approaches used to quantify spatial data clustering using topological, statistical and fractal measures. In the present research, the recently introduced multi-point Morisita index (mMI) is applied to study the spatial clustering of forest fires in Portugal. The data set consists of more than 30000 fire events covering the time period from 1975 to 2013. The distribution of forest fires is very complex and highly variable in space. mMI is a multi-point extension of the classical two-point Morisita index. In essence, mMI is estimated by covering the region under study by a grid and by computing how many times more likely it is that m points selected at random will be from the same grid cell than it would be in the case of a complete random Poisson process. By changing the number of grid cells (size of the grid cells), mMI characterizes the scaling properties of spatial clustering. From mMI, the data intrinsic dimension (fractal dimension) of the point distribution can be estimated as well. In this study, the mMI of forest fires is compared with the mMI of random patterns (RPs) generated within the validity domain defined as the forest area of Portugal. It turns out that the forest fires are highly clustered inside the validity domain in comparison with the RPs. Moreover, they demonstrate different scaling properties at different spatial scales. The results obtained from the mMI analysis are also compared with those of fractal measures of clustering - box counting and sand box counting approaches. REFERENCES Golay J., Kanevski M., Vega Orozco C., Leuenberger M., 2014: The multipoint Morisita index for the analysis of spatial patterns. Physica A, 406, 191-202. Golay J., Kanevski M. 2015: A new estimator of intrinsic dimension based on the multipoint Morisita index
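    The grid-based estimate described in the record above can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes 2-D points already scaled to the unit square and a square grid, and the function name `multipoint_morisita` and its parameters are hypothetical. The index is taken as Q^(m-1) times the sum over cells of n_i(n_i-1)...(n_i-m+1), divided by N(N-1)...(N-m+1), where Q is the number of cells, n_i the count in cell i and N the total number of points.

```python
from collections import Counter
from math import prod


def multipoint_morisita(points, cells_per_side, m=2):
    """m-point Morisita index for 2-D points in the unit square (assumed input)."""
    n_total = len(points)
    q = cells_per_side ** 2  # Q: number of grid cells

    # Count points per grid cell; clamp so that x == 1.0 falls in the last cell.
    counts = Counter(
        (min(int(x * cells_per_side), cells_per_side - 1),
         min(int(y * cells_per_side), cells_per_side - 1))
        for x, y in points
    )

    def falling(n):
        # n (n-1) ... (n-m+1); evaluates to 0 whenever n < m
        return prod(n - j for j in range(m))

    numerator = sum(falling(n) for n in counts.values())
    return q ** (m - 1) * numerator / falling(n_total)


# Four points packed into one cell versus one point per cell (2 x 2 grid).
clustered = [(0.1, 0.1), (0.2, 0.2), (0.3, 0.1), (0.4, 0.4)]
spread = [(0.25, 0.25), (0.75, 0.25), (0.25, 0.75), (0.75, 0.75)]
index_clustered = multipoint_morisita(clustered, cells_per_side=2)
index_spread = multipoint_morisita(spread, cells_per_side=2)
```

    Values well above 1 indicate clustering relative to a random pattern; repeating the calculation for several values of `cells_per_side` gives the scaling behaviour the abstract refers to.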

  11. Cluster Analysis in Rapeseed (Brassica Napus L.)

    International Nuclear Information System (INIS)

    Mahasi, J.M

    2002-01-01

    With a widening edible oil deficit, Kenya has become increasingly dependent on imported edible oils. Many oilseed crops (e.g. sunflower, soya beans, rapeseed/mustard, sesame, groundnuts etc) can be grown in Kenya. But oilseed rape is preferred because it is very high yielding (1.5 tons-4.0 tons/ha) with oil content of 42-46%. Other uses include fitting in various cropping systems as relay/inter crops, rotational crops, trap crops and fodder. It is soft seeded hence oil extraction is relatively easy. The meal is high in protein and very useful in livestock supplementation. Rapeseed can be straight combined using adjusted wheat combines. The priority is to expand domestic oilseed production, hence the need to introduce improved rapeseed germplasm from other countries. The success of any crop improvement programme depends on the extent of genetic diversity in the material. Hence, it is essential to understand the adaptation of introduced genotypes and the similarities if any among them. Evaluation trials were carried out on 17 rapeseed genotypes (nine of Canadian origin and eight of European origin) grown at 4 locations namely Endebess, Njoro, Timau and Mau Narok in three years (1992, 1993 and 1994). Results for 1993 were discarded due to severe drought. An analysis of variance was carried out only on seed yields and the treatments were found to be significantly different. Cluster analysis was then carried out on mean seed yields and based on this analysis, only one major group exists within the material. In 1992, varieties 2,3,8 and 9 didn't fall in the same cluster as the rest. Variety 8 was the only one not classified with the rest of the Canadian varieties. Three European varieties (2,3 and 9) were however not classified with the others. In 1994, varieties 10 and 6 didn't fall in the major cluster. Of these two, variety 10 is of Canadian origin. Varieties were more similar in 1994 than 1992 due to favorable weather. It is evident that genotypes from different geographical

  12. Chaotic map clustering algorithm for EEG analysis

    Science.gov (United States)

    Bellotti, R.; De Carlo, F.; Stramaglia, S.

    2004-03-01

    The non-parametric chaotic map clustering algorithm has been applied to the analysis of electroencephalographic signals, in order to recognize Huntington's disease, one of the most dangerous pathologies of the central nervous system. The performance of the method has been compared with those obtained through parametric algorithms, such as K-means and deterministic annealing, and a supervised multi-layer perceptron. While supervised neural networks need a training phase, performed by means of data tagged by the genetic test, and the parametric methods require a prior choice of the number of classes to find, the chaotic map clustering gives natural evidence of the pathological class, without any training or supervision, thus providing a new efficient methodology for the recognition of patterns affected by Huntington's disease.

  13. Clustering Analysis within Text Classification Techniques

    Directory of Open Access Journals (Sweden)

    Madalina ZURINI

    2011-01-01

    Full Text Available The paper presents a personal approach to the main applications of classification in the area of the knowledge-based society, by means of methods and techniques widespread in the literature. Text classification is highlighted in chapter two, where the main techniques used are described, along with an integrated taxonomy. The transition is made through the concept of spatial representation. Starting from the basic elements of geometry and artificial intelligence analysis, spatial representation models are presented. Using a parallel approach, the spatial dimension is introduced in the process of classification. The main clustering methods are described in an aggregated taxonomy. As an example, spam and ham words are clustered and spatially represented, and the concepts of spam, ham, common and linkage words are presented and explained in the xOy space representation.

  14. Tweets clustering using latent semantic analysis

    Science.gov (United States)

    Rasidi, Norsuhaili Mahamed; Bakar, Sakhinah Abu; Razak, Fatimah Abdul

    2017-04-01

    Social media are becoming overloaded with information due to the increasing number of information feeds. Unlike other social media, Twitter allows users to broadcast a short message called a "tweet". In this study, we extract tweets related to MH370 for a certain period of time. In this paper, we present an overview of our approach for tweet clustering to analyze the users' responses toward the tragedy of MH370. The tweets were clustered based on the frequency of terms obtained from the classification process. The method we used for the text classification is Latent Semantic Analysis. As a result, there are two types of tweets that respond to the MH370 tragedy: emotional and non-emotional. We show some of our initial results to demonstrate the effectiveness of our approach.

  15. Fuzzy cluster analysis of high-field functional MRI data.

    Science.gov (United States)

    Windischberger, Christian; Barth, Markus; Lamm, Claus; Schroeder, Lee; Bauer, Herbert; Gur, Ruben C; Moser, Ewald

    2003-11-01

    Functional magnetic resonance imaging (fMRI) based on blood-oxygen level dependent (BOLD) contrast today is an established brain research method and quickly gains acceptance for complementary clinical diagnosis. However, neither the basic mechanisms like coupling between neuronal activation and haemodynamic response are known exactly, nor can the various artifacts be predicted or controlled. Thus, modeling functional signal changes is non-trivial and exploratory data analysis (EDA) may be rather useful. In particular, identification and separation of artifacts as well as quantification of expected, i.e. stimulus correlated, and novel information on brain activity is important for both, new insights in neuroscience and future developments in functional MRI of the human brain. After an introduction on fuzzy clustering and very high-field fMRI we present several examples where fuzzy cluster analysis (FCA) of fMRI time series helps to identify and locally separate various artifacts. We also present and discuss applications and limitations of fuzzy cluster analysis in very high-field functional MRI: differentiate temporal patterns in MRI using (a) a test object with static and dynamic parts, (b) artifacts due to gross head motion artifacts. Using a synthetic fMRI data set we quantitatively examine the influences of relevant FCA parameters on clustering results in terms of receiver-operator characteristics (ROC) and compare them with a commonly used model-based correlation analysis (CA) approach. The application of FCA in analyzing in vivo fMRI data is shown for (a) a motor paradigm, (b) data from multi-echo imaging, and (c) a fMRI study using mental rotation of three-dimensional cubes. We found that differentiation of true "neural" from false "vascular" activation is possible based on echo time dependence and specific activation levels, as well as based on their signal time-course. 
Exploratory data analysis methods in general and fuzzy cluster analysis in particular may

  16. Temporal and spatial evolution of discrete auroral arcs as seen by Cluster

    Directory of Open Access Journals (Sweden)

    S. Figueiredo

    2005-10-01

    Full Text Available Two event studies are presented in this paper where intense convergent electric fields, with mapped intensities up to 1350 mV/m, are measured in the auroral upward current region by the Cluster spacecraft, at altitudes between 3 and 5 Earth radii. Both events are from May 2003, Southern Hemisphere, with equatorward crossings by the Cluster spacecraft of the pre-midnight auroral oval. Event 1 occurs during the end of the recovery phase of a strong substorm. A system of auroral arcs associated with convergent electric field structures, with a maximum perpendicular potential drop of about 10 kV, and upflowing field-aligned currents with densities of 3 µA/m² (mapped to the ionosphere), was detected at the boundary between the Plasma Sheet Boundary Layer (PSBL) and the Plasma Sheet (PS). The auroral arc structures evolve in shape and in magnitude on a timescale of tens of minutes, merging, broadening and intensifying, until finally fading away after about 50 min. Throughout this time, both the PS region and the auroral arc structure in its poleward part remain relatively fixed in space, reflecting the rather quiet auroral conditions during the end of the substorm. The auroral upward acceleration region is shown for this event to extend beyond 3.9 Earth radii altitude. Event 2 occurs during a more active period associated with the expansion phase of a moderate substorm. Images from the Defense Meteorological Satellite Program (DMSP) F13 spacecraft show that the Cluster spacecraft crossed the horn region of a surge-type aurora. Conjugated with the Cluster spacecraft crossing above the surge horn, the South Pole All Sky Imager recorded the motion and the temporal evolution of an east-west aligned auroral arc, 30 to 50 km wide. Intense electric field variations are measured by the Cluster spacecraft when crossing above the auroral arc structure, collocated with the density gradient at the PS poleward boundary, and coupled to intense upflowing field

  17. Natural Time and Nowcasting Earthquakes: Are Large Global Earthquakes Temporally Clustered?

    Science.gov (United States)

    Luginbuhl, Molly; Rundle, John B.; Turcotte, Donald L.

    2018-01-01

    The objective of this paper is to analyze the temporal clustering of large global earthquakes with respect to natural time, or interevent count, as opposed to regular clock time. To do this, we use two techniques: (1) nowcasting, a new method of statistically classifying seismicity and seismic risk, and (2) time series analysis of interevent counts. We chose the sequences of M_λ ≥ 7.0 and M_λ ≥ 8.0 earthquakes from the global centroid moment tensor (CMT) catalog from 2004 to 2016 for analysis. A significant number of these earthquakes will be aftershocks of the largest events, but no satisfactory method of declustering the aftershocks in clock time is available. A major advantage of using natural time is that it eliminates the need for declustering aftershocks. The event count we utilize is the number of small earthquakes that occur between large earthquakes. The small earthquake magnitude is chosen to be as small as possible, such that the catalog is still complete based on the Gutenberg-Richter statistics. For the CMT catalog, starting in 2004, we found the completeness magnitude to be M_σ ≥ 5.1. For the nowcasting method, the cumulative probability distribution of these interevent counts is obtained. We quantify the distribution using the exponent, β, of the best fitting Weibull distribution; β = 1 for a random (exponential) distribution. We considered 197 earthquakes with M_λ ≥ 7.0 and found β = 0.83 ± 0.08. We considered 15 earthquakes with M_λ ≥ 8.0, but this number was considered too small to generate a meaningful distribution. For comparison, we generated synthetic catalogs of earthquakes that occur randomly with the Gutenberg-Richter frequency-magnitude statistics. We considered a synthetic catalog of 1.97 × 10^5 M_λ ≥ 7.0 earthquakes and found β = 0.99 ± 0.01. The random catalog converted to natural time was also random. We then generated 1.5 × 10^4 synthetic catalogs with 197 M_λ ≥ 7.0 in each catalog
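    The "natural time" event count described in the record above (the number of small earthquakes between successive large ones) can be illustrated with a short sketch. The thresholds 7.0 and 5.1 are the ones quoted in the abstract; everything else is an assumption for illustration: the function name `interevent_counts` is hypothetical, the input is taken to be a chronologically ordered list of magnitudes, and events before the first large earthquake are ignored.

```python
def interevent_counts(magnitudes, large=7.0, small=5.1):
    """Count small events (small <= M < large) between successive large events.

    `magnitudes` is assumed to be in chronological order. Events below the
    completeness threshold `small` are skipped entirely.
    """
    counts = []
    current = None  # stays None until the first large event has been seen
    for mag in magnitudes:
        if mag >= large:
            if current is not None:
                counts.append(current)  # close the interval ending at this event
            current = 0  # start counting towards the next large event
        elif mag >= small and current is not None:
            current += 1
    return counts


# Toy catalogue, not real data: four large events give three interevent counts.
toy_catalog = [5.5, 7.2, 5.3, 6.0, 4.9, 7.5, 8.1, 5.2, 7.0]
toy_counts = interevent_counts(toy_catalog)
```

    The resulting list of counts is the sample whose cumulative distribution would then be compared with a Weibull fit, as the abstract describes; the fitting step itself is not shown here.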

  18. Natural Time and Nowcasting Earthquakes: Are Large Global Earthquakes Temporally Clustered?

    Science.gov (United States)

    Luginbuhl, Molly; Rundle, John B.; Turcotte, Donald L.

    2018-02-01

    The objective of this paper is to analyze the temporal clustering of large global earthquakes with respect to natural time, or interevent count, as opposed to regular clock time. To do this, we use two techniques: (1) nowcasting, a new method of statistically classifying seismicity and seismic risk, and (2) time series analysis of interevent counts. We chose the sequences of M_λ ≥ 7.0 and M_λ ≥ 8.0 earthquakes from the global centroid moment tensor (CMT) catalog from 2004 to 2016 for analysis. A significant number of these earthquakes will be aftershocks of the largest events, but no satisfactory method of declustering the aftershocks in clock time is available. A major advantage of using natural time is that it eliminates the need for declustering aftershocks. The event count we utilize is the number of small earthquakes that occur between large earthquakes. The small earthquake magnitude is chosen to be as small as possible, such that the catalog is still complete based on the Gutenberg-Richter statistics. For the CMT catalog, starting in 2004, we found the completeness magnitude to be M_σ ≥ 5.1. For the nowcasting method, the cumulative probability distribution of these interevent counts is obtained. We quantify the distribution using the exponent, β, of the best fitting Weibull distribution; β = 1 for a random (exponential) distribution. We considered 197 earthquakes with M_λ ≥ 7.0 and found β = 0.83 ± 0.08. We considered 15 earthquakes with M_λ ≥ 8.0, but this number was considered too small to generate a meaningful distribution. For comparison, we generated synthetic catalogs of earthquakes that occur randomly with the Gutenberg-Richter frequency-magnitude statistics. We considered a synthetic catalog of 1.97 × 10^5 M_λ ≥ 7.0 earthquakes and found β = 0.99 ± 0.01. The random catalog converted to natural time was also random. We then generated 1.5 × 10^4 synthetic catalogs with 197 M_λ ≥ 7.0 in each catalog and

  19. Statistical trend analysis methods for temporal phenomena

    International Nuclear Information System (INIS)

    Lehtinen, E.; Pulkkinen, U.; Poern, K.

    1997-04-01

    We consider point events occurring in a random way in time. In many applications the pattern of occurrence is of intrinsic interest as indicating a trend or some other systematic feature in the rate of occurrence. The purpose of this report is to survey briefly different statistical trend analysis methods and illustrate their applicability to temporal phenomena in particular. The trend testing of point events is usually seen as the testing of the hypotheses concerning the intensity of the occurrence of events. When the intensity function is parametrized, the testing of trend is a typical parametric testing problem. In industrial applications the operational experience generally does not suggest any specified model and method in advance. Therefore, and particularly, if the Poisson process assumption is very questionable, it is desirable to apply tests that are valid for a wide variety of possible processes. The alternative approach for trend testing is to use some non-parametric procedure. In this report we have presented four non-parametric tests: The Cox-Stuart test, the Wilcoxon signed ranks test, the Mann test, and the exponential ordered scores test. In addition to the classical parametric and non-parametric approaches we have also considered the Bayesian trend analysis. First we discuss a Bayesian model, which is based on a power law intensity model. The Bayesian statistical inferences are based on the analysis of the posterior distribution of the trend parameters, and the probability of trend is immediately seen from these distributions. We applied some of the methods discussed in an example case. It should be noted, that this report is a feasibility study rather than a scientific evaluation of statistical methods, and the examples can only be seen as demonstrations of the methods
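    Of the non-parametric tests named in the record above, the Cox-Stuart test is the simplest to illustrate. The sketch below is a generic textbook version, not the report's implementation: it pairs each value in the first half of a series with the corresponding value in the second half and applies a two-sided sign test to the differences. The function name `cox_stuart` and the choice to drop the middle value of an odd-length series are assumptions.

```python
from math import comb


def cox_stuart(series):
    """Two-sided Cox-Stuart sign test for trend.

    Returns (positive differences, informative pairs, p-value).
    """
    n = len(series)
    half = n // 2
    first = series[:half]
    second = series[n - half:]  # skips the middle value when n is odd

    # Tied pairs carry no information about direction and are discarded.
    diffs = [b - a for a, b in zip(first, second) if b != a]
    k = len(diffs)
    positives = sum(d > 0 for d in diffs)

    # Under "no trend" the number of positive signs is Binomial(k, 0.5).
    tail = min(positives, k - positives)
    p_value = min(1.0, 2 * sum(comb(k, i) for i in range(tail + 1)) / 2 ** k)
    return positives, k, p_value


# A strictly increasing toy series: all five paired differences are positive.
result = cox_stuart(list(range(1, 11)))
```

    A small p-value suggests a monotone trend; with only five informative pairs, as in the toy series, the smallest attainable two-sided p-value is 2/32.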

  20. EM cluster analysis for categorical data

    Czech Academy of Sciences Publication Activity Database

    Grim, Jiří

    2006-01-01

    Vol. 44, No. 4109 (2006), pp. 640-648. ISSN 0302-9743. [Joint IAPR International Workshops SSPR 2006 and SPR 2006. Hong Kong, 17.08.2006-19.08.2006] R&D Projects: GA AV ČR 1ET400750407; GA MŠk 1M0572 EU Projects: European Commission(XE) 507752 - MUSCLE Institutional research plan: CEZ:AV0Z10750506 Keywords: cluster analysis * categorical data * EM algorithm Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 0.402, year: 2005

  1. Somatosensory nociceptive characteristics differentiate subgroups in people with chronic low back pain: a cluster analysis.

    Science.gov (United States)

    Rabey, Martin; Slater, Helen; OʼSullivan, Peter; Beales, Darren; Smith, Anne

    2015-10-01

    The objectives of this study were to explore the existence of subgroups in a cohort with chronic low back pain (n = 294) based on the results of multimodal sensory testing and profile subgroups on demographic, psychological, lifestyle, and general health factors. Bedside (2-point discrimination, brush, vibration and pinprick perception, temporal summation on repeated monofilament stimulation) and laboratory (mechanical detection threshold, pressure, heat and cold pain thresholds, conditioned pain modulation) sensory testing were examined at wrist and lumbar sites. Data were entered into principal component analysis, and 5 component scores were entered into latent class analysis. Three clusters, with different sensory characteristics, were derived. Cluster 1 (31.9%) was characterised by average to high temperature and pressure pain sensitivity. Cluster 2 (52.0%) was characterised by average to high pressure pain sensitivity. Cluster 3 (16.0%) was characterised by low temperature and pressure pain sensitivity. Temporal summation occurred significantly more frequently in cluster 1. Subgroups were profiled on pain intensity, disability, depression, anxiety, stress, life events, fear avoidance, catastrophizing, perception of the low back region, comorbidities, body mass index, multiple pain sites, sleep, and activity levels. Clusters 1 and 2 had a significantly greater proportion of female participants and higher depression and sleep disturbance scores than cluster 3. The proportion of participants undertaking Low back pain, therefore, does not appear to be homogeneous. Pain mechanisms relating to presentations of each subgroup were postulated. Future research may investigate prognoses and interventions tailored towards these subgroups.

  2. Adaptive Fuzzy Consensus Clustering Framework for Clustering Analysis of Cancer Data.

    Science.gov (United States)

    Yu, Zhiwen; Chen, Hantao; You, Jane; Liu, Jiming; Wong, Hau-San; Han, Guoqiang; Li, Le

    2015-01-01

    Performing clustering analysis is one of the important research topics in cancer discovery using gene expression profiles, which is crucial in facilitating the successful diagnosis and treatment of cancer. While there are quite a number of research works which perform tumor clustering, few of them consider how to incorporate fuzzy theory together with an optimization process into a consensus clustering framework to improve the performance of clustering analysis. In this paper, we first propose a random double clustering based cluster ensemble framework (RDCCE) to perform tumor clustering based on gene expression data. Specifically, RDCCE generates a set of representative features using a randomly selected clustering algorithm in the ensemble, and then assigns samples to their corresponding clusters based on the grouping results. In addition, we also introduce the random double clustering based fuzzy cluster ensemble framework (RDCFCE), which is designed to improve the performance of RDCCE by integrating the newly proposed fuzzy extension model into the ensemble framework. RDCFCE adopts the normalized cut algorithm as the consensus function to summarize the fuzzy matrices generated by the fuzzy extension models, partition the consensus matrix, and obtain the final result. Finally, adaptive RDCFCE (A-RDCFCE) is proposed to optimize RDCFCE and improve the performance of RDCFCE further by adopting a self-evolutionary process (SEPP) for the parameter set. Experiments on real cancer gene expression profiles indicate that RDCFCE and A-RDCFCE work well on these data sets, and outperform most of the state-of-the-art tumor clustering algorithms.

  3. A Temporal Extension to Traditional Empirical Orthogonal Function Analysis

    DEFF Research Database (Denmark)

    Nielsen, Allan Aasbjerg; Hilger, Klaus Baggesen; Andersen, Ole Baltazar

    2002-01-01

    This paper describes the application of temporal maximum autocorrelation factor analysis to global monthly mean values of 1996-1997 sea surface temperature (SST) and sea surface height (SSH) data. This type of analysis can be considered as an extension of traditional empirical orthogonal function (EOF) analysis, which provides a non-temporal analysis of one variable over time. The temporal extension proves its strength in separating the signals at different periods in an analysis of relevant oceanographic properties related to one of the largest El Niño events ever recorded.

  4. Epidemiological features and spatio-temporal clusters of hand-foot-mouth disease at town level in Fuyang, Anhui Province, China (2008-2013).

    Science.gov (United States)

    Mao, Y J; Sun, L; Xie, J G; Yau, K K W

    2016-11-01

    Hand-foot-mouth disease (HFMD) is a frequently occurring epidemic and has been an important cause of childhood mortality in China. Given the disease's significant impact nationwide, the epidemiological characteristics and spatio-temporal clusters in Fuyang from 2008 to 2013 were analysed in this study. The disease exhibits strong seasonality with a rising incidence. Of the reported HFMD cases, 63·7% were male and 95·2% were preschool children living at home. The onset of HFMD is age-dependent and exhibits a 12-month periodicity, with 12-, 24- and 36-month-old children being the most frequently affected groups. Across the first 60 months of life, children born in April [relative risk (RR) 8·18], May (RR 9·79) and June (RR 8·21) exhibited an elevated infection risk of HFMD relative to January-born children; the relative risk compared with the reference (January-born) group was highest for children aged 24 months born in May (RR 34·85). Of laboratory-confirmed cases, enterovirus 71 (EV71), coxsackie A16 (Cox A16) and other enteroviruses accounted for 60·1%, 7·1% and 32·8%, respectively. Spatio-temporal analysis identified one most likely cluster and several secondary clusters each year. The centre of the most likely cluster was found in different regions in Fuyang. Implications of our findings for current and future public health interventions are discussed.

  5. Temporal and spatial evolution of discrete auroral arcs as seen by Cluster

    Directory of Open Access Journals (Sweden)

    S. Figueiredo

    2005-10-01

    Full Text Available Two event studies are presented in this paper where intense convergent electric fields, with mapped intensities up to 1350 mV/m, are measured in the auroral upward current region by the Cluster spacecraft, at altitudes between 3 and 5 Earth radii. Both events are from May 2003, Southern Hemisphere, with equatorward crossings by the Cluster spacecraft of the pre-midnight auroral oval.

    Event 1 occurs during the end of the recovery phase of a strong substorm. A system of auroral arcs associated with convergent electric field structures, with a maximum perpendicular potential drop of about 10 kV, and upflowing field-aligned currents with densities of 3 µA/m² (mapped to the ionosphere), was detected at the boundary between the Plasma Sheet Boundary Layer (PSBL) and the Plasma Sheet (PS). The auroral arc structures evolve in shape and in magnitude on a timescale of tens of minutes, merging, broadening and intensifying, until finally fading away after about 50 min. Throughout this time, both the PS region and the auroral arc structure in its poleward part remain relatively fixed in space, reflecting the rather quiet auroral conditions during the end of the substorm. The auroral upward acceleration region is shown for this event to extend beyond 3.9 Earth radii altitude.

    Event 2 occurs during a more active period associated with the expansion phase of a moderate substorm. Images from the Defense Meteorological Satellite Program (DMSP) F13 spacecraft show that the Cluster spacecraft crossed the horn region of a surge-type aurora. Conjugated with the Cluster spacecraft crossing above the surge horn, the South Pole All Sky Imager recorded the motion and the temporal evolution of an east-west aligned auroral arc, 30 to 50 km wide. Intense electric field variations are measured by the Cluster spacecraft when crossing above the auroral arc structure, collocated with the

  6. Application of cluster analysis for data driven market segmentation ...

    African Journals Online (AJOL)

    This research work sets out to capture which standards of application of cluster analysis have emerged in the academic marketing literature, compare those standards of applying the methodological knowledge about clustering procedures, and delineate sudden changes in clustering habits. These goals are achieved by ...

  7. Cluster analysis of word frequency dynamics

    Science.gov (United States)

    Maslennikova, Yu S.; Bochkarev, V. V.; Belashova, I. A.

    2015-01-01

    This paper describes the analysis and modelling of word usage frequency time series. In a previous study, an assumption was put forward that all word usage frequencies have uniform dynamics approaching the shape of a Gaussian function. This assumption can be checked using the frequency dictionaries of the Google Books Ngram database. This database includes 5.2 million books published between 1500 and 2008. The corpus contains over 500 billion words in American English, British English, French, German, Spanish, Russian, Hebrew, and Chinese. We clustered time series of word usage frequencies using a Kohonen neural network. The similarity between input vectors was estimated using several algorithms. As a result of the neural network training procedure, more than ten different forms of time series were found. They describe the dynamics of word usage frequencies from birth to death of individual words. Different groups of word forms were found to have different dynamics of word usage frequency variations.

  8. Cluster analysis of word frequency dynamics

    International Nuclear Information System (INIS)

    Maslennikova, Yu S; Bochkarev, V V; Belashova, I A

    2015-01-01

    This paper describes the analysis and modelling of word usage frequency time series. In a previous study, it was assumed that all word usage frequencies have uniform dynamics approaching the shape of a Gaussian function. This assumption can be checked using the frequency dictionaries of the Google Books Ngram database. This database includes 5.2 million books published between 1500 and 2008. The corpus contains over 500 billion words in American English, British English, French, German, Spanish, Russian, Hebrew, and Chinese. We clustered time series of word usage frequencies using a Kohonen neural network. The similarity between input vectors was estimated using several algorithms. As a result of the neural network training procedure, more than ten different forms of time series were found. They describe the dynamics of word usage frequencies from birth to death of individual words. Different groups of word forms were found to have different dynamics of word usage frequency variations.

  9. Clustering of Multi-Temporal Fully Polarimetric L-Band SAR Data for Agricultural Land Cover Mapping

    Science.gov (United States)

    Tamiminia, H.; Homayouni, S.; Safari, A.

    2015-12-01

    Recently, the unique capabilities of Polarimetric Synthetic Aperture Radar (PolSAR) sensors have made them an important and efficient tool for natural resources and environmental applications, such as land cover and crop classification. The aim of this paper is to classify multi-temporal fully polarimetric SAR data over an agricultural region using a kernel-based fuzzy C-means clustering method. This method starts by transforming the input data into a higher-dimensional space using kernel functions and then clustering them in the feature space. The feature space, due to its inherent properties, is able to take into account the nonlinear and complex nature of polarimetric data. Several SAR polarimetric features were extracted using target decomposition algorithms. Features from the Cloude-Pottier, Freeman-Durden and Yamaguchi algorithms were used as inputs for the clustering. This method was applied to multi-temporal UAVSAR L-band images acquired over an agricultural area near Winnipeg, Canada, during June and July 2012. The results demonstrate the efficiency of this approach with respect to the classical methods. In addition, using multi-temporal data in the clustering process helped to investigate the phenological cycle of plants and significantly improved the performance of agricultural land cover mapping.

  10. From virtual clustering analysis to self-consistent clustering analysis: a mathematical study

    Science.gov (United States)

    Tang, Shaoqiang; Zhang, Lei; Liu, Wing Kam

    2018-03-01

    In this paper, we propose a new homogenization algorithm, virtual clustering analysis (VCA), as well as provide a mathematical framework for the recently proposed self-consistent clustering analysis (SCA) (Liu et al. in Comput Methods Appl Mech Eng 306:319-341, 2016). In the mathematical theory, we clarify the key assumptions and ideas of VCA and SCA, and derive the continuous and discrete Lippmann-Schwinger equations. Based on a key postulation of "once response similarly, always response similarly", clustering is performed in an offline stage by machine learning techniques (k-means and SOM), and facilitates substantial reduction of computational complexity in an online predictive stage. The clear mathematical setup allows for the first time a convergence study of clustering refinement in one space dimension. Convergence is proved rigorously, and found to be of second order from numerical investigations. Furthermore, we propose to suitably enlarge the domain in VCA, such that the boundary terms may be neglected in the Lippmann-Schwinger equation, by virtue of Saint-Venant's principle. In contrast, they were not obtained in the original SCA paper, and we discover these terms may well be responsible for the numerical dependency on the choice of reference material property. Since VCA enhances the accuracy by overcoming the modeling error, and reduces the numerical cost by avoiding an outer loop iteration for attaining the material property consistency in SCA, its efficiency is expected to be even higher than that of the recently proposed SCA algorithm.

  11. A Dimensionality Reduction-Based Multi-Step Clustering Method for Robust Vessel Trajectory Analysis

    Directory of Open Access Journals (Sweden)

    Huanhuan Li

    2017-08-01

    The Shipboard Automatic Identification System (AIS) is crucial for navigation safety and maritime surveillance, and data mining and pattern analysis of AIS information have attracted considerable attention in terms of both basic research and practical applications. Clustering of spatio-temporal AIS trajectories can be used to identify abnormal patterns and mine customary route data for transportation safety. Thus, the capacities of navigation safety and maritime traffic monitoring could be enhanced correspondingly. However, trajectory clustering is often sensitive to undesirable outliers and is essentially more complex compared with traditional point clustering. To overcome this limitation, a multi-step trajectory clustering method is proposed in this paper for robust AIS trajectory clustering. In particular, Dynamic Time Warping (DTW), a similarity measurement method, is introduced in the first step to measure the distances between different trajectories. The calculated distances, inversely proportional to the similarities, constitute a distance matrix in the second step. Furthermore, as a widely-used dimensionality reduction method, Principal Component Analysis (PCA) is exploited to decompose the obtained distance matrix. In particular, the top k principal components with above 95% accumulative contribution rate are extracted by PCA, and the number of the centers k is chosen. The k centers are found by the improved automatic center selection algorithm. In the last step, the improved center clustering algorithm with k clusters is implemented on the distance matrix to achieve the final AIS trajectory clustering results. In order to improve the accuracy of the proposed multi-step clustering algorithm, an automatic algorithm for choosing the k clusters is developed according to the similarity distance. Numerous experiments on realistic AIS trajectory datasets in the bridge area waterway and Mississippi River have been implemented to compare our

  12. A Dimensionality Reduction-Based Multi-Step Clustering Method for Robust Vessel Trajectory Analysis.

    Science.gov (United States)

    Li, Huanhuan; Liu, Jingxian; Liu, Ryan Wen; Xiong, Naixue; Wu, Kefeng; Kim, Tai-Hoon

    2017-08-04

    The Shipboard Automatic Identification System (AIS) is crucial for navigation safety and maritime surveillance, and data mining and pattern analysis of AIS information have attracted considerable attention in terms of both basic research and practical applications. Clustering of spatio-temporal AIS trajectories can be used to identify abnormal patterns and mine customary route data for transportation safety. Thus, the capacities of navigation safety and maritime traffic monitoring could be enhanced correspondingly. However, trajectory clustering is often sensitive to undesirable outliers and is essentially more complex compared with traditional point clustering. To overcome this limitation, a multi-step trajectory clustering method is proposed in this paper for robust AIS trajectory clustering. In particular, Dynamic Time Warping (DTW), a similarity measurement method, is introduced in the first step to measure the distances between different trajectories. The calculated distances, inversely proportional to the similarities, constitute a distance matrix in the second step. Furthermore, as a widely-used dimensionality reduction method, Principal Component Analysis (PCA) is exploited to decompose the obtained distance matrix. In particular, the top k principal components with above 95% accumulative contribution rate are extracted by PCA, and the number of the centers k is chosen. The k centers are found by the improved automatic center selection algorithm. In the last step, the improved center clustering algorithm with k clusters is implemented on the distance matrix to achieve the final AIS trajectory clustering results. In order to improve the accuracy of the proposed multi-step clustering algorithm, an automatic algorithm for choosing the k clusters is developed according to the similarity distance. Numerous experiments on realistic AIS trajectory datasets in the bridge area waterway and Mississippi River have been implemented to compare our proposed method with
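
    The first two steps described in this abstract (pairwise DTW distances collected into a distance matrix) can be sketched as follows. This is only an illustration under stated assumptions: it uses the textbook dynamic-programming form of DTW on 2D position fixes, not the authors' implementation, and the function names and sample tracks are hypothetical.

```python
import math

def dtw_distance(a, b):
    """Textbook dynamic-programming DTW between two sequences of 2D points."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = cheapest alignment cost of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean point-to-point distance
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def distance_matrix(trajectories):
    """Symmetric matrix of pairwise DTW distances."""
    k = len(trajectories)
    D = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1, k):
            D[i][j] = D[j][i] = dtw_distance(trajectories[i], trajectories[j])
    return D

# Hypothetical toy tracks (not AIS data).
tracks = [
    [(0, 0), (1, 0), (2, 0)],
    [(0, 0), (1, 0), (1, 0), (2, 0)],  # same route, one repeated position fix
    [(0, 1), (1, 1), (2, 1)],          # parallel route, offset by 1
]
D = distance_matrix(tracks)
```

    In this toy case the repeated fix does not separate the first two tracks, because DTW is allowed to align one point with several; the offset route stays at a nonzero distance from both.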

  13. CLUSTERING ANALYSIS OF OFFICER'S BEHAVIOURS IN LONDON POLICE FOOT PATROL ACTIVITIES

    Directory of Open Access Journals (Sweden)

    J. Shen

    2015-07-01

    In this short paper we present a framework for the conceptual representation and clustering analysis of police officers' patrol patterns obtained from mining their raw movement trajectory data. This has been achieved by a model developed to account for the spatio-temporal dynamics of human movements by incorporating both the behaviour features of the travellers and the semantic meaning of the environment they are moving in. Hence, the similarity metric of traveller behaviours is jointly defined according to the stay time allocation in each spatio-temporal region of interest (ST-ROI) to support clustering analysis of patrol behaviours. The proposed framework enables the analysis of behaviour and preferences at a higher level based on raw movement trajectories. The model is first applied to police patrol data provided by the Metropolitan Police and will be tested on other types of datasets afterwards.

  14. An analysis of hospital brand mark clusters.

    Science.gov (United States)

    Vollmers, Stacy M; Miller, Darryl W; Kilic, Ozcan

    2010-07-01

    This study analyzed brand mark clusters (i.e., various types of brand marks displayed in combination) used by hospitals in the United States. The brand marks were assessed against several normative criteria for creating brand marks that are memorable and that elicit positive affect. Overall, results show a reasonably high level of adherence to many of these normative criteria. Many of the clusters exhibited pictorial elements that reflected benefits and that were conceptually consistent with the verbal content of the cluster. Also, many clusters featured icons that were balanced and moderately complex. However, only a few contained interactive imagery or taglines communicating benefits.

  15. Characterizing the Spatio-Temporal Pattern of Land Surface Temperature through Time Series Clustering: Based on the Latent Pattern and Morphology

    Directory of Open Access Journals (Sweden)

    Huimin Liu

    2018-04-01

    Land Surface Temperature (LST) is a critical component to understand the impact of urbanization on the urban thermal environment. Previous studies were inclined to apply only one snapshot to analyze the pattern and dynamics of LST without considering the non-stationarity in the temporal domain, or to focus on the diurnal, seasonal, and annual pattern analysis of LST, which has limited support for the understanding of how LST varies with the advancing of urbanization. This paper presents a workflow to extract the spatio-temporal pattern of LST through time series clustering by focusing on the LST of Wuhan, China, from 2002 to 2017 at a 3-year time interval with 8-day MODerate-resolution Imaging Spectroradiometer (MODIS) satellite image products. The Latent pattern of LST (LLST) generated by non-parametric Multi-Task Gaussian Process Modeling (MTGP) and the Multi-Scale Shape Index (MSSI), which characterizes the morphology of LLST, are coupled for pattern recognition. Specifically, spatio-temporal patterns are discovered after the extraction of spatial patterns conducted by the incorporation of k-means and the Back-Propagation neural networks (BP-Net). The spatial patterns of the 6 years form a basic understanding about the corresponding temporal variances. For spatio-temporal pattern recognition, LLSTs and MSSIs of the 6 years are regarded as geo-referenced time series. Multiple algorithms, including traditional k-means with Euclidean Distance (ED), shape-based k-means with the constrained Dynamic Time Warping (cDTW) distance measure and the Dynamic Time Warping Barycenter Averaging (DBA) centroid computation method (k-cDBA), and k-shape, are applied. Ten external indexes are employed to evaluate the performance of the three algorithms and reveal k-cDBA as the optimal time series clustering algorithm for our study. The study area is divided into 17 geographical time series clusters which respectively illustrate heterogeneous temporal dynamics of LST

  16. Smartness and Italian Cities. A Cluster Analysis

    Directory of Open Access Journals (Sweden)

    Flavio Boscacci

    2014-05-01

    Smart cities have been recently recognized as the most pleasing and attractive places to live in; due to this, both scholars and policy-makers pay close attention to this topic. Specifically, urban “smartness” has been identified by plenty of characteristics that can be grouped into six dimensions (Giffinger et al. 2007): smart Economy (competitiveness), smart People (social and human capital), smart Governance (participation), smart Mobility (both ICTs and transport), smart Environment (natural resources), and smart Living (quality of life). According to this analytical framework, in the present paper the relation between urban attractiveness and the “smart” characteristics has been investigated in the 103 Italian NUTS3 province capitals in the year 2011. To this aim, descriptive statistics have been followed by a regression analysis (OLS), where the dependent variable measuring the urban attractiveness has been proxied by housing market prices. Besides, a Cluster Analysis (CA) has been developed in order to find differences and commonalities among the province capitals. The OLS results indicate that living, people and economy are the key drivers for achieving a better urban attractiveness. Environment, instead, keeps on playing a minor role. Besides, the CA groups the province capitals a

  17. Integrating Data Clustering and Visualization for the Analysis of 3D Gene Expression Data

    Energy Technology Data Exchange (ETDEWEB)

    Data Analysis and Visualization (IDAV) and the Department of Computer Science, University of California, Davis, One Shields Avenue, Davis CA 95616, USA; International Research Training Group "Visualization of Large and Unstructured Data Sets," University of Kaiserslautern, Germany; Computational Research Division, Lawrence Berkeley National Laboratory, One Cyclotron Road, Berkeley, CA 94720, USA; Genomics Division, Lawrence Berkeley National Laboratory, One Cyclotron Road, Berkeley CA 94720, USA; Life Sciences Division, Lawrence Berkeley National Laboratory, One Cyclotron Road, Berkeley CA 94720, USA; Computer Science Division, University of California, Berkeley, CA, USA; Computer Science Department, University of California, Irvine, CA, USA; All authors are with the Berkeley Drosophila Transcription Network Project, Lawrence Berkeley National Laboratory; Rubel, Oliver; Weber, Gunther H.; Huang, Min-Yu; Bethel, E. Wes; Biggin, Mark D.; Fowlkes, Charless C.; Hendriks, Cris L. Luengo; Keranen, Soile V. E.; Eisen, Michael B.; Knowles, David W.; Malik, Jitendra; Hagen, Hans; Hamann, Bernd

    2008-05-12

    The recent development of methods for extracting precise measurements of spatial gene expression patterns from three-dimensional (3D) image data opens the way for new analyses of the complex gene regulatory networks controlling animal development. We present an integrated visualization and analysis framework that supports user-guided data clustering to aid exploration of these new complex datasets. The interplay of data visualization and clustering-based data classification leads to improved visualization and enables a more detailed analysis than previously possible. We discuss (i) integration of data clustering and visualization into one framework; (ii) application of data clustering to 3D gene expression data; (iii) evaluation of the number of clusters k in the context of 3D gene expression clustering; and (iv) improvement of overall analysis quality via dedicated post-processing of clustering results based on visualization. We discuss the use of this framework to objectively define spatial pattern boundaries and temporal profiles of genes and to analyze how mRNA patterns are controlled by their regulatory transcription factors.

  18. The Psychology of Yoga Practitioners: A Cluster Analysis.

    Science.gov (United States)

    Genovese, Jeremy E C; Fondran, Kristine M

    2017-11-01

    Yoga practitioners (N = 261) completed the revised Expression of Spirituality Inventory (ESI) and the Multidimensional Body-Self Relations Questionnaire. Cluster analysis revealed three clusters: Cluster A scored high on all four spiritual constructs. They had high positive evaluations of their appearance, but a lower orientation towards their appearance. They tended to have a high evaluation of their fitness and health, and higher body satisfaction. Cluster B showed lower scores on the spiritual constructs. Like Cluster A, members of Cluster B tended to show high positive evaluations of appearance and fitness. They also had higher body satisfaction. Members of Cluster B had a higher fitness orientation and a higher appearance orientation than members of Cluster A. Members of Cluster C had low scores for all spiritual constructs. They had a low evaluation of, and unhappiness with, their appearance. They were unhappy with the size and appearance of their bodies. They tended to see themselves as overweight. There was a significant difference in years of practice between the three groups (Kruskal-Wallis, p = .0041). Members of Cluster A have the most years of yoga experience and members of Cluster B have more yoga experience than members of Cluster C. These results suggest the possible existence of a developmental trajectory for yoga practitioners. Such a developmental sequence may have important implications for yoga practice and instruction.

  19. Simultaneous Two-Way Clustering of Multiple Correspondence Analysis

    Science.gov (United States)

    Hwang, Heungsun; Dillon, William R.

    2010-01-01

    A 2-way clustering approach to multiple correspondence analysis is proposed to account for cluster-level heterogeneity of both respondents and variable categories in multivariate categorical data. Specifically, in the proposed method, multiple correspondence analysis is combined with k-means in a unified framework in which "k"-means is…

  20. Using Cluster Analysis for Data Mining in Educational Technology Research

    Science.gov (United States)

    Antonenko, Pavlo D.; Toy, Serkan; Niederhauser, Dale S.

    2012-01-01

    Cluster analysis is a group of statistical methods that has great potential for analyzing the vast amounts of web server-log data to understand student learning from hyperlinked information resources. In this methodological paper we provide an introduction to cluster analysis for educational technology researchers and illustrate its use through…

  1. Cluster analysis of residential heat load profiles and the role of technical and household characteristics

    DEFF Research Database (Denmark)

    Carmo, Carolina; Christensen, Toke Haunstrup

    2016-01-01

    of the temporality of the energy demand is needed. This paper contributes to this by focusing on the daily load profiles of energy demand for heating of Danish dwellings with heat pumps. Based on hourly recordings from 139 dwellings and employing cluster and regression analysis, the paper explores patterns...... (typologies) in daily heating load profiles and how these relate to socio-economic and technical characteristics of the included households. The study shows that the load profiles vary according to the external load conditions. Two main clusters were identified for both weekdays and weekends and across load...

  2. Spatio-temporal analysis of smear-positive tuberculosis in the Sidama Zone, southern Ethiopia.

    Directory of Open Access Journals (Sweden)

    Mesay Hailu Dangisso

    Tuberculosis (TB) is a disease of public health concern, with a varying distribution across settings depending on socio-economic status, HIV burden, availability and performance of the health system. Ethiopia is a country with a high burden of TB, with regional variations in TB case notification rates (CNRs). However, TB program reports are often compiled and reported at higher administrative units that do not show the burden at lower units, so there is limited information about the spatial distribution of the disease. We therefore aim to assess the spatial distribution and presence of the spatio-temporal clustering of the disease in different geographic settings over 10 years in the Sidama Zone in southern Ethiopia. A retrospective space-time and spatial analysis was carried out at the kebele level (the lowest administrative unit within a district) to identify spatial and space-time clusters of smear-positive pulmonary TB (PTB). Scan statistics, Global Moran's I, and Getis and Ord (Gi*) statistics were all used to help analyze the spatial distribution and clusters of the disease across settings. A total of 22,545 smear-positive PTB cases notified over 10 years were used for spatial analysis. In a purely spatial analysis, we identified the most likely cluster of smear-positive PTB in 192 kebeles in eight districts (RR = 2, p<0.001), with 12,155 observed and 8,668 expected cases. The Gi* statistic also identified the clusters in the same areas, and the spatial clusters showed stability in most areas in each year during the study period. The space-time analysis also detected the most likely cluster in 193 kebeles in the same eight districts (RR = 1.92, p<0.001), with 7,584 observed and 4,738 expected cases in 2003-2012. The study found variations in CNRs and significant spatio-temporal clusters of smear-positive PTB in the Sidama Zone. The findings can be used to guide TB control programs to devise effective TB control strategies for the geographic areas
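
    Of the statistics named in this abstract, Global Moran's I is the simplest to illustrate. The sketch below uses its standard definition, I = (n / W) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2, where W is the sum of all weights. The binary "chain" adjacency and the rate values are hypothetical stand-ins, not the kebele data or the software used in the study.

```python
def morans_i(values, weights):
    """Global Moran's I for one variable and a spatial weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(sum(row) for row in weights)
    # Cross-products of deviations, counted only for neighbouring units.
    num = sum(weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / w_sum) * (num / den)

def chain_weights(n):
    """Binary adjacency for n units in a line (a toy substitute for a real map)."""
    return [[1.0 if abs(i - j) == 1 else 0.0 for j in range(n)] for i in range(n)]

w = chain_weights(4)
gradient = morans_i([1.0, 2.0, 3.0, 4.0], w)     # similar rates sit next to each other
alternating = morans_i([1.0, 4.0, 1.0, 4.0], w)  # neighbours always differ
```

    Here `gradient` is positive (about 0.33), indicating spatial clustering of similar values, while `alternating` is -1.0, indicating dispersion. A real analysis would also need a significance test, for example by permutation, which this sketch omits.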

  3. Visual cluster analysis and pattern recognition methods

    Science.gov (United States)

    Osbourn, Gordon Cecil; Martinez, Rubel Francisco

    2001-01-01

    A method of clustering using a novel template to define a region of influence. Using neighboring approximation methods, computation times can be significantly reduced. The template and method are applicable and improve pattern recognition techniques.

  4. Applying temporal network analysis to the venture capital market

    Science.gov (United States)

    Zhang, Xin; Feng, Ling; Zhu, Rongqian; Stanley, H. Eugene

    2015-10-01

    Using complex network theory to study the investment relationships of venture capital firms has produced a number of significant results. However, previous studies have often neglected the temporal properties of those relationships, which in real-world scenarios play a pivotal role. Here we examine the time-evolving dynamics of venture capital investment in China by constructing temporal networks to represent (i) investment relationships between venture capital firms and portfolio companies and (ii) the syndication ties between venture capital investors. The evolution of the networks exhibits rich variations in centrality, connectivity and local topology. We demonstrate that a temporal network approach provides a dynamic and comprehensive analysis of real-world networks.

  5. Genetic analysis of loose cluster architecture in grapevine

    Directory of Open Access Journals (Sweden)

    Richter Robert

    2017-01-01

    Loose cluster architecture is a well-known trait that supports Botrytis resilience by permitting faster drying of bunches. Furthermore, a loose bunch enables better application of fungicides into the cluster. A population of 150 F1 plants of the superior breeding line GF.GA-47-42 (‘Bacchus’ x ‘Seyval blanc’) crossed with ‘Villard blanc’, segregating for compactness of the cluster, was used for QTL analysis. Many QTLs were identified reproducibly in two years; QTLs stable over three growing seasons were identified for rachis length, peduncle length, and pedicel length. In a second approach, ‘Pinot noir’ clones showing variation in cluster architecture were analyzed for differential gene expression. Grown in three different German viticultural areas, loose versus compact clustered ‘Pinot noir’ clones showed, in gene expression experiments, a candidate gene expressed fivefold higher in loosely clustered clones between stages BBCH57 and BBCH71.

  6. Two-Way Regularized Fuzzy Clustering of Multiple Correspondence Analysis.

    Science.gov (United States)

    Kim, Sunmee; Choi, Ji Yeh; Hwang, Heungsun

    2017-01-01

    Multiple correspondence analysis (MCA) is a useful tool for investigating the interrelationships among dummy-coded categorical variables. MCA has been combined with clustering methods to examine whether there exist heterogeneous subclusters of a population, which exhibit cluster-level heterogeneity. These combined approaches aim to classify either observations only (one-way clustering of MCA) or both observations and variable categories (two-way clustering of MCA). The latter approach is favored because its solutions are easier to interpret by providing explicitly which subgroup of observations is associated with which subset of variable categories. Nonetheless, the two-way approach has been built on hard classification that assumes observations and/or variable categories to belong to only one cluster. To relax this assumption, we propose two-way fuzzy clustering of MCA. Specifically, we combine MCA with fuzzy k-means simultaneously to classify a subgroup of observations and a subset of variable categories into a common cluster, while allowing both observations and variable categories to belong partially to multiple clusters. Importantly, we adopt regularized fuzzy k-means, thereby enabling us to decide the degree of fuzziness in cluster memberships automatically. We evaluate the performance of the proposed approach through the analysis of simulated and real data, in comparison with existing two-way clustering approaches.
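
    The idea of partial membership in several clusters can be illustrated with the membership rule of standard fuzzy c-means, u_k = 1 / sum_j (d_k / d_j)^(2/(m-1)). Note the assumption: this sketch uses a fixed fuzzifier m, whereas the abstract describes a regularized fuzzy k-means that chooses the degree of fuzziness automatically, so this is not the authors' method. The function name and sample coordinates are hypothetical.

```python
import math

def fuzzy_memberships(point, centers, m=2.0):
    """Standard fuzzy c-means memberships of one point for fixed cluster centers."""
    dists = [math.dist(point, c) for c in centers]
    # A point lying exactly on one or more centers belongs fully to them.
    if 0.0 in dists:
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_k / d_j) ** power for d_j in dists) for d_k in dists]

centers = [(0.0, 0.0), (4.0, 0.0)]
near_first = fuzzy_memberships((1.0, 0.0), centers)  # mostly in the first cluster
midway = fuzzy_memberships((2.0, 0.0), centers)      # shared equally
```

    Memberships always sum to 1 for each point. Increasing `m` flattens them toward equal shares, and values of `m` close to 1 approach the hard assignment that the abstract sets out to relax.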

  7. EM Clustering Analysis of Diabetes Patients Basic Diagnosis Index

    OpenAIRE

    Wu, Cai; Steinbauer, Jeffrey R.; Kuo, Grace M.

    2005-01-01

    Cluster analysis can group similar instances into the same group. Partitioning clustering assigns classes to samples without knowing the classes in advance. The most common algorithms are K-means and Expectation Maximization (EM). The EM clustering algorithm can find the number of distributions generating the data and build “mixture models”. It identifies groups that are either overlapping or of varying sizes and shapes. In this project, by using EM in Machine Learning Algorithm in JAVA (WEKA) syste...
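
    As a rough illustration of how EM builds a mixture model, the sketch below fits a two-component one-dimensional Gaussian mixture by alternating an E-step (soft assignment of each sample) and an M-step (re-estimating weights, means and variances). It is plain Python, not the WEKA implementation mentioned in the abstract. The number of components is fixed at two, and the readings and starting means are hypothetical.

```python
import math

def normal_pdf(x, mu, var):
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def em_two_gaussians(data, mu, var=(1.0, 1.0), weight=(0.5, 0.5), iters=50):
    """EM for a 1D mixture of two Gaussians, starting from the given guesses."""
    mu, var, weight = list(mu), list(var), list(weight)
    for _ in range(iters):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in data:
            p = [weight[k] * normal_pdf(x, mu[k], var[k]) for k in range(2)]
            total = sum(p)
            resp.append([pk / total for pk in p])
        # M-step: update parameters from the soft assignments.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weight[k] = nk / len(data)
            mu[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            spread = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, data)) / nk
            var[k] = max(spread, 1e-6)  # floor keeps the variance from collapsing
    return mu, var, weight

readings = [4.8, 5.0, 5.2, 9.8, 10.0, 10.2]  # hypothetical values, two clear groups
means, variances, weights = em_two_gaussians(readings, mu=(4.0, 11.0))
```

    On this well-separated toy data the means settle near 5 and 10 with equal weights. Unlike k-means, the responsibilities stay soft, which is what lets EM describe overlapping groups of different spread.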

  8. Allergen Sensitization Pattern by Sex: A Cluster Analysis in Korea.

    Science.gov (United States)

    Ohn, Jungyoon; Paik, Seung Hwan; Doh, Eun Jin; Park, Hyun-Sun; Yoon, Hyun-Sun; Cho, Soyun

    2017-12-01

    Allergens tend to sensitize simultaneously. The etiology of this phenomenon has been suggested to be allergen cross-reactivity or concurrent exposure. However, little is known about specific allergen sensitization patterns. The aim of this study was to investigate the allergen sensitization characteristics according to gender. The multiple allergen simultaneous test (MAST) is widely used as a screening tool for detecting allergen sensitization in dermatologic clinics. We retrospectively reviewed the medical records of patients with MAST results between 2008 and 2014 in our Department of Dermatology. A cluster analysis was performed to elucidate the allergen-specific immunoglobulin (Ig)E cluster pattern. The results of MAST (39 allergen-specific IgEs) from 4,360 cases were analyzed. By cluster analysis, the 39 items were grouped into 8 clusters. Each cluster had characteristic features. Compared with the female group, the male group tended to be sensitized more frequently to all tested allergens, except for the fungus allergen cluster. The cluster and comparative analysis results demonstrate that allergen sensitization is clustered, manifesting allergen similarity or co-exposure. Only the fungus cluster allergens tend to sensitize the female group more frequently than the male group.

  9. A spatio-temporal analysis of suicide in El Salvador

    Directory of Open Access Journals (Sweden)

    Carlos Carcach

    2017-04-01

    Background: In 2012, international statistics showed El Salvador’s suicide rate as 40th in the world and the highest in Latin America. Over the last 15 years, national statistics show the suicide death rate declining as opposed to an increasing rate of homicide. Though completed suicide is an important social and health issue, little is known about its prevalence, incidence, etiology and spatio-temporal behavior. The primary objective of this study was to examine completed suicide and homicide using the stream analogy to lethal violence within a spatio-temporal framework. Methods: A Bayesian model was applied to examine the spatio-temporal evolution of the tendency of completed suicide over homicide in El Salvador. Data on numbers of suicides and homicides at the municipal level were obtained from the Instituto de Medicina Legal (IML) and population counts, from the Dirección General de Estadística y Censos (DIGESTYC), for the period of 2002 to 2012. Data on migration were derived from the 2007 Population Census, and inequality data were obtained from a study by Damianović, Valenzuela and Vera. Results: The data reveal a stable standardized rate of total lethal violence (completed suicide plus homicide) across municipalities over time; a decline in suicide; and a standardized suicide rate decreasing with income inequality but increasing with social isolation. Municipalities clustered in terms of both total lethal violence and suicide standardized rates. Conclusions: Spatial effects for suicide were stronger among municipalities located in the north-east and center-south sides of the country. New clusters of municipalities with large suicide standardized rates were detected in the north-west, south-west and center-south regions, all of which are part of time-stable clusters of homicide. Prevention efforts to reduce income inequality and mitigate the negative effects of weak relational systems should focus upon municipalities forming time

  10. A spatio-temporal analysis of suicide in El Salvador.

    Science.gov (United States)

    Carcach, Carlos

    2017-04-20

    In 2012, international statistics showed El Salvador's suicide rate as 40th in the world and the highest in Latin America. Over the last 15 years, national statistics show the suicide death rate declining as opposed to an increasing rate of homicide. Though completed suicide is an important social and health issue, little is known about its prevalence, incidence, etiology and spatio-temporal behavior. The primary objective of this study was to examine completed suicide and homicide using the stream analogy to lethal violence within a spatio-temporal framework. A Bayesian model was applied to examine the spatio-temporal evolution of the tendency of completed suicide over homicide in El Salvador. Data on numbers of suicides and homicides at the municipal level were obtained from the Instituto de Medicina Legal (IML) and population counts, from the Dirección General de Estadística y Censos (DIGESTYC), for the period of 2002 to 2012. Data on migration were derived from the 2007 Population Census, and inequality data were obtained from a study by Damianović, Valenzuela and Vera. The data reveal a stable standardized rate of total lethal violence (completed suicide plus homicide) across municipalities over time; a decline in suicide; and a standardized suicide rate decreasing with income inequality but increasing with social isolation. Municipalities clustered in terms of both total lethal violence and suicide standardized rates. Spatial effects for suicide were stronger among municipalities located in the north-east and center-south sides of the country. New clusters of municipalities with large suicide standardized rates were detected in the north-west, south-west and center-south regions, all of which are part of time-stable clusters of homicide. Prevention efforts to reduce income inequality and mitigate the negative effects of weak relational systems should focus upon municipalities forming time-persistent clusters with a large rate of death by suicide. In

  11. fMR-adaptation indicates selectivity to audiovisual content congruency in distributed clusters in human superior temporal cortex

    Directory of Open Access Journals (Sweden)

    Blomert Leo

    2010-02-01

    Background: Efficient multisensory integration is of vital importance for adequate interaction with the environment. In addition to basic binding cues like temporal and spatial coherence, meaningful multisensory information is also bound together by content-based associations. Many functional Magnetic Resonance Imaging (fMRI) studies propose the (posterior) superior temporal cortex (STC) as the key structure for integrating meaningful multisensory information. However, a still unanswered question is how superior temporal cortex encodes content-based associations, especially in light of inconsistent results from studies comparing brain activation to semantically matching (congruent) versus nonmatching (incongruent) multisensory inputs. Here, we used fMR-adaptation (fMR-A) in order to circumvent potential problems with standard fMRI approaches, including spatial averaging and amplitude saturation confounds. We presented repetitions of audiovisual stimuli (letter-speech sound pairs) and manipulated the associative relation between the auditory and visual inputs (congruent/incongruent pairs). We predicted that if multisensory neuronal populations exist in STC and encode audiovisual content relatedness, adaptation should be affected by the manipulated audiovisual relation. Results: The results revealed an occipital-temporal network that adapted independently of the audiovisual relation. Interestingly, several smaller clusters distributed over superior temporal cortex within that network adapted more strongly to congruent than to incongruent audiovisual repetitions, indicating sensitivity to content congruency. Conclusions: These results suggest that the revealed clusters contain multisensory neuronal populations that encode content relatedness by selectively responding to congruent audiovisual inputs, since unisensory neuronal populations are assumed to be insensitive to the audiovisual relation. These findings extend our previously revealed mechanism for

  12. Entropic Approach to Multiscale Clustering Analysis

    Directory of Open Access Journals (Sweden)

    Antonio Insolia

    2012-05-01

    Recently, a novel method has been introduced to estimate the statistical significance of clustering in the direction distribution of objects. The method involves a multiscale procedure, based on the Kullback–Leibler divergence and the Gumbel statistics of extreme values, providing high discrimination power, even in the presence of strong background isotropic contamination. It is shown that the method is: (i) semi-analytical, drastically reducing computation time; (ii) very sensitive to small, medium and large scale clustering; (iii) not biased against the null hypothesis. Applications to the physics of ultra-high energy cosmic rays, as a cosmological probe, are presented and discussed.
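
    As a rough illustration of the divergence this abstract builds on, the sketch below computes the Kullback–Leibler divergence of a binned direction histogram from an isotropic (uniform) expectation. It is not the authors' multiscale procedure and omits the Gumbel significance step; the function name, the equal-area bin assumption, and the sample counts are all hypothetical.

```python
import math


def kl_divergence_from_isotropy(counts):
    """KL divergence (in nats) of a binned direction histogram from a uniform one.

    Assumption: every bin covers the same solid angle, so the isotropic
    expectation is simply 1 / number_of_bins for each bin.
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("histogram is empty")
    q = 1.0 / len(counts)
    divergence = 0.0
    for c in counts:
        if c > 0:  # empty bins contribute nothing (0 * log 0 is taken as 0)
            p = c / total
            divergence += p * math.log(p / q)
    return divergence


# Hypothetical histograms: one isotropic, one with an excess in the first bin.
isotropic = [25, 25, 25, 25]
clustered = [70, 10, 10, 10]
print(kl_divergence_from_isotropy(isotropic))
print(kl_divergence_from_isotropy(clustered))
```

    A divergence of zero corresponds to a perfectly isotropic histogram; larger values indicate stronger departure from isotropy at the chosen bin scale.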

  13. Temporal Data-Driven Sleep Scheduling and Spatial Data-Driven Anomaly Detection for Clustered Wireless Sensor Networks

    Directory of Open Access Journals (Sweden)

    Gang Li

    2016-09-01

    The spatial–temporal correlation is an important feature of sensor data in wireless sensor networks (WSNs). Most of the existing works based on the spatial–temporal correlation can be divided into two parts: redundancy reduction and anomaly detection. These two parts are pursued separately in existing works. In this work, the combination of temporal data-driven sleep scheduling (TDSS) and spatial data-driven anomaly detection is proposed, where TDSS can reduce data redundancy. The TDSS model is inspired by transmission control protocol (TCP) congestion control. Based on long and linear cluster structure in the tunnel monitoring system, cooperative TDSS and spatial data-driven anomaly detection are then proposed. To realize synchronous acquisition in the same ring for analyzing the situation of every ring, TDSS is implemented in a cooperative way in the cluster. To keep the precision of sensor data, spatial data-driven anomaly detection based on the spatial correlation and Kriging method is realized to generate an anomaly indicator. The experiment results show that cooperative TDSS can realize non-uniform sensing effectively to reduce the energy consumption. In addition, spatial data-driven anomaly detection is quite significant for maintaining and improving the precision of sensor data.

  14. Spatio-Temporal Distribution Characteristics and Trajectory Similarity Analysis of Tuberculosis in Beijing, China

    Directory of Open Access Journals (Sweden)

    Lan Li

    2016-03-01

    Tuberculosis (TB) is an infectious disease with one of the highest reported incidences in China. The detection of the spatio-temporal distribution characteristics of TB is indicative of its prevention and control conditions. Trajectory similarity analysis detects variations and loopholes in prevention and provides urban public health officials and related decision makers more information for the allocation of public health resources and the formulation of prioritized health-related policies. This study analysed the spatio-temporal distribution characteristics of TB from 2009 to 2014 by utilizing spatial statistics, spatial autocorrelation analysis, and space-time scan statistics. Spatial statistics measured the TB incidence rate (TB patients per 100,000 residents) at the district level to determine its spatio-temporal distribution and to identify characteristics of change. Spatial autocorrelation analysis was used to detect global and local spatial autocorrelations across the study area. Purely spatial, purely temporal and space-time scan statistics were used to identify purely spatial, purely temporal and spatio-temporal clusters of TB at the district level. The other objective of this study was to compare the trajectory similarities between the incidence rates of TB and new smear-positive (NSP) TB patients in the resident population (NSPRP)/new smear-positive TB patients in the TB patient population (NSPTBP)/retreated smear-positive (RSP) TB patients in the resident population (RSPRP)/retreated smear-positive TB patients in the TB patient population (RSPTBP) to detect variations and loopholes in TB prevention and control among the districts in Beijing. The incidence rates in Beijing exhibited a gradual decrease from 2009 to 2014. Although global spatial autocorrelation was not detected overall across all of the districts of Beijing, individual districts did show evidence of local spatial autocorrelation: Chaoyang and Daxing were Low-Low districts over

  15. Spatio-Temporal Distribution Characteristics and Trajectory Similarity Analysis of Tuberculosis in Beijing, China.

    Science.gov (United States)

    Li, Lan; Xi, Yuliang; Ren, Fu

    2016-03-07

    Tuberculosis (TB) is an infectious disease with one of the highest reported incidences in China. The detection of the spatio-temporal distribution characteristics of TB is indicative of its prevention and control conditions. Trajectory similarity analysis detects variations and loopholes in prevention and provides urban public health officials and related decision makers more information for the allocation of public health resources and the formulation of prioritized health-related policies. This study analysed the spatio-temporal distribution characteristics of TB from 2009 to 2014 by utilizing spatial statistics, spatial autocorrelation analysis, and space-time scan statistics. Spatial statistics measured the TB incidence rate (TB patients per 100,000 residents) at the district level to determine its spatio-temporal distribution and to identify characteristics of change. Spatial autocorrelation analysis was used to detect global and local spatial autocorrelations across the study area. Purely spatial, purely temporal and space-time scan statistics were used to identify purely spatial, purely temporal and spatio-temporal clusters of TB at the district level. The other objective of this study was to compare the trajectory similarities between the incidence rates of TB and new smear-positive (NSP) TB patients in the resident population (NSPRP)/new smear-positive TB patients in the TB patient population (NSPTBP)/retreated smear-positive (RSP) TB patients in the resident population (RSPRP)/retreated smear-positive TB patients in the TB patient population (RSPTBP) to detect variations and loopholes in TB prevention and control among the districts in Beijing. The incidence rates in Beijing exhibited a gradual decrease from 2009 to 2014. Although global spatial autocorrelation was not detected overall across all of the districts of Beijing, individual districts did show evidence of local spatial autocorrelation: Chaoyang and Daxing were Low-Low districts over the six
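
    The global spatial autocorrelation test mentioned in this abstract is commonly carried out with Moran's I. The sketch below is a minimal version under stated assumptions: binary contiguity weights, a hypothetical chain of four districts, and made-up incidence rates. The abstract does not say which statistic or weighting scheme the authors used.

```python
def morans_i(values, neighbours):
    """Global Moran's I with binary contiguity weights.

    values     -- one rate per areal unit (list of floats)
    neighbours -- dict mapping unit index -> list of adjacent unit indices
                  (assumed symmetric: if j is listed for i, i is listed for j)
    """
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]          # deviations from the mean
    weight_sum = 0.0                        # S0: sum of all weights
    cross = 0.0                             # sum of w_ij * z_i * z_j
    for i, adjacent in neighbours.items():
        for j in adjacent:
            weight_sum += 1.0
            cross += z[i] * z[j]
    sum_sq = sum(d * d for d in z)
    return (n / weight_sum) * (cross / sum_sq)


# Hypothetical layout: four districts in a row, each touching its neighbours.
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(morans_i([1.0, 2.0, 3.0, 4.0], chain))  # smooth gradient: positive
print(morans_i([1.0, 4.0, 1.0, 4.0], chain))  # alternating pattern: negative
```

    Values near zero are consistent with no global autocorrelation, which is the situation the abstract reports for Beijing as a whole; local statistics can still flag individual districts.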

  16. Analysis of Network Clustering Algorithms and Cluster Quality Metrics at Scale.

    Science.gov (United States)

    Emmons, Scott; Kobourov, Stephen; Gallant, Mike; Börner, Katy

    2016-01-01

    Notions of community quality underlie the clustering of networks. While studies surrounding network clustering are increasingly common, a precise understanding of the relationship between different cluster quality metrics is unknown. In this paper, we examine the relationship between stand-alone cluster quality metrics and information recovery metrics through a rigorous analysis of four widely-used network clustering algorithms: Louvain, Infomap, label propagation, and smart local moving. We consider the stand-alone quality metrics of modularity, conductance, and coverage, and we consider the information recovery metrics of adjusted Rand score, normalized mutual information, and a variant of normalized mutual information used in previous work. Our study includes both synthetic graphs and empirical data sets of sizes varying from 1,000 to 1,000,000 nodes. We find significant differences among the results of the different cluster quality metrics. For example, clustering algorithms can return a value of 0.4 out of 1 on modularity but score 0 out of 1 on information recovery. We find conductance, though imperfect, to be the stand-alone quality metric that best indicates performance on the information recovery metrics. Additionally, our study shows that the variant of normalized mutual information used in previous work cannot be assumed to differ only slightly from traditional normalized mutual information. Smart local moving is the overall best performing algorithm in our study, but discrepancies between cluster evaluation metrics prevent us from declaring it an absolutely superior algorithm. Interestingly, Louvain performed better than Infomap in nearly all the tests in our study, contradicting the results of previous work in which Infomap was superior to Louvain. We find that although label propagation performs poorly when clusters are less clearly defined, it scales efficiently and accurately to large graphs with well-defined clusters.
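
    To make the stand-alone metric of modularity concrete, the sketch below evaluates the standard Newman modularity of a given partition on a small undirected, unweighted graph. The graph (two triangles joined by one edge) and the community labels are hypothetical, and the code is not taken from the study.

```python
def modularity(edges, community):
    """Newman modularity Q of a partition of an undirected, unweighted graph.

    edges     -- list of (u, v) pairs, each undirected edge listed once
    community -- dict mapping node -> community label
    Q = sum over communities of (internal_edges / m) - (degree_sum / (2m))**2
    """
    m = len(edges)
    internal = {}    # edges with both endpoints in the same community
    degree_sum = {}  # total degree of the nodes in each community
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(
        internal.get(c, 0) / m - (d / (2 * m)) ** 2
        for c, d in degree_sum.items()
    )


# Hypothetical graph: two triangles connected by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_group = {node: "a" for node in range(6)}
print(modularity(edges, two_groups))
print(modularity(edges, one_group))
```

    Because modularity scores a partition only against the graph itself, a high value does not guarantee agreement with a ground-truth labelling, which is the kind of discrepancy the abstract describes.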

  17. The Reliability of Inverse Scree Tests for Cluster Analysis.

    Science.gov (United States)

    Lathrop, Richard G.; Williams, Janice E.

    1987-01-01

    A Monte Carlo study, involving 6,000 "computer subjects" and three raters, explored the reliability of the inverse scree test for cluster analysis. Results indicate that the inverse scree may be a useful and reliable cluster analytic technique for determining the number of true groups. (TJH)

  18. Blaeu: Mapping and navigating large tables with cluster analysis

    NARCIS (Netherlands)

    T.H.J. Sellam (Thibault); C.P. Cijvat (Robin); R.A. Koopmanschap (Richard); M.L. Kersten (Martin)

    2016-01-01

    Blaeu is an interactive database exploration tool. Its aim is to guide casual users through large data tables, ultimately triggering insights and serendipity. To do so, it relies on a double cluster analysis mechanism. It clusters the data vertically: it detects themes, groups of

  19. SUPPLY CHAIN ANALYSIS AND PERFORMANCE ASSESSMENT OF SME FISHERIES CLUSTERS

    Directory of Open Access Journals (Sweden)

    Anton Agus Setyawan

    2017-12-01

    This study examines business networks and performance among small and medium-sized enterprises (SMEs) in Indonesia. In many cases, regional administrations in Indonesia develop SME business networks in the form of clusters. This study analyzes SME fisheries clusters using supply chain analysis. We also develop a performance assessment of the SME fisheries cluster using a multivariate model. The study involves 62 SMEs in Sragen, Central Java, Indonesia, all of which belong to the fisheries cluster in the area. Our findings show that the SME fisheries cluster has an inefficient supply chain. The cluster has problems in profit setting and delivery time, which harm its performance. We measure business performance by sales, profit rate and asset growth. We found that cost structure, manpower and physical production have positive effects on business performance.

  20. Merging Galaxy Clusters: Analysis of Simulated Analogs

    Science.gov (United States)

    Nguyen, Jayke; Wittman, David; Cornell, Hunter

    2018-01-01

    The nature of dark matter can be better constrained by observing merging galaxy clusters. However, uncertainty in the viewing angle leads to uncertainty in dynamical quantities such as 3-d velocities, 3-d separations, and time since pericenter. The classic timing argument links these quantities via equations of motion, but neglects effects of nonzero impact parameter (i.e. it assumes velocities are parallel to the separation vector), dynamical friction, substructure, and larger-scale environment. We present a new approach using n-body cosmological simulations that naturally incorporate these effects. By uniformly sampling viewing angles about simulated cluster analogs, we see projected merger parameters in the many possible configurations of a given cluster. We select comparable simulated analogs and evaluate the likelihood of particular merger parameters as a function of viewing angle. We present viewing angle constraints for a sample of observed mergers including the Bullet cluster and El Gordo, and show that the separation vectors are closer to the plane of the sky than previously reported.

  1. Genotypic stability and clustering analysis of confectionery ...

    African Journals Online (AJOL)

    Nine groundnut genotypes were evaluated in terminal moisture-stress areas of northeastern Ethiopia during 2005 and 2006 cropping seasons with the objective of analyzing genotypic stability and clustering of confectionery groundnut for seed and protein yield. The genotypes were evaluated on a plot size of 15 m² at Kobo ...

  2. Heart morphogenesis gene regulatory networks revealed by temporal expression analysis.

    Science.gov (United States)

    Hill, Jonathon T; Demarest, Bradley; Gorsi, Bushra; Smith, Megan; Yost, H Joseph

    2017-10-01

    During embryogenesis the heart forms as a linear tube that then undergoes multiple simultaneous morphogenetic events to obtain its mature shape. To understand the gene regulatory networks (GRNs) driving this phase of heart development, during which many congenital heart disease malformations likely arise, we conducted an RNA-seq timecourse in zebrafish from 30 hpf to 72 hpf and identified 5861 genes with altered expression. We clustered the genes by temporal expression pattern, identified transcription factor binding motifs enriched in each cluster, and generated a model GRN for the major gene batteries in heart morphogenesis. This approach predicted hundreds of regulatory interactions and found batteries enriched in specific cell and tissue types, indicating that the approach can be used to narrow the search for novel genetic markers and regulatory interactions. Subsequent analyses confirmed the GRN using two mutants, Tbx5 and nkx2-5, and identified sets of duplicated zebrafish genes that do not show temporal subfunctionalization. This dataset provides an essential resource for future studies on the genetic/epigenetic pathways implicated in congenital heart defects and the mechanisms of cardiac transcriptional regulation. © 2017. Published by The Company of Biologists Ltd.

  3. Three-dimensional temporal reconstruction and analysis of plume images

    Science.gov (United States)

    Dhawan, Atam P.; Disimile, Peter J.; Peck, Charles, III

    1992-01-01

    An experiment with two subsonic jets generating a cross-flow was conducted as part of a study of the structural features of temporal reconstruction of plume images. The flow field structure was made visible using a direct injection flow visualization technique. It is shown that image analysis and temporal three-dimensional visualization can provide new information on the vortical structural dynamics of multiple jets in a cross-flow. It is expected that future developments in image analysis, quantification and interpretation, and flow visualization of rocket engine plume images may provide a tool for correlating the engine diagnostic features by interpreting the evolution of the structures in the plume.

  4. Prognostic value of cluster analysis of severe asthma phenotypes.

    Science.gov (United States)

    Bourdin, Arnaud; Molinari, Nicolas; Vachier, Isabelle; Varrin, Muriel; Marin, Grégory; Gamez, Anne-Sophie; Paganin, Fabrice; Chanez, Pascal

    2014-11-01

    Cross-sectional severe asthma cluster analysis identified different phenotypes. We tested the hypothesis that these clusters will follow different courses. We aimed to identify which asthma outcomes are specific and coherently associated with these different phenotypes in a prospective longitudinal cohort. In a longitudinal cohort of 112 patients with severe asthma, the 5 Severe Asthma Research Program (SARP) clusters were identified by means of algorithm application. Because patients of the present cohort all had severe asthma compared with the SARP cohort, homemade clusters were identified and also tested. At the subsequent visit, we investigated several outcomes related to asthma control at 1 year (6-item Asthma Control Questionnaire [ACQ-6], lung function, and medication requirement) and then recorded the 3-year exacerbations rate and time to first exacerbation. The SARP algorithm discriminated the 5 clusters at entry for age, asthma duration, lung function, blood eosinophil measurement, ACQ-6 scores, and diabetes comorbidity. Four homemade clusters were mostly segregated by best ever achieved FEV1 values and discriminated the groups by a few clinical characteristics. Nonetheless, all these clusters shared similar asthma outcomes related to asthma control as follows. The ACQ-6 score did not change in any cluster. Exacerbation rate and time to first exacerbation were similar, as were treatment requirements. Severe asthma phenotypes identified by using a previously reported cluster analysis or newly homemade clusters do not behave differently concerning asthma control-related outcomes, which are used to assess the response to innovative therapies. This study demonstrates a potential limitation of the cluster analysis approach in the field of severe asthma. Copyright © 2014. Published by Elsevier Inc.

  5. Analysis of Aspects of Innovation in a Brazilian Cluster

    Directory of Open Access Journals (Sweden)

    Adriana Valélia Saraceni

    2012-09-01

    Innovation through clustering has become very important given the increased significance of interaction for innovation and the learning process. This study aims to identify what a case analysis of the innovation process in a cluster reveals about the learning process. The study is developed in two stages. First, we used a preliminary case study to verify a cluster innovation analysis and its Innovation Index, and then explored a combined body of theory and practice. The second stage explores the learning process concept. Together, the two stages allowed us to build a theoretical model for the development of the learning process in clusters. The main result of the model development is a mechanism for implementing improvements in clusters when case studies are applied.

  6. Describing the homeless mentally ill: cluster analysis results.

    Science.gov (United States)

    Mowbray, C T; Bybee, D; Cohen, E

    1993-02-01

    Presented descriptive data on a group of homeless, mentally ill individuals (N = 108) served by a two-site demonstration project, funded by NIMH. Comparing results with those from other studies of this population produced some differences and some similarities. Cluster analysis techniques were applied to the data, producing a 4-group solution. Data validating the cluster solution are presented. It is suggested that the cluster results provide a more meaningful and useful method of understanding the descriptive data. Results suggest that while the population of individuals served as homeless and mentally ill is quite heterogeneous, many have well-developed functioning skills--only one cluster, making up 35.2% of the sample, fits the stereotype of the aggressive, psychotic individual with skill deficits in many areas. Further discussion is presented concerning the implications of the cluster analysis results for demonstrating contextual effects and thus better interpreting research results from other studies and assisting in future services planning.

  7. Sensitization trajectories in childhood revealed by using a cluster analysis.

    Science.gov (United States)

    Schoos, Ann-Marie M; Chawes, Bo L; Melén, Erik; Bergström, Anna; Kull, Inger; Wickman, Magnus; Bønnelykke, Klaus; Bisgaard, Hans; Rasmussen, Morten A

    2017-12-01

    Assessment of sensitization at a single time point during childhood provides limited clinical information. We hypothesized that sensitization develops as specific patterns with respect to age at debut, development over time, and involved allergens and that such patterns might be more biologically and clinically relevant. We sought to explore latent patterns of sensitization during the first 6 years of life and investigate whether such patterns associate with the development of asthma, rhinitis, and eczema. We investigated 398 children from the at-risk Copenhagen Prospective Studies on Asthma in Childhood 2000 (COPSAC2000) birth cohort with specific IgE against 13 common food and inhalant allergens at the ages of ½, 1½, 4, and 6 years. An unsupervised cluster analysis for 3-dimensional data (nonnegative sparse parallel factor analysis) was used to extract latent patterns explicitly characterizing temporal development of sensitization while clustering allergens and children. Subsequently, these patterns were investigated in relation to asthma, rhinitis, and eczema. Verification was sought in an independent unselected birth cohort (BAMSE) constituting 3051 children with specific IgE against the same allergens at 4 and 8 years of age. The nonnegative sparse parallel factor analysis indicated a complex latent structure involving 7 age- and allergen-specific patterns in the COPSAC2000 birth cohort data: (1) dog/cat/horse, (2) timothy grass/birch, (3) molds, (4) house dust mites, (5) peanut/wheat flour/mugwort, (6) peanut/soybean, and (7) egg/milk/wheat flour. Asthma was solely associated with pattern 1 (odds ratio [OR], 3.3; 95% CI, 1.5-7.2), rhinitis with patterns 1 to 4 and 6 (OR, 2.2-4.3), and eczema with patterns 1 to 3 and 5 to 7 (OR, 1.6-2.5). All 7 patterns were verified in the independent BAMSE cohort (R² > 0.89). This study suggests the presence of specific sensitization patterns in early childhood differentially associated with development of

  8. ENTREPRENEURIAL ACTIVITY IN ROMANIA – A TIME SERIES CLUSTERING ANALYSIS AT THE NUTS3 LEVEL

    Directory of Open Access Journals (Sweden)

    Sipos-Gug Sebastian

    2013-07-01

    Entrepreneurship is an active field of research, having known a major increase in interest and publication levels in the last years (Landström et al., 2012). Within this field recently there has been an increasing interest in understanding why some regions seem to have a significantly higher entrepreneurship activity compared to others. In line with this research field, we would like to investigate the differences in entrepreneurial activity among the Romanian counties (NUTS 3 regions). While the classical research paradigm in this field is to conduct a temporally stationary analysis, we choose to use a time series clustering analysis to better understand the dynamics of entrepreneurial activity between counties. Our analysis showed that if we use the total number of new privately owned companies that are founded each year in the last decade (2002-2012), we can distinguish between 5 clusters: one with high total entrepreneurial activity (18 counties), one with above average activity (8 counties), two clusters with average and slightly below average activity (total of 18 counties) and one cluster with low and declining activity (2 counties). If we are interested in the entrepreneurial activity rate, that is, the number of new privately owned companies founded each year adjusted by the population of the respective county, we obtain 4 clusters: one with a very high entrepreneurial rate (1 county), one with average rate (10 counties), and two clusters with below average entrepreneurial rate (total of 31 counties). In conclusion, our research shows that Romania is far from being a homogeneous geographical area with respect to entrepreneurial activity. Depending on what we are interested in, it can be divided into 5 or 4 clusters of counties, which behave differently as a function of time. Further research should be focused on explaining these regional differences, on studying the high performance clusters and trying to improve the low performing ones.

  9. Foot and mouth disease in Zambia: Spatial and temporal distributions of outbreaks, assessment of clusters and implications for control

    Directory of Open Access Journals (Sweden)

    Yona Sinkala

    2014-04-01

    Zambia has been experiencing low livestock productivity as well as trade restrictions owing to the occurrence of foot and mouth disease (FMD), but little is known about the epidemiology of the disease in these endemic settings. The fundamental questions relate to the spatio-temporal distribution of FMD cases and what determines their occurrence. A retrospective review of FMD cases in Zambia from 1981 to 2012 was conducted using geographical information systems and the SaTScan software package. Information was collected from peer-reviewed journal articles, conference proceedings, laboratory reports, unpublished scientific reports and grey literature. A space–time permutation probability model using a varying time window of one year was used to scan for areas with high infection rates. The spatial scan statistic detected a significant purely spatial cluster around the Mbala–Isoka area between 2009 and 2012, with secondary clusters in Sesheke–Kazungula in 2007 and 2008, the Kafue flats in 2004 and 2005 and Livingstone in 2012. This study provides evidence of the existence of statistically significant FMD clusters and an increase in occurrence in Zambia between 2004 and 2012. The identified clusters agree with areas known to be at high risk of FMD. The FMD virus transmission dynamics and the heterogeneous variability in risk within these locations may need further investigation.

  10. Foot and mouth disease in Zambia: spatial and temporal distributions of outbreaks, assessment of clusters and implications for control.

    Science.gov (United States)

    Sinkala, Yona; Simuunza, Martin; Muma, John B; Pfeiffer, Dirk U; Kasanga, Christopher J; Mweene, Aaron

    2014-04-23

    Zambia has been experiencing low livestock productivity as well as trade restrictions owing to the occurrence of foot and mouth disease (FMD), but little is known about the epidemiology of the disease in these endemic settings. The fundamental questions relate to the spatio-temporal distribution of FMD cases and what determines their occurrence. A retrospective review of FMD cases in Zambia from 1981 to 2012 was conducted using geographical information systems and the SaTScan software package. Information was collected from peer-reviewed journal articles, conference proceedings, laboratory reports, unpublished scientific reports and grey literature. A space-time permutation probability model using a varying time window of one year was used to scan for areas with high infection rates. The spatial scan statistic detected a significant purely spatial cluster around the Mbala-Isoka area between 2009 and 2012, with secondary clusters in Sesheke-Kazungula in 2007 and 2008, the Kafue flats in 2004 and 2005 and Livingstone in 2012. This study provides evidence of the existence of statistically significant FMD clusters and an increase in occurrence in Zambia between 2004 and 2012. The identified clusters agree with areas known to be at high risk of FMD. The FMD virus transmission dynamics and the heterogeneous variability in risk within these locations may need further investigation.
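
    The space-time permutation model named in this abstract needs only case counts, not population data: the expected number of cases in a zone and period is the zone's total times the period's total, divided by the overall total. The sketch below shows just that expected-count step with made-up counts. It does not reproduce SaTScan's cylinder scanning, likelihood ratio, or Monte Carlo significance testing, and the function name is hypothetical.

```python
def expected_counts(cases):
    """Expected cases per (zone, period) under space-time independence.

    cases -- list of rows, one row per zone, one column per time period.
    Expected value for zone z and period d is
        (cases in z over all periods) * (cases in d over all zones) / total.
    """
    zone_totals = [sum(row) for row in cases]
    period_totals = [sum(column) for column in zip(*cases)]
    total = sum(zone_totals)
    if total == 0:
        raise ValueError("no cases recorded")
    return [
        [zone_total * period_total / total for period_total in period_totals]
        for zone_total in zone_totals
    ]


# Made-up counts: two zones (rows) observed over two periods (columns).
observed = [[2, 1],
            [1, 4]]
for observed_row, expected_row in zip(observed, expected_counts(observed)):
    print(observed_row, expected_row)
```

    A zone-period cell whose observed count clearly exceeds its expected count is a candidate cluster; whether the excess is statistically significant is what the full scan statistic assesses.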

  11. Multi-scale visual analysis of time-varying electrocorticography data via clustering of brain regions.

    Science.gov (United States)

    Murugesan, Sugeerth; Bouchard, Kristofer; Chang, Edward; Dougherty, Max; Hamann, Bernd; Weber, Gunther H

    2017-06-06

    There exists a need for effective and easy-to-use software tools supporting the analysis of complex Electrocorticography (ECoG) data. Understanding how epileptic seizures develop or identifying diagnostic indicators for neurological diseases requires the in-depth analysis of neural activity data from ECoG. Such data is multi-scale and is of high spatio-temporal resolution. Comprehensive analysis of this data should be supported by interactive visual analysis methods that allow a scientist to understand functional patterns at varying levels of granularity and comprehend its time-varying behavior. We introduce a novel multi-scale visual analysis system, ECoG ClusterFlow, for the detailed exploration of ECoG data. Our system detects and visualizes dynamic high-level structures, such as communities, derived from the time-varying connectivity network. The system supports two major views: 1) an overview summarizing the evolution of clusters over time and 2) an electrode view using hierarchical glyph-based design to visualize the propagation of clusters in their spatial, anatomical context. We present case studies that were performed in collaboration with neuroscientists and neurosurgeons using simulated and recorded epileptic seizure data to demonstrate our system's effectiveness. ECoG ClusterFlow supports the comparison of spatio-temporal patterns for specific time intervals and allows a user to utilize various clustering algorithms. Neuroscientists can identify the site of seizure genesis and its spatial progression during the various stages of a seizure. Our system serves as a fast and powerful means for the generation of preliminary hypotheses that can be used as a basis for subsequent application of rigorous statistical methods, with the ultimate goal being the clinical treatment of epileptogenic zones.

  12. A Flocking Based algorithm for Document Clustering Analysis

    Energy Technology Data Exchange (ETDEWEB)

    Cui, Xiaohui [ORNL; Gao, Jinzhu [ORNL; Potok, Thomas E [ORNL

    2006-01-01

    Social animals or insects in nature often exhibit a form of emergent collective behavior known as flocking. In this paper, we present a novel Flocking based approach for document clustering analysis. Our Flocking clustering algorithm uses stochastic and heuristic principles discovered from observing bird flocks or fish schools. Unlike other partition clustering algorithms such as K-means, the Flocking based algorithm does not require initial partitional seeds. The algorithm generates a clustering of a given set of data through the embedding of the high-dimensional data items on a two-dimensional grid for easy clustering result retrieval and visualization. Inspired by the self-organized behavior of bird flocks, we represent each document object with a flock boid. The simple local rules followed by each flock boid result in the entire document flock generating complex global behaviors, which eventually result in a clustering of the documents. We evaluate the efficiency of our algorithm with both a synthetic dataset and a real document collection that includes 100 news articles collected from the Internet. Our results show that the Flocking clustering algorithm achieves better performance compared to the K-means and the Ant clustering algorithm for real document clustering.

  13. Reproducibility of Cognitive Profiles in Psychosis Using Cluster Analysis.

    Science.gov (United States)

    Lewandowski, Kathryn E; Baker, Justin T; McCarthy, Julie M; Norris, Lesley A; Öngür, Dost

    2018-04-01

    Cognitive dysfunction is a core symptom dimension that cuts across the psychoses. Recent findings support classification of patients along the cognitive dimension using cluster analysis; however, data-derived groupings may be highly determined by sampling characteristics and the measures used to derive the clusters, and so their interpretability must be established. We examined cognitive clusters in a cross-diagnostic sample of patients with psychosis and associations with clinical and functional outcomes. We then compared our findings to a previous report of cognitive clusters in a separate sample using a different cognitive battery. Participants with affective or non-affective psychosis (n=120) and healthy controls (n=31) were administered the MATRICS Consensus Cognitive Battery, and clinical and community functioning assessments. Cluster analyses were performed on cognitive variables, and clusters were compared on demographic, cognitive, and clinical measures. Results were compared to findings from our previous report. A four-cluster solution provided a good fit to the data; profiles included a neuropsychologically normal cluster, a globally impaired cluster, and two clusters of mixed profiles. Cognitive burden was associated with symptom severity and poorer community functioning. The patterns of cognitive performance by cluster were highly consistent with our previous findings. We found evidence of four cognitive subgroups of patients with psychosis, with cognitive profiles that map closely to those produced in our previous work. Clusters were associated with clinical and community variables and a measure of premorbid functioning, suggesting that they reflect meaningful groupings: replicable, and related to clinical presentation and functional outcomes. (JINS, 2018, 24, 382-390).

  14. Cluster analysis of typhoid cases in Kota Bharu, Kelantan, Malaysia

    Directory of Open Access Journals (Sweden)

    Nazarudin Safian

    2008-09-01

    Typhoid fever is still a major public health problem globally as well as in Malaysia. This study was done to identify the spatial epidemiology of typhoid fever in the Kota Bharu District of Malaysia as a first step to developing more advanced analysis of the whole country. The main characteristic of the epidemiological pattern that interested us was whether typhoid cases occurred in clusters or whether they were evenly distributed throughout the area. We also wanted to know at what spatial distances they were clustered. All confirmed typhoid cases that were reported to the Kota Bharu District Health Department from the year 2001 to June of 2005 were taken as the samples. From the home address of the cases, the location of the house was traced and a coordinate was taken using handheld GPS devices. Spatial statistical analysis was done to determine the distribution of typhoid cases, whether clustered, random or dispersed. The spatial statistical analysis was done using CrimeStat III software to determine whether typhoid cases occur in clusters, and later on to determine at what distances they clustered. Among the 736 cases involved in the study, there was significant clustering for cases occurring in the years 2001, 2002, 2003 and 2005. There was no significant clustering in year 2004. Typhoid clustering also occurred strongly for distances up to 6 km. This study shows that typhoid cases occur in clusters, and this method could be applicable to describe spatial epidemiology for a specific area. (Med J Indones 2008; 17: 175-82) Keywords: typhoid, clustering, spatial epidemiology, GIS

  15. The AIDS epidemic and economic input impact factors in Chongqing, China, from 2006 to 2012: a spatial-temporal analysis.

    Science.gov (United States)

    Zhang, Yanqi; Xiao, Qin; Zhou, Liang; Ma, Dihui; Liu, Ling; Lu, Rongrong; Yi, Dali; Yi, Dong

    2015-03-27

    To analyse the spatial-temporal clustering of the HIV/AIDS epidemic in Chongqing and to explore its association with the economic indices of AIDS prevention and treatment. Data on the HIV/AIDS epidemic and economic indices of AIDS prevention and treatment were obtained from the annual reports of the Chongqing Municipal Center for Disease Control for 2006-2012. Spatial clustering analysis, temporal-spatial clustering analysis, and spatial regression were used to conduct statistical analysis. The annual average new HIV infection rate, incidence rate for new AIDS cases, and rate of people living with HIV in Chongqing were 5.97, 2.42 and 28.12 per 100,000, respectively, for 2006-2012. The HIV/AIDS epidemic showed a non-random spatial distribution (Moran's I≥0.310; p<0.05). The epidemic hotspots were distributed in the 15 mid-western counties. The most likely clusters were primarily located in the central region and southwest of Chongqing and occurred in 2010-2012. The regression coefficients of the total amount of special funds allocated to AIDS and to the public awareness unit for the numbers of new HIV cases, new AIDS cases, and people living with HIV were 0.775, 0.976 and 0.816, and -0.188, -0.259 and -0.215 (p<0.002), respectively. The Chongqing HIV/AIDS epidemic showed temporal-spatial clustering and was mainly clustered in the mid-western and south-western counties, showing an upward trend over time. The amount of special funds dedicated to AIDS and to the public awareness unit showed positive and negative relationships with HIV/AIDS spatial clustering, respectively. Published by the BMJ Publishing Group Limited.
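
    The abstract above reports a global Moran's I statistic as evidence of non-random spatial distribution. As a minimal sketch of how that statistic is computed, assuming a binary adjacency weight matrix (the record does not say which weighting scheme the authors used), the function name `morans_i` and the four-region example below are hypothetical and the values are illustrative, not study data.

```python
def morans_i(values, weights):
    """Global Moran's I for observations x_i and spatial weights w_ij.

    I = (n / sum_ij w_ij) * (sum_ij w_ij (x_i - mean)(x_j - mean)) / (sum_i (x_i - mean)^2)
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    # Total weight over all ordered pairs of regions.
    w_sum = sum(weights[i][j] for i in range(n) for j in range(n))
    # Weighted cross-products of deviations between neighbouring regions.
    num = sum(weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / w_sum) * (num / den)


# Hypothetical example: four regions in a row; neighbours share a border.
rates = [1.0, 2.0, 3.0, 4.0]
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
print(morans_i(rates, adjacency))
```

    For this smoothly increasing example the statistic is about 0.33 (positive spatial autocorrelation), whereas an alternating pattern such as 1, 0, 1, 0 on the same adjacency gives -1. Significance testing, as reported in the abstract, would require a permutation or normal-approximation step that is not shown here.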

  16. Effects of Group Size and Lack of Sphericity on the Recovery of Clusters in K-Means Cluster Analysis

    Science.gov (United States)

    de Craen, Saskia; Commandeur, Jacques J. F.; Frank, Laurence E.; Heiser, Willem J.

    2006-01-01

    K-means cluster analysis is known for its tendency to produce spherical and equally sized clusters. To assess the magnitude of these effects, a simulation study was conducted, in which populations were created with varying departures from sphericity and group sizes. An analysis of the recovery of clusters in the samples taken from these…

  17. Analysis of the temporal evolution of total column nitrogen dioxide ...

    African Journals Online (AJOL)

    Concurrent measurement and analysis of Nitrogen dioxide (NO2) and Ozone (O3) are essential for improved understanding of ozone distribution. This study sought to analyse the temporal evolution of total column NO2 and O3 over Nairobi using satellite-derived daily data between 2009 and 2013. Seasonality is observed ...

  18. Spatial and temporal analysis of mass movement using dendrochronology

    NARCIS (Netherlands)

    Braam, R.R.; Weiss, E.E.J.; Burrough, P.A.

    1987-01-01

    Tree growth and inclination on sloping land is affected by mass movement. Suitable analysis of tree growth and tree form can therefore provide considerable information on mass movement activity. This paper reports a new, automated method for studying the temporal and spatial aspects of mass

  19. Spatial and temporal expression analysis of D- myo -inositol 3 ...

    African Journals Online (AJOL)

    1α (eukaryotic elongation factor 1-alpha) using SYBR Green. The qRT-PCR data analysis indicated that the expression of the four highly conserved MIPS genes is both temporally and spatially regulated, information much needed for reverse ...

  20. Variability of Soil Temperature: A Spatial and Temporal Analysis.

    Science.gov (United States)

    Walsh, Stephen J.; And Others

    1991-01-01

    Discusses an analysis of the relationship of soil temperatures at 3 depths to various climatic variables along a 200-kilometer transect in west-central Oklahoma. Reports that temperature readings increased from east to west. Concludes that temperature variations were explained by a combination of spatial, temporal, and biophysical factors. (SG)

  2. Dengue Fever Occurrence and Vector Detection by Larval Survey, Ovitrap and MosquiTRAP: A Space-Time Clusters Analysis

    Science.gov (United States)

    de Melo, Diogo Portella Ornelas; Scherrer, Luciano Rios; Eiras, Álvaro Eduardo

    2012-01-01

    The use of vector surveillance tools for preventing dengue disease requires fine assessment of risk, in order to improve vector control activities. Nevertheless, the thresholds between vector detection and dengue fever occurrence are currently not well established. In Belo Horizonte (Minas Gerais, Brazil), dengue has been endemic for several years. From January 2007 to June 2008, the dengue vector Aedes (Stegomyia) aegypti was monitored by ovitrap, the sticky-trap MosquiTRAP™ and larval surveys in a study area in Belo Horizonte. Using a space-time scan for cluster detection implemented in SaTScan software, the vector presence recorded by the different monitoring methods was evaluated. Clusters of vectors and dengue fever were detected. It was verified that ovitrap and MosquiTRAP vector detection methods predicted dengue occurrence better than larval survey, both spatially and temporally. MosquiTRAP and ovitrap presented similar results of space-time intersections to dengue fever clusters. Nevertheless, ovitrap clusters presented longer duration periods than MosquiTRAP ones, less accurately signaling the dengue risk areas, since the detection of vector clusters during most of the study period was not necessarily correlated to dengue fever occurrence. It was verified that ovitrap clusters occurred more than 200 days (values ranged from 97.0±35.35 to 283.0±168.4 days) before dengue fever clusters, whereas MosquiTRAP clusters preceded dengue fever clusters by approximately 80 days (values ranged from 65.5±58.7 to 94.0±14.3 days), the latter proving to be more temporally precise. Thus, in the present cluster analysis study MosquiTRAP presented superior results for signaling dengue transmission risks both geographically and temporally. Since early detection is crucial for planning and deploying effective prevention, MosquiTRAP proved to be a reliable tool and this method provides groundwork for the development of even more precise tools. PMID:22848729

  3. Comparative analysis of genomic signal processing for microarray data clustering.

    Science.gov (United States)

    Istepanian, Robert S H; Sungoor, Ala; Nebel, Jean-Christophe

    2011-12-01

    Genomic signal processing is a new area of research that combines advanced digital signal processing methodologies for enhanced genetic data analysis. It has many promising applications in bioinformatics and next generation of healthcare systems, in particular, in the field of microarray data clustering. In this paper we present a comparative performance analysis of enhanced digital spectral analysis methods for robust clustering of gene expression across multiple microarray data samples. Three digital signal processing methods: linear predictive coding, wavelet decomposition, and fractal dimension are studied to provide a comparative evaluation of the clustering performance of these methods on several microarray datasets. The results of this study show that the fractal approach provides the best clustering accuracy compared to other digital signal processing and well known statistical methods.

  4. Exploratory Analysis of Spatial-Temporal Patterns of Air Pollution in the City

    Science.gov (United States)

    Champendal, Alexandre; Kanevski, Mikhail; Huguenot, Pierre-Emmanuel; Golay, Jean

    2013-04-01

    Air pollution in the city is an important problem influencing environment, well-being of society, economy, management of urban zones, etc. The problem is extremely difficult due to a very complex distribution of the pollution sources, morphology of the city and dispersion processes leading to multivariate nature of the phenomena and high local spatial-temporal variability. The task of understanding, modelling and prediction of spatial-temporal patterns of air pollution in urban zones is an interesting and challenging topic having many research axes from science-based modelling to geostatistics and data mining. The present research mainly deals with a comprehensive exploratory analysis of spatial-temporal air pollution data using statistical, geostatistical and machine learning tools. This analysis helps to 1) understand and model spatial-temporal correlations using variography; 2) explore the temporal evolution of the spatial correlation matrix; 3) analyse and visualize the interconnection between measurement stations using network science tools; 4) quantify the availability and predictability of structured patterns. The real data case study deals with spatial-temporal air pollution data of canton Geneva (2002-2011). Nitrogen dioxide (NO2) has caught our attention. It has effects on health: nitrogen dioxide can irritate the lungs. It has effects on plants: NO2 contributes to the phenomenon of acid rain, and the negative effects of nitrogen dioxide on plants include reduced growth, production and pesticide resistance. And finally it has effects on materials: nitrogen dioxide increases corrosion. Well-defined patterns of spatial-temporal correlations were detected. The analysis and visualization of the spatial correlation matrix for 91 stations were carried out using the network science tools and high levels of clustering were revealed. Moving Window Correlation Matrix and Spatio-temporal variography methods were applied to define and explore the dynamic of our data. More than just

  5. Centrality measures in temporal networks with time series analysis

    Science.gov (United States)

    Huang, Qiangjuan; Zhao, Chengli; Zhang, Xue; Wang, Xiaojie; Yi, Dongyun

    2017-05-01

    Identifying important nodes in networks has wide application in different fields. However, current research is mostly based on static or aggregated networks. Recently, the increasing attention to networks with time-varying structure has promoted the study of node centrality in temporal networks. In this paper, we define a supra-evolution matrix to depict the temporal network structure. Using time series analysis, the relationships between different time layers can be learned automatically. Based on the special form of the supra-evolution matrix, the eigenvector centrality calculation is turned into the calculation of eigenvectors of several low-dimensional matrices through iteration, which effectively reduces the computational complexity. Experiments are carried out on two real-world temporal networks, the Enron email communication network and the DBLP co-authorship network, the results of which show that our method is more efficient at discovering the important nodes than the common aggregating method.
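
    The abstract above builds on eigenvector centrality. The sketch below shows only the static building block, power iteration on a single undirected adjacency matrix; it does not reproduce the authors' supra-evolution matrix or their time series coupling between layers. The function name, the identity shift (used to avoid oscillation on bipartite graphs) and the star-graph example are assumptions made for illustration.

```python
def eigenvector_centrality(adj, iterations=200, tol=1e-12):
    """Dominant eigenvector of a symmetric adjacency matrix by power iteration."""
    n = len(adj)
    x = [1.0 / n] * n
    for _ in range(iterations):
        # Multiply by (A + I): same eigenvectors as A, but the shift keeps
        # the iteration from oscillating on bipartite graphs.
        y = [x[i] + sum(adj[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm = sum(v * v for v in y) ** 0.5
        y = [v / norm for v in y]
        if max(abs(a - b) for a, b in zip(x, y)) < tol:
            return y
        x = y
    return x


# Hypothetical snapshot: a star graph with node 0 at the centre.
star = [
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
]
scores = eigenvector_centrality(star)
print(scores)
```

    In a temporal setting one could apply such a routine to each time layer separately, but that is the simple per-snapshot baseline rather than the method the abstract describes.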

  6. A Novel Divisive Hierarchical Clustering Algorithm for Geospatial Analysis

    Directory of Open Access Journals (Sweden)

    Shaoning Li

    2017-01-01

    In the fields of geographic information systems (GIS) and remote sensing (RS), the clustering algorithm has been widely used for image segmentation, pattern recognition, and cartographic generalization. Although clustering analysis plays a key role in geospatial modelling, traditional clustering methods are limited by computational complexity, noise resistance and robustness. Furthermore, traditional methods are more focused on the adjacent spatial context, which makes it hard for the clustering methods to be applied to multi-density discrete objects. In this paper, a new method, cell-dividing hierarchical clustering (CDHC), is proposed based on convex hull retraction. The main steps are as follows. First, a convex hull structure is constructed to describe the global spatial context of geospatial objects. Then, the retracting structure of each borderline is established in sequence by setting the initial parameter. The objects are split into two clusters (i.e., “sub-clusters”) if the retracting structure intersects with the borderlines. Finally, clusters are repeatedly split and the initial parameter is updated until the termination condition is satisfied. The experimental results show that CDHC separates the multi-density objects from noise sufficiently and also reduces complexity compared to the traditional agglomerative hierarchical clustering algorithm.
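
    The first step named in the abstract above is the construction of a convex hull around the geospatial objects. A minimal sketch of that step alone, using Andrew's monotone chain algorithm, is given here; the choice of algorithm is an assumption, since the record does not say how the authors build the hull, and the retraction and splitting steps of CDHC are not reproduced. The function names and sample points are hypothetical.

```python
def cross(o, a, b):
    """Z-component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Convex hull in counter-clockwise order (Andrew's monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        # Drop points that would make a clockwise or collinear turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


# Hypothetical objects: four corners of a square plus interior and edge points.
objects = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1), (1, 0)]
print(convex_hull(objects))
```

    The interior point (1, 1) and the collinear edge point (1, 0) are discarded, leaving the four corners as the hull vertices.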

  7. A Distributed Flocking Approach for Information Stream Clustering Analysis

    Energy Technology Data Exchange (ETDEWEB)

    Cui, Xiaohui [ORNL]; Potok, Thomas E [ORNL]

    2006-01-01

    Intelligence analysts are currently overwhelmed by the volume of information streams generated every day. There is a lack of comprehensive tools that can analyze these streams in real time. Document clustering analysis plays an important role in improving the accuracy of information retrieval. However, most clustering technologies can only be applied to static document collections because they normally require a large amount of computational resources and a long time to produce accurate results. It is very difficult to cluster a dynamically changing text information stream on an individual computer. Our earlier research resulted in a dynamic reactive flock clustering algorithm that can continually refine the clustering result and quickly react to changes in document contents. This characteristic makes the algorithm suitable for cluster analysis of dynamically changing document information, such as a text information stream. Because of the decentralized character of this algorithm, a distributed approach is a very natural way to increase its clustering speed. In this paper, we present a distributed multi-agent flocking approach for text information stream clustering and discuss the decentralized architectures and communication schemes for load balancing and status information synchronization in this approach.

  8. Cluster analysis of clinical data identifies fibromyalgia subgroups.

    Directory of Open Access Journals (Sweden)

    Elisa Docampo

    INTRODUCTION: Fibromyalgia (FM) is mainly characterized by widespread pain and multiple accompanying symptoms, which hinder FM assessment and management. In order to reduce FM heterogeneity we classified clinical data into simplified dimensions that were used to define FM subgroups. MATERIAL AND METHODS: 48 variables were evaluated in 1,446 Spanish FM cases fulfilling 1990 ACR FM criteria. A partitioning analysis was performed to find groups of variables similar to each other. Similarities between variables were identified and the variables were grouped into dimensions. This was performed in a subset of 559 patients, and cross-validated in the remaining 887 patients. For each sample and dimension, a composite index was obtained based on the weights of the variables included in the dimension. Finally, a clustering procedure was applied to the indexes, resulting in FM subgroups. RESULTS: Variables clustered into three independent dimensions: "symptomatology", "comorbidities" and "clinical scales". Only the first two dimensions were considered for the construction of FM subgroups. Resulting scores classified FM samples into three subgroups: low symptomatology and comorbidities (Cluster 1), high symptomatology and comorbidities (Cluster 2), and high symptomatology but low comorbidities (Cluster 3), showing differences in measures of disease severity. CONCLUSIONS: We have identified three subgroups of FM samples in a large cohort of FM by clustering clinical data. Our analysis stresses the importance of family and personal history of FM comorbidities. Also, the resulting patient clusters could indicate different forms of the disease, relevant to future research, and might have an impact on clinical assessment.

  9. Effective and efficient analysis of spatio-temporal data

    Science.gov (United States)

    Zhang, Zhongnan

    Spatio-temporal data mining, i.e., mining knowledge from large amounts of spatio-temporal data, is a highly demanding field because huge amounts of spatio-temporal data have been collected in various applications, ranging from remote sensing to geographical information systems (GIS), computer cartography, environmental assessment and planning, etc. The collected data far exceed humans' ability to analyze them, which makes it crucial to develop analysis tools. Recent studies on data mining have extended the scope of data mining from relational and transactional datasets to spatial and temporal datasets. Among the various forms of spatio-temporal data, remote sensing images play an important role, due to the growing number of satellites in orbit. In this dissertation, we propose two approaches to analyzing remote sensing data. The first applies association rule mining to image processing. Each image was divided into a number of image blocks, and we built a spatial relationship for these blocks during the dividing process. Because the images were captured as a time series, this turned a large number of images into a spatio-temporal dataset. The second approach discovers co-occurrence patterns in these images. The generated patterns represent subsets of spatial features that are located together in space and time. A weather analysis is composed of individual analyses of several meteorological variables. These variables include temperature, pressure, dew point, wind, clouds, visibility and so on. Local-scale models provide detailed analysis and forecasts of meteorological phenomena ranging from a few kilometers to about 100 kilometers in size. When some of the above meteorological variables show a particular trend, some kind of severe weather will follow in most cases. Using the discovery of association rules, we found that changes in certain meteorological variables are closely related to severe weather situations that will happen
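
    The abstract above relies on association rules linking changes in meteorological variables to severe weather. As a minimal sketch of the two standard measures behind any such rule, support and confidence, the example below uses hypothetical observation sets; the item names and the data are invented for illustration and are not taken from the dissertation.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= set(t)) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    ante = support(transactions, antecedent)
    if ante == 0:
        return 0.0
    both = support(transactions, set(antecedent) | set(consequent))
    return both / ante


# Hypothetical observations: each set lists the events seen in one time window.
observations = [
    {"pressure_drop", "dew_point_rise", "storm"},
    {"pressure_drop", "storm"},
    {"pressure_drop", "dew_point_rise", "storm"},
    {"dew_point_rise"},
    {"pressure_drop"},
]
print(support(observations, {"pressure_drop", "storm"}))
print(confidence(observations, {"pressure_drop"}, {"storm"}))
```

    Here the rule "pressure_drop -> storm" has support 0.6 and confidence 0.75. A full miner such as Apriori would additionally enumerate candidate itemsets above a minimum support threshold, which is omitted from this sketch.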

  10. Development of small scale cluster computer for numerical analysis

    Science.gov (United States)

    Zulkifli, N. H. N.; Sapit, A.; Mohammed, A. N.

    2017-09-01

    In this study, two personal computers were successfully networked together to form a small-scale cluster. Each computer has a multicore processor with four cores, giving the cluster eight processors in total. The cluster runs the Ubuntu 14.04 Linux environment with an MPI implementation (MPICH2). Two main tests were conducted on the cluster: a communication test and a performance test. The communication test verified that the computers can pass the required information without any problem, and was done using a simple MPI Hello program written in C. In addition, a performance test was done to show that the cluster's calculation performance is much better than that of a single-CPU computer. In this performance test, the same code was run four times, using a single node, 2 processors, 4 processors, and 8 processors. The results show that with additional processors, the time required to solve the problem decreases. The time required for the calculation is halved when the number of processors is doubled. To conclude, we successfully developed a small-scale cluster computer from common hardware that is capable of higher computing power than a single-CPU processor, which can be beneficial for research that requires high computing power, especially numerical analysis such as finite element analysis, computational fluid dynamics, and computational physics analysis.
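
    The scaling claim in the abstract above (run time halves when the processor count doubles) corresponds to linear speedup, i.e. a parallel efficiency of 1. The short sketch below shows the arithmetic; the timings are hypothetical placeholders chosen to match that claim, not measurements from the study, and the function name is an assumption.

```python
def speedup_and_efficiency(t_one, t_p, processors):
    """Speedup S = T1 / Tp and parallel efficiency E = S / p."""
    speedup = t_one / t_p
    return speedup, speedup / processors


# Hypothetical wall-clock times in seconds, keyed by processor count.
hypothetical_timings = {1: 80.0, 2: 40.0, 4: 20.0, 8: 10.0}
for p, t in hypothetical_timings.items():
    s, e = speedup_and_efficiency(hypothetical_timings[1], t, p)
    print(f"{p} processors: speedup {s:.1f}x, efficiency {e:.0%}")
```

    Measured efficiencies below 1 would indicate communication or serial overhead between the two networked machines.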

  11. Fuzzy clustering analysis to study geomagnetic coastal effects

    Directory of Open Access Journals (Sweden)

    M. Sridharan

    2005-06-01

    The utility of fuzzy set theory in cluster analysis and pattern recognition has been evolving since the mid 1960s, in conjunction with the emergence and evolution of computer technology. The classification of objects into categories is the subject of cluster analysis. The aim of this paper is to employ a fuzzy-clustering technique to examine the interrelationship of geomagnetic coastal and other effects at Indian observatories. Data from the observatories used for the present studies are from Alibag on the West Coast, Visakhapatnam and Pondicherry on the East Coast, and Hyderabad and Nagpur as central inland stations located far from either of the coasts; all the above stations are free from the influence of the daytime equatorial electrojet. It has been found that Alibag and Pondicherry Observatories form a separate cluster showing anomalous variations in the vertical (Z) component. H- and D-components form different clusters. The results are compared with the graphical method. The analytical technique and the results of the fuzzy-clustering analysis are discussed here.
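
    The abstract above does not state which fuzzy-clustering algorithm was used. As a minimal sketch under the assumption of standard fuzzy c-means with fuzzifier m, the code below shows the membership update and the weighted centre update; the function names, the starting centres and the four sample points are hypothetical and are not observatory data.

```python
import math


def memberships(point, centers, m=2.0):
    """Fuzzy c-means membership of one point in each cluster (sums to 1)."""
    dists = [math.dist(point, c) for c in centers]
    if any(d == 0.0 for d in dists):
        # A point sitting exactly on a centre belongs fully to that centre.
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    inv = [d ** -power for d in dists]
    total = sum(inv)
    return [v / total for v in inv]


def fuzzy_c_means(points, centers, m=2.0, iterations=50):
    """Alternate membership and centre updates for a fixed number of rounds."""
    dims = len(points[0])
    for _ in range(iterations):
        u = [memberships(p, centers, m) for p in points]
        new_centers = []
        for i in range(len(centers)):
            # Each centre is the mean of all points weighted by u^m.
            w = [row[i] ** m for row in u]
            total = sum(w)
            new_centers.append(tuple(
                sum(wk * p[d] for wk, p in zip(w, points)) / total
                for d in range(dims)
            ))
        centers = new_centers
    return centers, [memberships(p, centers, m) for p in points]


# Hypothetical data: two well-separated pairs of points.
data = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
final_centers, u = fuzzy_c_means(data, [(0.0, 1.0), (4.0, 4.0)])
print(final_centers)
print(u)
```

    Unlike hard clustering, every point keeps a graded membership in every cluster, which is what allows stations with mixed behaviour to be compared across clusters.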

  12. Using ICD for structural analysis of clusters: a case study on NeAr clusters

    Science.gov (United States)

    Fasshauer, E.; Förstel, M.; Pallmann, S.; Pernpointner, M.; Hergenhahn, U.

    2014-10-01

    We present a method to utilize interatomic Coulombic decay (ICD) to retrieve information about the mean geometric structures of heteronuclear clusters. It is based on observation and modelling of competing ICD channels, which involve the same initial vacancy, but energetically different final states with vacancies in different components of the cluster. Using binary rare gas clusters of Ne and Ar as an example, we measure the relative intensity of ICD into (Ne⁺)₂ and Ne⁺Ar⁺ final states with spectroscopically well separated ICD peaks. We compare in detail the experimental ratios of the Ne-Ne and Ne-Ar ICD contributions and their positions and widths to values calculated for a diverse set of possible structures. We conclude that NeAr clusters exhibit a core-shell structure with an argon core surrounded by complete neon shells and, possibly, further an incomplete shell of neon atoms for the experimental conditions investigated. Our analysis allows one to differentiate between clusters of similar size and stoichiometric Ar content, but different internal structure. We find evidence for ICD of Ne 2s⁻¹, producing Ar⁺ vacancies in the second coordination shell of the initial site.

  13. Patterns of urban violent injury: a spatio-temporal analysis.

    Directory of Open Access Journals (Sweden)

    Michael Cusimano

    2010-01-01

    Injury related to violent acts is a problem in every society. Although some authors have examined the geography of violent crime, few have focused on the spatio-temporal patterns of violent injury and none have used an ambulance dataset to explore the spatial characteristics of injury. The purpose of this study was to describe the combined spatial and temporal characteristics of violent injury in a large urban centre. Using a geomatics framework and geographic information systems software, we studied 4,587 ambulance dispatches and 10,693 emergency room admissions for violent injury occurrences among adults (aged 18-64) in Toronto, Canada, during 2002 and 2004, using population-based datasets. We created kernel density and choropleth maps for 24-hour periods and four-hour daily time periods and compared location of ambulance dispatches and patient residences with local land use and socioeconomic characteristics. We used multivariate regressions to control for confounding factors. We found the locations of violent injury and the residence locations of those injured were both closely related to each other and clearly clustered in certain parts of the city characterised by high numbers of bars, social housing units, and homeless shelters, as well as lower household incomes. The night and early morning showed a distinctive peak in injuries and a shift in the location of injuries to a "nightlife" district. The locational pattern of patient residences remained unchanged during those times. Our results demonstrate that there is a distinctive spatio-temporal pattern in violent injury reflected in the ambulance data. People injured in this urban centre more commonly live in areas of social deprivation. During the day, locations of injury and locations of residences are similar. However, later at night, the injury location of highest density shifts to a "nightlife" district, whereas the residence locations of those most at risk of injury do not change.

  14. Feature-space clustering for fMRI meta-analysis

    DEFF Research Database (Denmark)

    Goutte, C; Hansen, L.K.; G. Liptrot, Matthew

    2001-01-01

    of a clustering method applied to features extracted from the data. This approach is extremely versatile and encompasses previously published results [Goutte et al., 1999] as special cases. A typical application is in data reduction: as the increase in temporal resolution of fMRI experiments routinely yields fMRI sequences containing several hundreds of images, it is sometimes necessary to invoke feature extraction to reduce the dimensionality of the data space. A second interesting application is in the meta-analysis of fMRI experiments, where features are obtained from a possibly large number of single-voxel analyses. In particular this allows the checking of the differences and agreements between different methods of analysis. Both approaches are illustrated on an fMRI data set involving visual stimulation, and we show that the feature space clustering approach yields nontrivial results and, in particular...

  15. Cluster Analytical Method of Fault Risk Analysis in Systems

    Science.gov (United States)

    Michaľčonok, German; Horalová Kalinová, Michaela

    2016-12-01

    In providing safety functions, the proposal of safety functions of control systems is an important part of a risk reduction strategy. In the specification of security requirements, it is necessary to determine and document individual characteristics and the desired performance level for each safety. This article presents the results of a cluster analysis experiment. The results of the experiment show that the methods of cluster analysis provide a suitable tool for analyzing the reliability of safety systems. Given the increasing complexity of the systems, we can state that the application of these methods in the subject area is a good choice.

  16. Some Linguistic-based and temporal analysis on Wikipedia

    International Nuclear Information System (INIS)

    Yasseri, T.

    2010-01-01

    Wikipedia as a web-based, collaborative, multilingual encyclopaedia project is a very suitable field to carry out research on social dynamics and to investigate the complex concepts of conflict, collaboration, competition, dispute, etc. in a large community (∼26 million) of Wikipedia users. The other face of Wikipedia, as a productive society, is its output, consisting of ∼17 million articles written unsupervised by non-professional editors in more than 270 different languages. In this talk we report some analysis performed on Wikipedia in two different approaches: temporal analysis to characterize disputes and controversies among users and linguistic-based analysis to characterize linguistic features of English texts in Wikipedia. (author)

  17. Temporal abstraction for the analysis of intensive care information

    International Nuclear Information System (INIS)

    Hadad, Alejandro J; Evin, Diego A; Drozdowicz, Bartolome; Chiotti, Omar

    2007-01-01

    This paper proposes a scheme for the analysis of time-stamped series data from multiple monitoring devices of intensive care units, using Temporal Abstraction concepts. This scheme is oriented to obtain a description of the patient state evolution in an unsupervised way. The case study is based on a dataset clinically classified with Pulmonary Edema. For this dataset a trend-based Temporal Abstraction mechanism is proposed, by means of a Behaviours Base of time-stamped series, and then used in a classification step. Combining this approach with the introduction of expert knowledge, using Fuzzy Logic, and multivariate analysis by means of Self-Organizing Maps, a states characterization model is obtained. This model can be extended to different patient groups and states. The proposed scheme yields descriptions of the intermediate states through which the patient passes, which could be used to anticipate alert situations.

  18. Principal Component Clustering Approach to Teaching Quality Discriminant Analysis

    Science.gov (United States)

    Xian, Sidong; Xia, Haibo; Yin, Yubo; Zhai, Zhansheng; Shang, Yan

    2016-01-01

    Teaching quality is the lifeline of the higher education. Many universities have made some effective achievement about evaluating the teaching quality. In this paper, we establish the Students' evaluation of teaching (SET) discriminant analysis model and algorithm based on principal component clustering analysis. Additionally, we classify the SET…

  19. Proteome Profiling of Vitreoretinal Diseases by Cluster Analysis

    OpenAIRE

    Shitama, Tomomi; Hayashi, Hideyuki; Noge, Sumiyo; Uchio, Eiichi; Oshima, Kenji; Haniu, Hisao; Takemori, Nobuaki; Komori, Naoka; Matsumoto, Hiroyuki

    2008-01-01

    Vitreous samples collected in retinopathic surgeries have diverse properties, making proteomics analysis difficult. We report a cluster analysis to evade this difficulty. Vitreous and subretinal fluid samples were collected from 60 patients during surgical operation of non-proliferative diabetic retinopathy, proliferative diabetic retinopathy, proliferative vitreoretinopathy, and rhegmatogenous retinal detachment. For controls we collected vitreous fluid from patients of idiopathic macular ho...

  20. Temporal patterns analysis of rat behavior in hole-board.

    Science.gov (United States)

    Casarrubea, Maurizio; Sorbera, Filippina; Magnusson, Magnus; Crescimanno, Giuseppe

    2010-03-17

    The aim of present research was to analyze the temporal structure of rodent's anxiety-related behavior in hole-board apparatus (HB). Fifteen male Wistar rats were tested for 10 min. Video files, collected for each subject, were coded by means of a software coder and event log files generated for each subject. To assess temporal relationships among behavioral events, log files were processed by means of a t-pattern analysis. 14 two-element t-patterns, four t-patterns encompassing 3 events and 2 t-patterns encompassing 4 and 5 events respectively were revealed. It was demonstrated that rat behavior in HB was mainly structured on the basis of the temporal patterning among exploratory events; these ones were the most structured t-patterns detected and appeared mainly during the first 5 min of exploration, while grooming t-patterns were present prevalently after the fifth minute. Specific t-pattern parameters, such as overall occurrences and mean duration of each given t-pattern in each subject, were also studied. Present research: (a) reports for the first time that some behavioral events occur sequentially and with significant constraints on the interval lengths separating them; (b) presents the temporal flows of some behavioral elements through multimodal behavioral vectors; (c) could also be used to improve HB test reliability and its ability to detect even small induced behavioral changes. Copyright 2009 Elsevier B.V. All rights reserved.

  1. Pattern recognition in menstrual bleeding diaries by statistical cluster analysis

    Directory of Open Access Journals (Sweden)

    Wessel Jens

    2009-07-01

    Background The aim of this paper is to empirically identify a treatment-independent statistical method to describe clinically relevant bleeding patterns by using bleeding diaries of clinical studies on various sex hormone containing drugs. Methods We used four cluster analysis methods (single, average and complete linkage, as well as the method of Ward) for pattern recognition in menstrual bleeding diaries. The optimal number of clusters was determined using the semi-partial R², the cubic cluster criterion, the pseudo-F- and the pseudo-t²-statistic. Finally, the interpretability of the results from a gynecological point of view was assessed. Results The method of Ward yielded distinct clusters of the bleeding diaries. The other methods successively chained the observations into one cluster. The optimal number of distinctive bleeding patterns was six. We found two desirable and four undesirable bleeding patterns. Cyclic and non-cyclic bleeding patterns were well separated. Conclusion Using this cluster analysis with the method of Ward, medications and devices having an impact on bleeding can be easily compared and categorized.
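
    As a rough illustration of the method of Ward named above, the sketch below repeatedly merges the two clusters whose union adds the least within-cluster sum of squares. It is a naive implementation for small inputs, not the authors' software, and the diary summaries (bleeding days, bleeding episodes) are hypothetical.

```python
def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b are merged."""
    dim = len(a[0])
    centroid_a = [sum(p[d] for p in a) / len(a) for d in range(dim)]
    centroid_b = [sum(p[d] for p in b) / len(b) for d in range(dim)]
    sq_dist = sum((x - y) ** 2 for x, y in zip(centroid_a, centroid_b))
    return len(a) * len(b) / (len(a) + len(b)) * sq_dist

def ward_clustering(points, n_clusters):
    """Agglomerate singleton clusters until only n_clusters remain."""
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                cost = ward_cost(clusters[i], clusters[j])
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters

# Hypothetical diary summaries: (bleeding days, bleeding episodes) per reference period.
diaries = [(4, 3), (5, 3), (4, 4), (20, 1), (22, 1), (0, 0), (1, 0)]
patterns = ward_clustering(diaries, n_clusters=3)
```

    Because the merge cost grows with cluster size, this criterion tends to avoid the chaining behaviour that the abstract reports for the other linkage methods.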

  2. Breast cancer clustering in Kanagawa, Japan: a geographic analysis.

    Science.gov (United States)

    Katayama, Kayoko; Yokoyama, Kazuhito; Yako-Suketomo, Hiroko; Okamoto, Naoyuki; Tango, Toshiro; Inaba, Yutaka

    2014-01-01

    The purpose of the present study was to determine geographic clustering of breast cancer incidence in Kanagawa Prefecture, using cancer registry data. The study also aimed at examining the association between socio-economic factors and any identified cluster. Incidence data were collected for women who were first diagnosed with breast cancer during the period from January to December 2006 in Kanagawa. The data consisted of 2,326 incidence cases extracted from the total of 34,323 Kanagawa Cancer Registration data issued in 2011. To adjust for differences in age distribution, the standardized mortality ratio (SMR) and the standardized incidence ratio (SIR) of breast cancer were calculated for each of 56 municipalities (e.g., city, special ward, town, and village) in Kanagawa by an indirect method using Kanagawa female population data. Spatial scan statistics were used to detect any area of elevated risk as a cluster for breast cancer deaths and/ or incidences. The Student t-test was performed to examine differences in socio-economic variables, viz, persons per household, total fertility rate, age at first marriage for women, and marriage rate, between cluster and other regions. There was a statistically significant cluster of breast cancer incidence (p=0.001) composed of 11 municipalities in southeastern area of Kanagawa Prefecture, whose SIR was 35 percent higher than that of the remainder of Kanagawa Prefecture. In this cluster, average value of age at first-marriage for women was significantly higher than in the rest of Kanagawa (p=0.017). No statistically significant clusters of breast cancer deaths were detected (p=0.53). There was a statistically significant cluster of high breast cancer incidence in southeastern area of Kanagawa Prefecture. It was suggested that the cluster region was related to the tendency to marry later. This study methodology will be helpful in the analysis of geographical disparities in cancer deaths and incidence.
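
    The indirect standardization described above can be sketched as follows: the expected number of cases in a municipality is the sum of reference age-specific rates multiplied by the local population in each age group, and the SIR is observed divided by expected. The age bands, rates and counts below are hypothetical and are not taken from the Kanagawa registry.

```python
def standardized_incidence_ratio(observed_cases, local_population, reference_rates):
    """SIR by indirect standardization: observed cases / expected cases."""
    expected = sum(
        reference_rates[age_group] * local_population[age_group]
        for age_group in local_population
    )
    return observed_cases / expected

# Hypothetical reference incidence rates (cases per woman per year) and local population.
reference_rates = {"40-49": 0.001, "50-59": 0.002, "60-69": 0.003}
local_population = {"40-49": 10000, "50-59": 8000, "60-69": 5000}
sir = standardized_incidence_ratio(55, local_population, reference_rates)
```

    In this made-up example the expected count is 41, so 55 observed cases give an SIR of about 1.34, which would read as roughly a third more cases than the reference rates predict.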

  3. Technology Clusters Exploration for Patent Portfolio through Patent Abstract Analysis

    Directory of Open Access Journals (Sweden)

    Gabjo Kim

    2016-12-01

    This study explores technology clusters through patent analysis. The aim of exploring technology clusters is to grasp competitors’ levels of sustainable research and development (R&D) and establish a sustainable strategy for entering an industry. To achieve this, we first grouped the patent documents with similar technologies by applying affinity propagation (AP) clustering, which is effective for grouping large amounts of data. Next, in order to define the technology clusters, we adopted the term frequency-inverse document frequency (TF-IDF) weight, which lists the terms in order of importance. We collected the patent data of Korean electric car companies from the United States Patent and Trademark Office (USPTO) to verify our proposed methodology. As a result, our proposed methodology presents more detailed information on the Korean electric car industry than previous studies.
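
    The TF-IDF step can be sketched as below, under the assumption that the abstracts in each technology cluster have already been grouped and concatenated into one text per cluster. The weighting shown (relative term frequency times the log of inverse document frequency) is one common variant and may differ from the one the authors used; the sample texts are hypothetical.

```python
import math
from collections import Counter

def tfidf_top_terms(documents, top_n=3):
    """Return the top_n highest-weighted terms for each document."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each term occur?
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    ranked = []
    for tokens in tokenized:
        counts = Counter(tokens)
        scores = {
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
        # Sort by descending score, breaking ties alphabetically.
        ranked.append(sorted(scores, key=lambda t: (-scores[t], t))[:top_n])
    return ranked

# Hypothetical concatenated abstracts, one string per technology cluster.
cluster_texts = [
    "battery cell battery pack cooling system",
    "motor control system torque motor",
    "charging connector charging station system",
]
top_terms = tfidf_top_terms(cluster_texts)
```

    A term such as "system" that occurs in every cluster gets a weight of zero, so the lists are led by terms that distinguish one cluster from the others.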

  4. clusters

    Indian Academy of Sciences (India)

    2017-09-27

    Sep 27, 2017 ... while CuCoNO, Co3NO, Cu3CoNO, Cu2Co3NO, Cu3Co3NO and Cu6CoNO clusters display stronger chemical stability. Magnetic and electronic properties are also discussed. The magnetic moment is affected by charge transfer and the spd hybridization. Keywords. CumConNO (m + n = 2–7) clusters; ...

  5. Spatio-temporal analysis of wildfire ignitions in the St. Johns River Water Management District, Florida

    Science.gov (United States)

    Marc G. Genton; David T. Butry; Marcia L. Gumpertz; Jeffrey P. Prestemon

    2006-01-01

    We analyse the spatio-temporal structure of wildfire ignitions in the St. Johns River Water Management District in north-eastern Florida. We show, using tools to analyse point patterns (e.g. the L-function), that wildfire events occur in clusters. Clustering of these events correlates with irregular distribution of fire ignitions, including lightning...

  6. Spatial distribution and temporal evolution of DRONPA-fused SNAP25 clusters in adrenal chromaffin cells

    DEFF Research Database (Denmark)

    Antoku, Yasuko; Dedecker, Peter; da Silva Pinheiro, Paulo César

    2015-01-01

    fluorescence bursts of DRONPA-fused SNAP-25 molecules in live chromaffin cells by Total Internal Reflection Fluorescence (TIRF) imaging. We find that this method allows tracking protein cluster dynamics over relatively long times (∼20 min.), partly due to the diffusion into the TIRF field of fresh molecules......Sub-diffraction imaging of plasma membrane localized proteins, such as the SNARE (Soluble NSF Attachment Protein Receptor) proteins involved in exocytosis, in fixed cells have resulted in images with high spatial resolution, at the expense of dynamical information. Here, we have imaged localized...

  7. Traffic Accident, System Model and Cluster Analysis in GIS

    Directory of Open Access Journals (Sweden)

    Veronika Vlčková

    2015-07-01

    Traffic accidents are a frequent topic both in general journalism and among the professional public. This article directs attention to a less well-known context of accidents, with the help of constructive systems theory and its methods, cluster analysis and geoinformation engineering. A traffic accident is an event framed in space and time, and it can therefore be studied with the tools of geographic information systems technology. Applying a systems approach enables the formulation of a system model, which is then handled with the tools of geoinformation engineering and of multicriteria and cluster analysis.

  8. Application of microarray analysis on computer cluster and cloud platforms.

    Science.gov (United States)

    Bernau, C; Boulesteix, A-L; Knaus, J

    2013-01-01

    Analysis of recent high-dimensional biological data tends to be computationally intensive as many common approaches such as resampling or permutation tests require the basic statistical analysis to be repeated many times. A crucial advantage of these methods is that they can be easily parallelized due to the computational independence of the resampling or permutation iterations, which has induced many statistics departments to establish their own computer clusters. An alternative is to rent computing resources in the cloud, e.g. at Amazon Web Services. In this article we analyze whether a selection of statistical projects, recently implemented at our department, can be efficiently realized on these cloud resources. Moreover, we illustrate an opportunity to combine computer cluster and cloud resources. In order to compare the efficiency of computer cluster and cloud implementations and their respective parallelizations we use microarray analysis procedures and compare their runtimes on the different platforms. Amazon Web Services provide various instance types which meet the particular needs of the different statistical projects we analyzed in this paper. Moreover, the network capacity is sufficient and the parallelization is comparable in efficiency to standard computer cluster implementations. Our results suggest that many statistical projects can be efficiently realized on cloud resources. It is important to mention, however, that workflows can change substantially as a result of a shift from computer cluster to cloud computing.
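
    The point about computational independence can be illustrated with a small permutation test in which every iteration carries its own seed and can therefore run on any worker. This is a sketch only: it uses a thread pool from the Python standard library as a stand-in for cluster or cloud nodes (threads will not speed up CPU-bound Python code), and the expression values are hypothetical.

```python
import random
from concurrent.futures import ThreadPoolExecutor

def permuted_difference(task):
    """One independent permutation: shuffle pooled values, return |mean difference|."""
    seed, values, n_first = task
    rng = random.Random(seed)  # per-task seed keeps results reproducible
    shuffled = values[:]
    rng.shuffle(shuffled)
    first, second = shuffled[:n_first], shuffled[n_first:]
    return abs(sum(first) / len(first) - sum(second) / len(second))

def permutation_p_value(group_a, group_b, n_permutations=2000, workers=4):
    """Two-sided permutation p-value for a difference in group means."""
    observed = abs(sum(group_a) / len(group_a) - sum(group_b) / len(group_b))
    pooled = list(group_a) + list(group_b)
    tasks = [(seed, pooled, len(group_a)) for seed in range(n_permutations)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        null = list(pool.map(permuted_difference, tasks))
    exceed = sum(1 for d in null if d >= observed)
    return (exceed + 1) / (n_permutations + 1)

# Hypothetical expression values of one gene in two sample groups.
group_a = [5.1, 4.9, 5.6, 5.8, 6.0, 5.5]
group_b = [3.9, 4.1, 4.4, 3.7, 4.0, 4.2]
p_value = permutation_p_value(group_a, group_b)
```

    Since no task depends on another, the same task list could in principle be handed to a process pool, a local cluster scheduler or rented cloud instances without changing the statistical result.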

  9. Spectro-Temporal Analysis of Speech for Spanish Phoneme Recognition

    DEFF Research Database (Denmark)

    Sharifzadeh, Sara; Serrano, Javier; Carrabina, Jordi

    2012-01-01

    are considered. This has improved the recognition performance especially in case of noisy situation and phonemes with time domain modulations such as stops. In this method, the 2D Discrete Cosine Transform (DCT) is applied on small overlapped 2D Hamming windowed patches of spectrogram of Spanish phonemes......State of the art speech recognition systems (ASR), mostly use Mel-Frequency cepstral coefficients (MFCC), as acoustic features. In this paper, we propose a new discriminative analysis of acoustic features, based on spectrogram analysis. Both spectral and temporal variations of speech signal...

  10. Identifying clinical course patterns in SMS data using cluster analysis.

    Science.gov (United States)

    Kent, Peter; Kongsted, Alice

    2012-07-02

    Recently, there has been interest in using the short message service (SMS or text messaging), to gather frequent information on the clinical course of individual patients. One possible role for identifying clinical course patterns is to assist in exploring clinically important subgroups in the outcomes of research studies. Two previous studies have investigated detailed clinical course patterns in SMS data obtained from people seeking care for low back pain. One used a visual analysis approach and the other performed a cluster analysis of SMS data that had first been transformed by spline analysis. However, cluster analysis of SMS data in its original untransformed form may be simpler and offer other advantages. Therefore, the aim of this study was to determine whether cluster analysis could be used for identifying clinical course patterns distinct from the pattern of the whole group, by including all SMS time points in their original form. It was a 'proof of concept' study to explore the potential, clinical relevance, strengths and weakness of such an approach. This was a secondary analysis of longitudinal SMS data collected in two randomised controlled trials conducted simultaneously from a single clinical population (n = 322). Fortnightly SMS data collected over a year on 'days of problematic low back pain' and on 'days of sick leave' were analysed using Two-Step (probabilistic) Cluster Analysis. Clinical course patterns were identified that were clinically interpretable and different from those of the whole group. Similar patterns were obtained when the number of SMS time points was reduced to monthly. The advantages and disadvantages of this method were contrasted to that of first transforming SMS data by spline analysis. This study showed that clinical course patterns can be identified by cluster analysis using all SMS time points as cluster variables. This method is simple, intuitive and does not require a high level of statistical skill. However, there

  11. Acoustic Cluster Therapy: In Vitro and Ex Vivo Measurement of Activated Bubble Size Distribution and Temporal Dynamics.

    Science.gov (United States)

    Healey, Andrew John; Sontum, Per Christian; Kvåle, Svein; Eriksen, Morten; Bendiksen, Ragnar; Tornes, Audun; Østensen, Jonny

    2016-05-01

    Acoustic cluster technology (ACT) is a two-component, microparticle formulation platform being developed for ultrasound-mediated drug delivery. Sonazoid microbubbles, which have a negative surface charge, are mixed with micron-sized perfluoromethylcyclopentane droplets stabilized with a positively charged surface membrane to form microbubble/microdroplet clusters. On exposure to ultrasound, the oil undergoes a phase change to the gaseous state, generating 20- to 40-μm ACT bubbles. An acoustic transmission technique is used to measure absorption and velocity dispersion of the ACT bubbles. An inversion technique computes bubble size population with temporal resolution of seconds. Bubble populations are measured both in vitro and in vivo after activation within the cardiac chambers of a dog model, with catheter-based flow through an extracorporeal measurement flow chamber. Volume-weighted mean diameter in arterial blood after activation in the left ventricle was 22 μm, with no bubbles >44 μm in diameter. After intravenous administration, 24.4% of the oil is activated in the cardiac chambers. Copyright © 2016 World Federation for Ultrasound in Medicine & Biology. Published by Elsevier Inc. All rights reserved.

  12. Applied Hierarchical Cluster Analysis with Average Linkage Algoritm

    Directory of Open Access Journals (Sweden)

    Cindy Cahyaning Astuti

    2017-11-01

    This research was conducted in Sidoarjo District, using secondary data contained in the book "Kabupaten Sidoarjo Dalam Angka 2016". In this research the authors chose 12 variables that can represent sub-district characteristics in Sidoarjo. The variables representing the sub-district characteristics cover four sectors, namely geography, education, agriculture and industry. To assess how equitable geographical conditions, education, agriculture and industry are in each district, an analysis is required to classify sub-districts based on their characteristics. Hierarchical cluster analysis is the analytical technique used to classify or categorize objects into relatively homogeneous groups, called clusters. The results are expected to provide information about dominant and non-dominant sub-district characteristics in the four sectors, based on the clusters formed.

  13. Examining lower urinary tract symptom constellations using cluster analysis.

    Science.gov (United States)

    Coyne, Karin S; Matza, Louis S; Kopp, Zoe S; Thompson, Christine; Henry, David; Irwin, Debra E; Artibani, Walter; Herschorn, Sender; Milsom, Ian

    2008-05-01

    To gain a better understanding of how patients experience lower urinary tract symptoms (LUTS) and to determine whether particular symptoms cluster together, as LUTS seldom occur alone. A secondary analysis of a cross-sectional, population-based survey of adults in Sweden, Italy, Germany, UK and Canada was undertaken to examine the presence of LUTS groups. Of the 19,165 telephone surveys, 13,519 respondents reported at least one LUTS and were included in the analysis. All respondents were asked about the presence of 14 LUTS (International Prostate Symptom Score plus seven additional LUTS). K-means cluster analysis, a statistical method for sorting objects into groups so that similar objects are grouped together, was used to identify groups of people based on their symptoms. Men and women were analysed separately. A split-half random sample was selected from the dataset so that exploratory analyses could be conducted in one half and confirmed in the second. On model confirmation, the sample was analysed in its entirety. Included in this analysis were 5014 men (mean age 49.8 years; 95% white) and 8505 women (mean age 50.4 years; 96% white). Among both men and women, six distinct symptom cluster groups were identified and the symptom patterns of each cluster were examined. For both, the largest cluster consisted of respondents with minimal symptoms (i.e. reporting essentially one symptom), 56% of men and 57% of women. The remaining five clusters for men and women were labelled based on their predominant symptoms. For men, the clusters were nocturia of twice or more per night (12%); terminal dribble (11%); urgency (10%); multiple symptoms (9%); and postvoid incontinence (5%). For women, the clusters were nocturia of twice or more per night (12%); terminal dribble (10%); urgency (8%); stress incontinence (8%); and multiple symptoms (5%). The multiple-symptom groups had several and varied LUTS, were older, and had more comorbidities. Clusters of terminal dribble and male
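
    The k-means procedure named in this abstract alternates between assigning each respondent to the nearest centroid and recomputing each centroid as the mean of its members. The sketch below assumes squared Euclidean distance and explicitly supplied starting centroids; the three symptom scores per respondent are hypothetical and much simpler than the 14-item survey data.

```python
def kmeans(rows, initial_centroids, n_iterations=20):
    """Plain k-means: returns (labels, centroids) after a fixed number of iterations."""
    centroids = [list(c) for c in initial_centroids]
    labels = [0] * len(rows)
    for _ in range(n_iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        for i, row in enumerate(rows):
            dists = [
                sum((x - c) ** 2 for x, c in zip(row, centroid))
                for centroid in centroids
            ]
            labels[i] = dists.index(min(dists))
        # Update step: each centroid becomes the mean of its assigned rows.
        for k in range(len(centroids)):
            members = [row for row, label in zip(rows, labels) if label == k]
            if members:
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Hypothetical severity scores (0-4) for: nocturia, urgency, terminal dribble.
respondents = [(4, 0, 0), (3, 1, 0), (0, 4, 0), (1, 3, 0), (0, 0, 4), (0, 1, 3)]
labels, centroids = kmeans(
    respondents, initial_centroids=[respondents[0], respondents[2], respondents[4]]
)
```

    A split-half check along the lines described above could fit the centroids on one random half and then assign the other half to those fixed centroids to see whether similar groups reappear.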

  14. Assessment of surface water quality using hierarchical cluster analysis

    Directory of Open Access Journals (Sweden)

    Dheeraj Kumar Dabgerwal

    2016-02-01

    This study was carried out to assess the physicochemical quality of the river Varuna in Varanasi, India. Water samples were collected from 10 sites during January-June 2015. Pearson correlation analysis was used to assess the direction and strength of the relationships between physicochemical parameters. Hierarchical cluster analysis was also performed to determine the sources of pollution in the river Varuna. The results showed quite high values of DO, nitrate, BOD, COD and total alkalinity, above the BIS permissible limit. The results of correlation analysis identified key water parameters as pH, electrical conductivity, total alkalinity and nitrate, which influence the concentration of other water parameters. Cluster analysis identified three major clusters of sampling sites out of the total of 10 sites, according to the similarity in water quality. This study illustrated the usefulness of correlation and cluster analysis for getting better information about the river water quality. International Journal of Environment Vol. 5 (1) 2016, pp: 32-44
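
    The Pearson correlation used to relate pairs of water parameters can be written out directly, as in the sketch below. The conductivity and alkalinity readings are hypothetical values for five sites, not measurements from the river Varuna.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(var_x * var_y)

# Hypothetical readings at five sampling sites.
conductivity = [410, 455, 520, 600, 640]   # electrical conductivity
alkalinity = [180, 200, 230, 260, 285]     # total alkalinity
r = pearson_r(conductivity, alkalinity)
```

    The sign of the coefficient gives the direction of the relationship and its magnitude the strength; a matrix of such coefficients over all parameter pairs is what a correlation analysis of this kind typically reports.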

  15. Cluster Analysis of Clinical Data Identifies Fibromyalgia Subgroups

    Science.gov (United States)

    Docampo, Elisa; Collado, Antonio; Escaramís, Geòrgia; Carbonell, Jordi; Rivera, Javier; Vidal, Javier; Alegre, José

    2013-01-01

    Introduction Fibromyalgia (FM) is mainly characterized by widespread pain and multiple accompanying symptoms, which hinder FM assessment and management. In order to reduce FM heterogeneity we classified clinical data into simplified dimensions that were used to define FM subgroups. Material and Methods 48 variables were evaluated in 1,446 Spanish FM cases fulfilling 1990 ACR FM criteria. A partitioning analysis was performed to find groups of variables similar to each other. Similarities between variables were identified and the variables were grouped into dimensions. This was performed in a subset of 559 patients, and cross-validated in the remaining 887 patients. For each sample and dimension, a composite index was obtained based on the weights of the variables included in the dimension. Finally, a clustering procedure was applied to the indexes, resulting in FM subgroups. Results Variables clustered into three independent dimensions: “symptomatology”, “comorbidities” and “clinical scales”. Only the two first dimensions were considered for the construction of FM subgroups. Resulting scores classified FM samples into three subgroups: low symptomatology and comorbidities (Cluster 1), high symptomatology and comorbidities (Cluster 2), and high symptomatology but low comorbidities (Cluster 3), showing differences in measures of disease severity. Conclusions We have identified three subgroups of FM samples in a large cohort of FM by clustering clinical data. Our analysis stresses the importance of family and personal history of FM comorbidities. Also, the resulting patient clusters could indicate different forms of the disease, relevant to future research, and might have an impact on clinical assessment. PMID:24098674

  16. Practice-related changes in neural activation patterns investigated via wavelet-based clustering analysis

    Science.gov (United States)

    Lee, Jinae; Park, Cheolwoo; Dyckman, Kara A.; Lazar, Nicole A.; Austin, Benjamin P.; Li, Qingyang; McDowell, Jennifer E.

    2012-01-01

    Objectives To evaluate brain activation using functional magnetic resonance imaging (fMRI) and specifically, activation changes across time associated with practice-related cognitive control during eye movement tasks. Experimental design Participants were engaged in antisaccade performance (generating a glance away from a cue) while fMR images were acquired during two separate time points: 1) at pre-test before any exposure to the task, and 2) at post-test, after one week of daily practice on antisaccades, prosaccades (glancing towards a target) or fixation (maintaining gaze on a target). Principal observations The three practice groups were compared across the two time points, and analyses were conducted via the application of a model-free clustering technique based on wavelet analysis. This series of procedures was developed to avoid analysis problems inherent in fMRI data and was composed of several steps: detrending, data aggregation, wavelet transform and thresholding, no trend test, principal component analysis and K-means clustering. The main clustering algorithm was built in the wavelet domain to account for temporal correlation. We applied a no trend test based on wavelets to significantly reduce the high dimension of the data. We clustered the thresholded wavelet coefficients of the remaining voxels using the principal component analysis K-means clustering. Conclusion Over the series of analyses, we found that the antisaccade practice group was the only group to show decreased activation from pre- to post-test in saccadic circuitry, particularly evident in supplementary eye field, frontal eye fields, superior parietal lobe, and cuneus. PMID:22505290

  17. Cluster analysis as a prediction tool for pregnancy outcomes.

    Science.gov (United States)

    Banjari, Ines; Kenjerić, Daniela; Šolić, Krešimir; Mandić, Milena L

    2015-03-01

    Considering specific physiology changes during gestation and thinking of pregnancy as a "critical window", classification of pregnant women at early pregnancy can be considered as crucial. The paper demonstrates the use of a method based on an approach from intelligent data mining, cluster analysis. Cluster analysis is a statistical method which makes it possible to group individuals based on sets of identifying variables. The method was chosen in order to determine the possibility of classifying pregnant women at early pregnancy and to analyze unknown correlations between different variables so that certain outcomes could be predicted. 222 pregnant women from two general obstetric offices were recruited. The main focus was on characteristics of these pregnant women: their age, pre-pregnancy body mass index (BMI) and haemoglobin value. Cluster analysis achieved a 94.1% classification accuracy rate with three branches or groups of pregnant women showing statistically significant correlations with pregnancy outcomes. The results show that pregnant women of both older age and higher pre-pregnancy BMI have a significantly higher incidence of delivering a baby of higher birth weight, but they gain significantly less weight during pregnancy. Their babies are also longer, and these women have a significantly higher probability of complications during pregnancy (gestosis) and a higher probability of induced or caesarean delivery. We can conclude that the cluster analysis method can appropriately classify pregnant women at early pregnancy to predict certain outcomes.

  18. Language Learner Motivational Types: A Cluster Analysis Study

    Science.gov (United States)

    Papi, Mostafa; Teimouri, Yasser

    2014-01-01

    The study aimed to identify different second language (L2) learner motivational types drawing on the framework of the L2 motivational self system. A total of 1,278 secondary school students learning English in Iran completed a questionnaire survey. Cluster analysis yielded five different groups based on the strength of different variables within…

  19. Characterization of population exposure to organochlorines: A cluster analysis application

    NARCIS (Netherlands)

    R.M. Guimarães (Raphael Mendonça); S. Asmus (Sven); A. Burdorf (Alex)

    2013-01-01

    This study aimed to show the results from a cluster analysis application in the characterization of population exposure to organochlorines through variables related to time and exposure dose. Characteristics of 354 subjects in a population exposed to organochlorine pesticides residues

  20. Cluster analysis for validated climatology stations using precipitation in Mexico

    NARCIS (Netherlands)

    Bravo Cabrera, J. L.; Azpra-Romero, E.; Zarraluqui-Such, V.; Gay-García, C.; Estrada Porrúa, F.

    2012-01-01

    Annual average of daily precipitation was used to group climatological stations into clusters using the k-means procedure and principal component analysis with varimax rotation. After a careful selection of the stations deployed in Mexico since 1950, we selected 349 characterized by having 35 to 40

  1. cluster

    Indian Academy of Sciences (India)

    has been investigated electrochemically in positive and negative microenvironments, both in solution and in film. Charge nature around the active centre ... in plants, bacteria and also in mammals. This cluster is also an important constituent of a ..... selection of non-cysteine amino acid in the active centre of Rieske proteins.

  2. Multiscale recurrence analysis of spatio-temporal data

    Science.gov (United States)

    Riedl, M.; Marwan, N.; Kurths, J.

    2015-12-01

    The description and analysis of spatio-temporal dynamics is a crucial task in many scientific disciplines. In this work, we propose a method which uses the mapogram as a similarity measure between spatially distributed data instances at different time points. The resulting similarity values of the pairwise comparison are used to construct a recurrence plot in order to benefit from established tools of recurrence quantification analysis and recurrence network analysis. In contrast to other recurrence tools for this purpose, the mapogram approach allows a specific focus on different spatial scales that can be used in a multi-scale analysis of spatio-temporal dynamics. We illustrate this approach by applying it to mixed dynamics, such as traveling parallel wave fronts with additive noise, as well as to more complicated examples: pseudo-random numbers and coupled map lattices with a semi-logistic mapping rule. The complicated examples in particular show the usefulness of the multi-scale consideration for taking spatial patterns of different scales and with different rhythms into account. The mapogram approach thus promises new insights into problems of climatology, ecology, or medicine.

  3. K-means cluster analysis and seismicity partitioning for Pakistan

    Science.gov (United States)

    Rehman, Khaista; Burton, Paul W.; Weatherill, Graeme A.

    2014-07-01

    Pakistan and the western Himalaya is a region of high seismic activity located at the triple junction between the Arabian, Eurasian and Indian plates. Four devastating earthquakes have resulted in significant numbers of fatalities in Pakistan and the surrounding region in the past century (Quetta, 1935; Makran, 1945; Pattan, 1974 and the recent 2005 Kashmir earthquake). It is therefore necessary to develop an understanding of the spatial distribution of seismicity and the potential seismogenic sources across the region. This forms an important basis for the calculation of seismic hazard; a crucial input in seismic design codes needed to begin to effectively mitigate the high earthquake risk in Pakistan. The development of seismogenic source zones for seismic hazard analysis is driven by both geological and seismotectonic inputs. Despite the many developments in seismic hazard in recent decades, the manner in which seismotectonic information feeds the definition of the seismic source can, in many parts of the world including Pakistan and the surrounding regions, remain a subjective process driven primarily by expert judgment. Whilst much research is ongoing to map and characterise active faults in Pakistan, knowledge of the seismogenic properties of the active faults is still incomplete in much of the region. Consequently, seismicity, both historical and instrumental, remains a primary guide to the seismogenic sources of Pakistan. This study utilises a cluster analysis approach for the purposes of identifying spatial differences in seismicity, which can be utilised to form a basis for delineating seismogenic source regions. An effort is made to examine seismicity partitioning for Pakistan with respect to earthquake database, seismic cluster analysis and seismic partitions in a seismic hazard context. A magnitude homogenous earthquake catalogue has been compiled using various available earthquake data. The earthquake catalogue covers a time span from 1930 to 2007 and

  4. The ROCKSTAR Phase-space Temporal Halo Finder and the Velocity Offsets of Cluster Cores

    Science.gov (United States)

    Behroozi, Peter S.; Wechsler, Risa H.; Wu, Hao-Yi

    2013-01-01

    We present a new algorithm for identifying dark matter halos, substructure, and tidal features. The approach is based on adaptive hierarchical refinement of friends-of-friends groups in six phase-space dimensions and one time dimension, which allows for robust (grid-independent, shape-independent, and noise-resilient) tracking of substructure; as such, it is named ROCKSTAR (Robust Overdensity Calculation using K-Space Topologically Adaptive Refinement). Our method is massively parallel (up to 10^5 CPUs) and runs on the largest current simulations (>10^10 particles) with high efficiency (10 CPU hours and 60 gigabytes of memory required per billion particles analyzed). A previous paper has shown ROCKSTAR to have excellent recovery of halo properties; we expand on these comparisons with more tests and higher-resolution simulations. We show a significant improvement in substructure recovery compared to several other halo finders and discuss the theoretical and practical limits of simulations in this regard. Finally, we present results that demonstrate conclusively that dark matter halo cores are not at rest relative to the halo bulk or substructure average velocities and have coherent velocity offsets across a wide range of halo masses and redshifts. For massive clusters, these offsets can be up to 350 km s^-1 at z = 0 and even higher at high redshifts. Our implementation is publicly available at http://code.google.com/p/rockstar.

  5. Outcome-Driven Cluster Analysis with Application to Microarray Data.

    Directory of Open Access Journals (Sweden)

    Jessie J Hsu

    One goal of cluster analysis is to sort characteristics into groups (clusters) so that those in the same group are more highly correlated to each other than they are to those in other groups. An example is the search for groups of genes whose expression of RNA is correlated in a population of patients. These genes would be of greater interest if their common level of RNA expression were additionally predictive of the clinical outcome. This issue arose in the context of a study of trauma patients on whom RNA samples were available. The question of interest was whether there were groups of genes that were behaving similarly, and whether each gene in the cluster would have a similar effect on who would recover. For this, we develop an algorithm to simultaneously assign characteristics (genes) into groups of highly correlated genes that have the same effect on the outcome (recovery). We propose a random effects model where the genes within each group (cluster) equal the sum of a random effect, specific to the observation and cluster, and an independent error term. The outcome variable is a linear combination of the random effects of each cluster. To fit the model, we implement a Markov chain Monte Carlo algorithm based on the likelihood of the observed data. We evaluate the effect of including outcome in the model through simulation studies and describe a strategy for prediction. These methods are applied to trauma data from the Inflammation and Host Response to Injury research program, revealing a clustering of the genes that are informed by the recovery outcome.

  6. EFFICIENCY OF SMES IN ROMANIA POST CRISIS. A CLUSTERING ANALYSIS

    Directory of Open Access Journals (Sweden)

    Cristina SUCIU

    2014-06-01

    Small and medium-sized enterprises (SMEs) have made, even during the economic crisis, a major contribution to gross domestic product, to job creation, and to economic efficiency, by stimulating competition through their speed of adaptation to conditions, their adoption of new strategies, and their ability to adapt to market requirements. Although several hundred thousand companies in Romania were suspended or cancelled at the beginning of the economic crisis, a revival of SMEs has been observed since 2012. We could say that the post-crisis period, thanks to measures in support of SMEs, marks the beginning of an economic boost for SMEs in Romania. Cluster analysis is a multivariate analysis technique that includes a number of algorithms for classifying objects into homogeneous groups. Analysing the effectiveness of SMEs in Romania using cluster analysis is a new method of economic analysis that enables a mathematical analysis of the regional development of SMEs and of the increase in their competitiveness.

  7. Spatial and temporal estimation of soil loss for the sustainable management of a wet semi-arid watershed cluster.

    Science.gov (United States)

    Rejani, R; Rao, K V; Osman, M; Srinivasa Rao, Ch; Reddy, K Sammi; Chary, G R; Pushpanjali; Samuel, Josily

    2016-03-01

    The ungauged wet semi-arid watershed cluster, Seethagondi, lies in the Adilabad district of Telangana in India and is prone to severe erosion and water scarcity. The runoff and soil loss data at watershed, catchment, and field level are necessary for planning soil and water conservation interventions. In this study, an attempt was made to develop a spatial soil loss estimation model for Seethagondi cluster using RUSLE coupled with ArcGIS and was used to estimate the soil loss spatially and temporally. The daily rainfall data of Aphrodite for the period from 1951 to 2007 was used, and the annual rainfall varied from 508 to 1351 mm with a mean annual rainfall of 950 mm and a mean erosivity of 6789 MJ mm ha(-1) h(-1) year(-1). Considerable variation in land use land cover especially in crop land and fallow land was observed during normal and drought years, and corresponding variation in the erosivity, C factor, and soil loss was also noted. The mean value of C factor derived from NDVI for crop land was 0.42 and 0.22 in normal year and drought years, respectively. The topography is undulating and major portion of the cluster has slope less than 10°, and 85.3% of the cluster has soil loss below 20 t ha(-1) year(-1). The soil loss from crop land varied from 2.9 to 3.6 t ha(-1) year(-1) in low rainfall years to 31.8 to 34.7 t ha(-1) year(-1) in high rainfall years with a mean annual soil loss of 12.2 t ha(-1) year(-1). The soil loss from crop land was higher in the month of August with an annual soil loss of 13.1 and 2.9 t ha(-1) year(-1) in normal and drought year, respectively. Based on the soil loss in a normal year, the interventions recommended for 85.3% of area of the watershed include agronomic measures such as contour cultivation, graded bunds, strip cropping, mixed cropping, crop rotations, mulching, summer plowing, vegetative bunds, agri-horticultural system, and management practices such as broad bed furrow, raised sunken beds, and harvesting available water

  8. Spatio-temporal analysis of regional PV generation

    DEFF Research Database (Denmark)

    Nuño Martinez, Edgar; Cutululis, Nicolaos Antonio

    2016-01-01

    Photovoltaic (PV) power is growing in importance worldwide and hence needs to be represented in operation and planning of power system. As opposed to traditional generation technologies, it is characterized by exhibiting both a high variability and a significant spatial dependence. This paper...... presents a fundamental analysis of regional solar generation time series, aiming to potentially facilitate large-scale solar integration. It will focus on characterizing the underlying dependence structure at the system level as well as describing both statistical and temporal properties of regional PV...

  9. Data Clustering

    Science.gov (United States)

    Wagstaff, Kiri L.

    2012-03-01

    On obtaining a new data set, the researcher is immediately faced with the challenge of obtaining a high-level understanding from the observations. What does a typical item look like? What are the dominant trends? How many distinct groups are included in the data set, and how is each one characterized? Which observable values are common, and which rarely occur? Which items stand out as anomalies or outliers from the rest of the data? This challenge is exacerbated by the steady growth in data set size [11] as new instruments push into new frontiers of parameter space, via improvements in temporal, spatial, and spectral resolution, or by the desire to "fuse" observations from different modalities and instruments into a larger-picture understanding of the same underlying phenomenon. Data clustering algorithms provide a variety of solutions for this task. They can generate summaries, locate outliers, compress data, identify dense or sparse regions of feature space, and build data models. It is useful to note up front that "clusters" in this context refer to groups of items within some descriptive feature space, not (necessarily) to "galaxy clusters" which are dense regions in physical space. The goal of this chapter is to survey a variety of data clustering methods, with an eye toward their applicability to astronomical data analysis. In addition to improving the individual researcher’s understanding of a given data set, clustering has led directly to scientific advances, such as the discovery of new subclasses of stars [14] and gamma-ray bursts (GRBs) [38]. All clustering algorithms seek to identify groups within a data set that reflect some observed, quantifiable structure. Clustering is traditionally an unsupervised approach to data analysis, in the sense that it operates without any direct guidance about which items should be assigned to which clusters. There has been a recent trend in the clustering literature toward supporting semisupervised or constrained

  10. Cosmological analysis of galaxy clusters surveys in X-rays

    International Nuclear Information System (INIS)

    Clerc, N.

    2012-01-01

    Clusters of galaxies are the most massive objects in equilibrium in our Universe. Their study makes it possible to test cosmological scenarios of structure formation with precision, bringing constraints complementary to those stemming from the cosmological background radiation, supernovae or galaxies. They are identified through the X-ray emission of their heated gas, which facilitates their mapping at different epochs of the Universe. This report presents two surveys of galaxy clusters detected in X-rays and puts forward a method for their cosmological interpretation. Thanks to its multi-wavelength coverage extending over 10 sq. deg. and after one decade of expertise, the XMM-LSS allows a systematic census of clusters in a large volume of the Universe. In the framework of this survey, the first part of this report describes the techniques developed for characterizing the detected objects. A particular emphasis is placed on the most distant ones (z ≥ 1) through the complementarity of observations in X-ray, optical and infrared bands. Then the X-CLASS survey is fully described. Based on XMM archival data, it provides a new catalogue of 800 clusters detected in X-rays. A cosmological analysis of this survey is performed using 'CR-HR' diagrams. This new method self-consistently includes selection effects and scaling relations and provides a means to bypass the computation of individual cluster masses. Proposals are made for applying this method to future surveys such as XMM-XXL and eRosita. (author)

  11. Fuzzy cluster analysis of air quality in Beijing district

    Science.gov (United States)

    Liu, Hongkai

    2018-02-01

    This article applies the principle of fuzzy clustering analysis: using the transitive closure method, the main air pollutants in 17 districts of Beijing from 2014 to 2016 were classified. The results of the analysis reflect the changes in the main air pollutants in Beijing over these nearly three years. This can provide a scientific basis and data support for atmospheric governance in the Beijing area.
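
    The transitive closure method named in this abstract can be illustrated with a minimal sketch. It assumes a reflexive, symmetric fuzzy similarity matrix as input; the three-item matrix and the threshold below are made-up values, not data from the article, and the function names are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def max_min_compose(a: Matrix, b: Matrix) -> Matrix:
    """Max-min composition of two square fuzzy relations."""
    n = len(a)
    return [
        [max(min(a[i][k], b[k][j]) for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def transitive_closure(relation: Matrix) -> Matrix:
    """Square a reflexive fuzzy relation repeatedly until it stops changing."""
    current = relation
    while True:
        squared = max_min_compose(current, current)
        if squared == current:
            return current
        current = squared


def lambda_cut_clusters(closure: Matrix, lam: float) -> List[List[int]]:
    """Group items whose closure similarity is at least lam."""
    clusters: List[List[int]] = []
    assigned = set()
    for i in range(len(closure)):
        if i in assigned:
            continue
        group = [j for j in range(len(closure)) if closure[i][j] >= lam]
        assigned.update(group)
        clusters.append(group)
    return clusters


# Hypothetical similarity scores between three items (illustrative only).
similarity = [
    [1.0, 0.8, 0.2],
    [0.8, 1.0, 0.4],
    [0.2, 0.4, 1.0],
]
closure = transitive_closure(similarity)
print(lambda_cut_clusters(closure, 0.5))
```

    Lowering the threshold merges clusters and raising it splits them, so sweeping the threshold yields a hierarchy of partitions.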

  12. [Spatial and temporal clustering characteristics of typhoid and paratyphoid fever and its change pattern in 3 provinces in southwestern China, 2001-2012].

    Science.gov (United States)

    Wang, L X; Yang, B; Yan, M Y; Tang, Y Q; Liu, Z C; Wang, R Q; Li, S; Ma, L; Kan, B

    2017-11-10

    Objective: To analyze the spatial and temporal clustering characteristics of typhoid and paratyphoid fever and its change pattern in Yunnan, Guizhou and Guangxi provinces in southwestern China in recent years. Methods: The incidence data of typhoid and paratyphoid fever cases at county level in 3 provinces during 2001-2012 were collected from China Information System for Diseases Control and Prevention and analyzed by the methods of descriptive epidemiology and geographic informatics. A map showing the spatial and temporal clustering characteristics of typhoid and paratyphoid fever cases in the three provinces was drawn. SaTScan scan statistics were used to identify the typhoid and paratyphoid fever clustering areas of the three provinces in each year from 2001 to 2012. Results: During the study period, the reported cases of typhoid and paratyphoid fever declined year by year. The reported incidence decreased from 30.15 per 100 000 in 2001 to 10.83 per 100 000 in 2006 (annual incidence 21.12 per 100 000); while during 2007-2012, the incidence became stable, ranging from 4.75 per 100 000 to 6.83 per 100 000 (annual incidence 5.73 per 100 000). The seasonal variation of the incidence was consistent in the three provinces, with the majority of cases occurring in summer and autumn. The spatial and temporal clustering of typhoid and paratyphoid fever was demonstrated by the incidence map. Most high-incidence counties were located in a zonal area extending from Yuxi of Yunnan to Guiyang of Guizhou, but were concentrated in Guilin in Guangxi. Temporal and spatial scan statistics identified the positional shifting of the class Ⅰ clustering area from Guizhou to Yunnan. The class Ⅰ clustering area was located around the central and western areas (Zunyi and Anshun) of Guizhou during 2001-2003, and moved to the central area of Yunnan during 2004-2012. Conclusion: Spatial and temporal clustering of typhoid and paratyphoid fever existed in the endemic areas of southwestern China, and the clustering area

  13. DGA Clustering and Analysis: Mastering Modern, Evolving Threats, DGALab

    Directory of Open Access Journals (Sweden)

    Alexander Chailytko

    2016-05-01

    Domain Generation Algorithms (DGA) are a basic building block used in almost all modern malware. Malware researchers have attempted to tackle the DGA problem with various tools and techniques, with varying degrees of success. We present a complex solution to populate a DGA feed using reversed DGAs, third-party feeds, and a smart DGA extraction and clustering based on emulation of a large number of samples. Smart DGA extraction requires no reverse engineering and works regardless of the DGA type or initialization vector, while enabling a cluster-based analysis. Our method also automatically allows analysis of a whole malware family, a specific campaign, etc. We present our system and demonstrate its abilities on more than 20 malware families. This includes showing connections between different campaigns, as well as comparing results. Most importantly, we discuss how to utilize the outcome of the analysis to create smarter protections against similar malware.

  14. Microglia Morphological Categorization in a Rat Model of Neuroinflammation by Hierarchical Cluster and Principal Components Analysis.

    Science.gov (United States)

    Fernández-Arjona, María Del Mar; Grondona, Jesús M; Granados-Durán, Pablo; Fernández-Llebrez, Pedro; López-Ávalos, María D

    2017-01-01

    It is known that microglia morphology and function are closely related, but only a few studies have objectively described different morphological subtypes. To address this issue, morphological parameters of microglial cells were analyzed in a rat model of aseptic neuroinflammation. After the injection of a single dose of the enzyme neuraminidase (NA) within the lateral ventricle (LV), an acute inflammatory process occurs. Sections from NA-injected animals and sham controls were immunolabeled with the microglial marker IBA1, which highlights ramifications and features of the cell shape. Using images obtained by section scanning, individual microglial cells were sampled from various regions (septofimbrial nucleus, hippocampus and hypothalamus) at different times post-injection (2, 4 and 12 h). Each cell yielded a set of 15 morphological parameters by means of image analysis software. Five initial parameters (including fractal measures) were statistically different in cells from NA-injected rats (most of them IL-1β positive, i.e., M1-state) compared to those from control animals (none of them IL-1β positive, i.e., surveillant state). However, additional multimodal parameters proved more suitable for hierarchical cluster analysis (HCA). This method classified the microglia population into four clusters. Furthermore, a linear discriminant analysis (LDA) suggested three specific parameters to objectively classify any microglia by a decision tree. In addition, a principal components analysis (PCA) revealed two extra valuable variables that allowed further classification of microglia into a total of eight sub-clusters or types. The spatio-temporal distribution of these different morphotypes in our rat inflammation model made it possible to relate specific morphotypes to microglial activation status and brain location. An objective method for microglia classification based on morphological parameters is proposed. Main points Microglia undergo a quantifiable

  15. Visual Analysis and Processing of Clusters Structures in Multidimensional Datasets

    Science.gov (United States)

    Bondarev, A. E.

    2017-05-01

    The article is devoted to problems of visual analysis of cluster structures in multidimensional datasets. For the visual analysis, the elastic maps approach [1,2] is applied. This approach is well suited to processing and visualizing multidimensional datasets. To analyze clusters in the original data volume, elastic maps are used as a method of mapping the original data points to embedded manifolds of lower dimensionality. By decreasing the elasticity parameters one can design a map surface that approximates the multidimensional dataset in question much better. The points of the dataset are then projected onto the map. Unfolding the designed map onto a flat plane gives insight into the cluster structure of the multidimensional dataset. The elastic maps approach does not require any a priori information about the data in question and does not depend on the nature or origin of the data. Elastic maps are usually combined with the PCA approach. Presented in the space spanned by the first three principal components, elastic maps provide quite good results. The article describes the results of applying the elastic maps approach to the visual analysis of clusters in different multidimensional datasets, including medical data.

  16. Full text clustering and relationship network analysis of biomedical publications.

    Directory of Open Access Journals (Sweden)

    Renchu Guan

    Rapid developments in the biomedical sciences have increased the demand for automatic clustering of biomedical publications. In contrast to current approaches to text clustering, which focus exclusively on the contents of abstracts, a novel method is proposed for clustering and analysis of complete biomedical article texts. To reduce dimensionality, Cosine Coefficient is used on a sub-space of only two vectors, instead of computing the Euclidean distance within the space of all vectors. Then a strategy and algorithm is introduced for Semi-supervised Affinity Propagation (SSAP) to improve analysis efficiency, using biomedical journal names as an evaluation background. Experimental results show that by avoiding high-dimensional sparse matrix computations, SSAP outperforms conventional k-means methods and improves upon the standard Affinity Propagation algorithm. In constructing a directed relationship network and distribution matrix for the clustering results, it can be noted that overlaps in scope and interests among BioMed publications can be easily identified, providing a valuable analytical tool for editors, authors and readers.
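
    As a rough illustration of why a cosine coefficient can avoid high-dimensional sparse matrix work, the sketch below compares two documents using only the terms they actually contain. It is not the authors' SSAP implementation; the sparse term-count representation, the function name, and the sample strings are assumptions made for this example.

```python
import math
from collections import Counter
from typing import Mapping


def cosine_coefficient(a: Mapping[str, float], b: Mapping[str, float]) -> float:
    """Cosine similarity of two sparse term-weight vectors.

    Only terms present in the two documents are touched, so the
    full vocabulary never has to be materialised as a dense vector.
    """
    shared_terms = set(a) & set(b)
    dot = sum(a[term] * b[term] for term in shared_terms)
    norm_a = math.sqrt(sum(weight * weight for weight in a.values()))
    norm_b = math.sqrt(sum(weight * weight for weight in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical toy documents, tokenised by whitespace.
doc_1 = Counter("gene expression cluster gene".split())
doc_2 = Counter("gene cluster analysis".split())
print(cosine_coefficient(doc_1, doc_2))
```

    A pairwise similarity of this kind could serve as the input to an affinity-based clustering method, which works from similarities between pairs of items rather than from coordinates in the full term space.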

  17. Mobility in Europe: Recent Trends from a Cluster Analysis

    Directory of Open Access Journals (Sweden)

    Ioana Manafi

    2017-08-01

    During the past decade, Europe was confronted with major changes and events offering large opportunities for mobility. The EU enlargement process, the EU policies regarding youth, the economic crisis affecting national economies on different levels, political instabilities in some European countries, high rates of unemployment or the increasing number of refugees are only a few of the factors influencing net migration in Europe. Based on a set of socio-economic indicators for EU/EFTA countries and cluster analysis, the paper provides an overview of regional differences across European countries, related to migration magnitude in the identified clusters. The obtained clusters are in accordance with previous studies in migration, and appear stable during the period of 2005-2013, with only some exceptions. The analysis revealed three country clusters: EU/EFTA center-receiving countries, EU/EFTA periphery-sending countries and EU/EFTA outlier countries, the names suggesting not only the geographical position within Europe, but the trends in net migration flows during the years. Therewith, the results provide evidence for the persistence of a movement from periphery to center countries, which is correlated with recent flows of mobility in Europe.

  18. The Productivity Analysis of Chennai Automotive Industry Cluster

    Science.gov (United States)

    Bhaskaran, E.

    2014-07-01

    Chennai, also called the Detroit of India, is India's second fastest growing auto market and exports auto components and vehicles to the US, Germany, Japan and Brazil. For inclusive growth and sustainable development, 250 auto component industries in Ambattur, Thirumalisai and Thirumudivakkam Industrial Estates located in Chennai have adopted the Cluster Development Approach called Automotive Component Cluster. The objective is to study the Value Chain, Correlation and Data Envelopment Analysis by determining technical efficiency, peer weights, input and output slacks of 100 auto component industries in three estates. The methodology adopted is Data Envelopment Analysis with the output-oriented Banker-Charnes-Cooper model, taking net worth, fixed assets and employment as inputs and gross output as the output. The non-zero values represent the weights for efficient clusters. The higher slack obtained reveals the excess net worth, fixed assets and employment and the shortage in gross output. To conclude, the variables are highly correlated and the inefficient industries should increase their gross output or decrease the fixed assets or employment. Moreover, for sustainable development, the cluster should strengthen infrastructure, technology, procurement, production and marketing interrelationships to decrease costs and to increase productivity and efficiency to compete in the indigenous and export market.

  19. Kinematic gait patterns in healthy runners: A hierarchical cluster analysis.

    Science.gov (United States)

    Phinyomark, Angkoon; Osis, Sean; Hettinga, Blayne A; Ferber, Reed

    2015-11-05

    Previous studies have demonstrated distinct clusters of gait patterns in both healthy and pathological groups, suggesting that different movement strategies may be represented. However, these studies have used discrete time point variables and usually focused on only one specific joint and plane of motion. Therefore, the first purpose of this study was to determine if running gait patterns for healthy subjects could be classified into homogeneous subgroups using three-dimensional kinematic data from the ankle, knee, and hip joints. The second purpose was to identify differences in joint kinematics between these groups. The third purpose was to investigate the practical implications of clustering healthy subjects by comparing these kinematics with runners experiencing patellofemoral pain (PFP). A principal component analysis (PCA) was used to reduce the dimensionality of the entire gait waveform data and then a hierarchical cluster analysis (HCA) determined group sets of similar gait patterns and homogeneous clusters. The results show two distinct running gait patterns were found with the main between-group differences occurring in frontal and sagittal plane knee angles (Pgait strategies. These results suggest care must be taken when selecting samples of subjects in order to investigate the pathomechanics of injured runners. Copyright © 2015 Elsevier Ltd. All rights reserved.

  20. Latent cluster analysis of ALS phenotypes identifies prognostically differing groups.

    Directory of Open Access Journals (Sweden)

    Jeban Ganesalingam

    2009-09-01

    Amyotrophic lateral sclerosis (ALS) is a degenerative disease predominantly affecting motor neurons and manifesting as several different phenotypes. Whether these phenotypes correspond to different underlying disease processes is unknown. We used latent cluster analysis to identify groupings of clinical variables in an objective and unbiased way to improve phenotyping for clinical and research purposes. Latent class cluster analysis was applied to a large database consisting of 1467 records of people with ALS, using discrete variables which can be readily determined at the first clinic appointment. The model was tested for clinical relevance by survival analysis of the phenotypic groupings using the Kaplan-Meier method. The best model generated five distinct phenotypic classes that strongly predicted survival (p<0.0001). Eight variables were used for the latent class analysis, but a good estimate of the classification could be obtained using just two variables: site of first symptoms (bulbar or limb) and time from symptom onset to diagnosis (p<0.00001). The five phenotypic classes identified using latent cluster analysis can predict prognosis. They could be used to stratify patients recruited into clinical trials and to generate more homogeneous disease groups for genetic, proteomic and risk factor research.
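
    The Kaplan-Meier method mentioned in this abstract can be sketched for a single group as follows. The function name and the five follow-up records are illustrative assumptions, not the study's data; in practice one curve would be estimated per phenotypic class and the curves compared.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(
    durations: Sequence[float], observed: Sequence[bool]
) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each distinct event time.

    observed[i] is True when the event occurred at durations[i] and
    False when subject i was censored (lost to follow-up) at that time.
    """
    records = sorted(zip(durations, observed))
    at_risk = len(records)
    survival = 1.0
    curve: List[Tuple[float, float]] = []
    i = 0
    while i < len(records):
        time = records[i][0]
        events = 0
        leaving = 0
        # Collect every subject whose follow-up ends at this time.
        while i < len(records) and records[i][0] == time:
            events += int(records[i][1])
            leaving += 1
            i += 1
        if events > 0:
            survival *= 1.0 - events / at_risk
            curve.append((time, survival))
        at_risk -= leaving
    return curve


# Hypothetical follow-up times (e.g. months) and event indicators.
times = [1, 2, 2, 3, 4]
events = [True, True, False, True, False]
for time, probability in kaplan_meier(times, events):
    print(time, round(probability, 3))
```

    Censored subjects lower the number at risk without producing a step in the curve, which is what distinguishes this estimator from a simple proportion of survivors.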

  1. The Quantitative Analysis of Chennai Automotive Industry Cluster

    Science.gov (United States)

    Bhaskaran, Ethirajan

    2016-07-01

    Chennai is also called the Detroit of India owing to the presence of an automotive industry producing over 40 % of India's vehicles and components. During 2001-2002, the Automotive Component Industries (ACI) in the Ambattur, Thirumalizai and Thirumudivakkam Industrial Estates, Chennai, faced problems with infrastructure, technology, procurement, production and marketing. The objective is to study the Quantitative Performance of Chennai Automotive Industry Cluster before (2001-2002) and after the CDA (2008-2009). The methodology adopted is collection of primary data from 100 ACI using a quantitative questionnaire and analysis using Correlation Analysis (CA), Regression Analysis (RA), the Friedman Test (FMT), and the Kruskal-Wallis Test (KWT). The CA computed for the different sets of variables reveals that there is a high degree of relationship between the variables studied. The RA models constructed establish the strong relationship between the dependent variable and a host of independent variables. The models proposed here reveal the approximate relationship in a closer form. The KWT shows that there is no significant difference between the three location clusters with respect to Net Profit, Production Cost, Marketing Costs, Procurement Costs and Gross Output. This supports the view that each location has contributed uniformly to the development of the automobile component cluster. The FMT shows that there is no significant difference between industrial units in respect of costs such as Production, Infrastructure, Technology and Marketing, and of Net Profit. To conclude, the Automotive Industries have fully utilized the Physical Infrastructure and Centralised Facilities by adopting CDA and are now exporting their products to North America, South America, Europe, Australia, Africa and Asia. The value chain analysis models have been implemented in all the cluster units. This Cluster Development Approach (CDA) model can be implemented in industries of underdeveloped and developing countries for cost reduction and productivity

  2. Extended-range forecast for the temporal distribution of clustering tropical cyclogenesis over the western North Pacific

    Science.gov (United States)

    Zhu, Zhiwei; Li, Tim; Bai, Long; Gao, Jianyun

    2017-11-01

    Based on outgoing longwave radiation (OLR), an index for clustering tropical cyclogenesis (CTC) over the western North Pacific (WNP) was defined. Around 76 % of total CTC events were generated during the active phase of the CTC index, and 38 % of the total active phase was concurrent with CTC events. Owing to its continuous property, the CTC index was used as the representative predictand for extended-range forecasting of the temporal distribution of CTC events. The predictability sources for CTC events were detected via correlation analyses of the previous 35-5-day lead atmospheric fields against the CTC index. The results showed that the geopotential height at different levels and the 200 hPa zonal wind over the global tropics possessed large predictability sources, whereas the predictability sources of other variables, e.g., OLR, zonal wind, and relative vorticity at 850 hPa and relative humidity at 700 hPa, were mainly confined to the tropical Indian Ocean and western Pacific Ocean. Several spatial-temporal projection model (STPM) sets were constructed to carry out the extended-range forecast for the CTC index. By combining the output of STPMs separately conducted for the two dominant modes of intraseasonal variability, namely the 10-30 and the 30-80 day modes, useful forecast skill could be achieved for a 30-day lead time. The combined output successfully captured both the 10-30 and 30-80 day modes at least 10 days in advance. With a relatively low rate of false alarm, the STPM achieved hits for 80 % (69 %) of 54 CTC events during 2003-2014 at the 10-day (20-day) lead time, suggesting a practical value of the STPM for real-time forecasting of WNP CTC events at an extended range.

  3. Virgo cluster and field dwarf ellipticals in 3D - III. Spatially and temporally resolved stellar populations

    NARCIS (Netherlands)

    Ryś, Agnieszka; Koleva, Mina; Falcón-Barroso, Jesús; Vazdekis, Alexandre; Lisker, Thorsten; Peletier, Reynier; van de Ven, Glenn

    2015-01-01

    We present the stellar population analysis of a sample of 12 dwarf elliptical galaxies, observed with the SAURON integral field unit, using the full-spectrum fitting method. We show that star formation histories (SFHs) resolved into two populations can be recovered even within a limited wavelength

  4. Applications of cluster analysis to the creation of perfectionism profiles: a comparison of two clustering approaches.

    Science.gov (United States)

    Bolin, Jocelyn H; Edwards, Julianne M; Finch, W Holmes; Cassady, Jerrell C

    2014-01-01

    Although traditional clustering methods (e.g., K-means) have been shown to be useful in the social sciences it is often difficult for such methods to handle situations where clusters in the population overlap or are ambiguous. Fuzzy clustering, a method already recognized in many disciplines, provides a more flexible alternative to these traditional clustering methods. Fuzzy clustering differs from other traditional clustering methods in that it allows for a case to belong to multiple clusters simultaneously. Unfortunately, fuzzy clustering techniques remain relatively unused in the social and behavioral sciences. The purpose of this paper is to introduce fuzzy clustering to these audiences who are currently relatively unfamiliar with the technique. In order to demonstrate the advantages associated with this method, cluster solutions of a common perfectionism measure were created using both fuzzy clustering and K-means clustering, and the results compared. Results of these analyses reveal that different cluster solutions are found by the two methods, and the similarity between the different clustering solutions depends on the amount of cluster overlap allowed for in fuzzy clustering.
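
    The contrast the abstract draws between hard and fuzzy assignment can be illustrated with a small sketch. It assumes scalar scores, fixed cluster centers, and the standard fuzzy c-means membership formula with fuzzifier m; the function names and the example numbers are hypothetical and are not taken from the paper.

```python
def fuzzy_memberships(x, centers, m=2.0):
    """Fuzzy c-means membership of a scalar x in each cluster (fuzzifier m > 1)."""
    dists = [abs(x - c) for c in centers]
    # A point sitting exactly on a center belongs fully to that cluster.
    if 0.0 in dists:
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    # u_i = 1 / sum_k (d_i / d_k)^(2 / (m - 1)); memberships sum to 1.
    return [1.0 / sum((d_i / d_k) ** power for d_k in dists) for d_i in dists]


def hard_assignment(x, centers):
    """K-means style assignment: index of the single nearest center."""
    return min(range(len(centers)), key=lambda k: abs(x - centers[k]))


centers = [0.0, 10.0]                      # hypothetical cluster centers
print(hard_assignment(5.0, centers))       # one label only
print(fuzzy_memberships(5.0, centers))     # graded membership in both clusters
```

    In this sketch a larger m spreads membership more evenly across clusters, which is one way to think about the "amount of cluster overlap allowed" mentioned in the abstract.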

  5. Applications of Cluster Analysis to the Creation of Perfectionism Profiles: A Comparison of two Clustering Approaches

    Directory of Open Access Journals (Sweden)

    Jocelyn H Bolin

    2014-04-01

    Although traditional clustering methods (e.g., K-means) have been shown to be useful in the social sciences it is often difficult for such methods to handle situations where clusters in the population overlap or are ambiguous. Fuzzy clustering, a method already recognized in many disciplines, provides a more flexible alternative to these traditional clustering methods. Fuzzy clustering differs from other traditional clustering methods in that it allows for a case to belong to multiple clusters simultaneously. Unfortunately, fuzzy clustering techniques remain relatively unused in the social and behavioral sciences. The purpose of this paper is to introduce fuzzy clustering to these audiences who are currently relatively unfamiliar with the technique. In order to demonstrate the advantages associated with this method, cluster solutions of a common perfectionism measure were created using both fuzzy clustering and K-means clustering, and the results compared. Results of these analyses reveal that different cluster solutions are found by the two methods, and the similarity between the different clustering solutions depends on the amount of cluster overlap allowed for in fuzzy clustering.

  6. A two-stage approach to estimate spatial and spatio-temporal disease risks in the presence of local discontinuities and clusters.

    Science.gov (United States)

    Adin, A; Lee, D; Goicoa, T; Ugarte, María Dolores

    2018-01-01

    Disease risk maps for areal unit data are often estimated from Poisson mixed models with local spatial smoothing, for example by incorporating random effects with a conditional autoregressive prior distribution. However, one of the limitations is that local discontinuities in the spatial pattern are not usually modelled, leading to over-smoothing of the risk maps and a masking of clusters of hot/coldspot areas. In this paper, we propose a novel two-stage approach to estimate and map disease risk in the presence of such local discontinuities and clusters. We propose approaches in both spatial and spatio-temporal domains, where for the latter the clusters can either be fixed or allowed to vary over time. In the first stage, we apply an agglomerative hierarchical clustering algorithm to training data to provide sets of potential clusters, and in the second stage, a two-level spatial or spatio-temporal model is applied to each potential cluster configuration. The superiority of the proposed approach with regard to a previous proposal is shown by simulation, and the methodology is applied to two important public health problems in Spain, namely stomach cancer mortality across Spain and brain cancer incidence in the Navarre and Basque Country regions of Spain.

  7. Exploratory analysis of spatial and temporal data a systematic approach

    CERN Document Server

    Andrienko, Natalia

    2006-01-01

    Exploratory data analysis (EDA) is about detecting and describing patterns, trends, and relations in data, motivated by certain purposes of investigation. As something relevant is detected in data, new questions arise, causing specific parts to be viewed in more detail. So EDA has a significant appeal: it involves hypothesis generation rather than mere hypothesis testing. The authors describe in detail and systemize approaches, techniques, and methods for exploring spatial and temporal data in particular. They start by developing a general view of data structures and characteristics and then build on top of this a general task typology, distinguishing between elementary and synoptic tasks. This typology is then applied to the description of existing approaches and technologies, resulting not just in recommendations for choosing methods but in a set of generic procedures for data exploration. Professionals practicing analysis will profit from tested solutions - illustrated in many examples - for reuse in the c...

  8. Geographic information system-based analysis of the spatial and spatio-temporal distribution of zoonotic cutaneous leishmaniasis in Golestan Province, north-east of Iran.

    Science.gov (United States)

    Mollalo, A; Alimohammadi, A; Shirzadi, M R; Malek, M R

    2015-02-01

    Zoonotic cutaneous leishmaniasis (ZCL), a vector-borne disease, poses serious psychological as well as social and economic burden to many rural areas of Iran. The main objectives of this study were to analyse yearly spatial distribution and the possible spatial and spatio-temporal clusters of the disease to better understand spatio-temporal epidemiological aspects of ZCL in rural areas of an endemic province, located in north-east of Iran. Cross-sectional survey was performed on 2983 recorded cases during the period of 2010-2012 at village level throughout the study area. Global clustering methods including the average nearest-neighbour distance, Moran's I, general G indices and Ripley's K-function were applied to investigate the annual spatial distribution of the existing point patterns. Presence of spatial and spatio-temporal clusters was investigated using the spatial and space-time scan statistics. For each year, semivariogram analysis and all global clustering methods indicated meaningful persistent spatial autocorrelation and highly clustered distribution of ZCL, respectively. Eight significant spatial clusters, mainly located in north and northeast of the province, and one space-time cluster, observed in northern part of the province and during the period of September 2010-November 2010, were detected. Comparison of the location of ZCL clusters with environmental conditions of the study area showed that 97.8% of cases in clusters were located at low altitudes below 725 m above sea level with predominantly arid and semi-arid climates and poor socio-economic conditions. The identified clusters highlight high-risk areas requiring special plans and resources for more close monitoring and control of the disease. © 2014 Blackwell Verlag GmbH.
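
    Of the global clustering indices listed in this abstract, Moran's I is the simplest to write out. The following is a minimal sketch of the textbook formula, assuming a user-supplied spatial weight matrix; the helper `chain_weights` and the toy values are hypothetical and unrelated to the study's village data.

```python
def morans_i(values, weights):
    """Global Moran's I for values x_i and a spatial weight matrix w_ij."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    total_weight = sum(sum(row) for row in weights)
    # Weighted sum of cross-products of deviations between all pairs.
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    variance_sum = sum(d * d for d in dev)
    return (n / total_weight) * (cross / variance_sum)


def chain_weights(n):
    """Binary contiguity weights for n locations arranged on a line (assumption)."""
    w = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        w[i][i + 1] = w[i + 1][i] = 1.0
    return w


# Smoothly increasing values give positive autocorrelation,
# alternating values give negative autocorrelation.
print(morans_i([1, 2, 3, 4], chain_weights(4)))
print(morans_i([1, -1, 1, -1], chain_weights(4)))
```

    Values near zero suggest no spatial autocorrelation; significance testing, which the study would also need, is not covered by this sketch.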

  9. Spatio-temporal analysis of blood perfusion by imaging photoplethysmography

    Science.gov (United States)

    Zaunseder, Sebastian; Trumpp, Alexander; Ernst, Hannes; Förster, Michael; Malberg, Hagen

    2018-02-01

    Imaging photoplethysmography (iPPG) has attracted much attention in recent years. The vast majority of works focus on methods to reliably extract the heart rate from videos. Only a few works have addressed iPPG's ability to exploit spatio-temporal perfusion patterns to derive further diagnostic statements. This work is directed at the spatio-temporal analysis of blood perfusion from videos. We present a novel algorithm that is based on the two-dimensional representation of the blood pulsation (perfusion map). The basic idea behind the proposed algorithm is a pairwise estimation of time delays between photoplethysmographic signals of spatially separated regions. The probabilistic approach yields a parameter denoted as perfusion speed. We compare the perfusion speed with two parameters that assess the strength of blood pulsation (perfusion strength and signal-to-noise ratio). Preliminary results using video data with different physiological stimuli (cold pressure test, cold face test) show that all measures are influenced by those stimuli (some of them with statistical certainty). The perfusion speed turned out to be more sensitive than the other measures in some cases. However, our results also show that the intraindividual stability and interindividual comparability of all used measures remain critical points. This work demonstrates the general feasibility of employing the perfusion speed as a novel iPPG quantity. Future studies will address open points like the handling of ballistocardiographic effects and will try to deepen the understanding of the predominant physiological mechanisms and their relation to the algorithmic performance.
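
    The abstract does not describe how the pairwise time delays are estimated, so the sketch below substitutes a plain cross-correlation peak search for the paper's probabilistic approach. It only illustrates what "time delay between two regional signals" means; the function name, the lag window and the toy pulses are assumptions.

```python
def estimate_delay(signal_a, signal_b, max_lag):
    """Lag (in samples) by which signal_b trails signal_a.

    The lag is chosen to maximize the raw cross-correlation; mean removal
    and normalization are omitted to keep the sketch short.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, a in enumerate(signal_a):
            j = i + lag
            if 0 <= j < len(signal_b):
                score += a * signal_b[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Two hypothetical regional pulse signals; the second trails the first.
region_1 = [0, 0, 1, 3, 1, 0, 0, 0, 0, 0]
region_2 = [0, 0, 0, 0, 0, 1, 3, 1, 0, 0]
print(estimate_delay(region_1, region_2, max_lag=5))
```

    Dividing the distance between two regions by such a delay (converted to seconds via the frame rate) would give a speed-like quantity, though whether that matches the paper's definition of perfusion speed is not stated in the abstract.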

  10. Temporal analysis of text data using latent variable models

    DEFF Research Database (Denmark)

    Mølgaard, Lasse Lohilahti; Larsen, Jan; Goutte, Cyril

    2009-01-01

    Detecting and tracking of temporal data is an important task in multiple applications. In this paper we study temporal text mining methods for Music Information Retrieval. We compare two ways of detecting the temporal latent semantics of a corpus extracted from Wikipedia, using a stepwise...

  11. Poisson cluster analysis of cardiac arrest incidence in Columbus, Ohio.

    Science.gov (United States)

    Warden, Craig; Cudnik, Michael T; Sasson, Comilla; Schwartz, Greg; Semple, Hugh

    2012-01-01

    Scarce resources in disease prevention and emergency medical services (EMS) need to be focused on high-risk areas of out-of-hospital cardiac arrest (OHCA). Cluster analysis using geographic information systems (GISs) was used to find these high-risk areas and test potential predictive variables. This was a retrospective cohort analysis of EMS-treated adults with OHCAs occurring in Columbus, Ohio, from April 1, 2004, through March 31, 2009. The OHCAs were aggregated to census tracts and incidence rates were calculated based on their adult populations. Poisson cluster analysis determined significant clusters of high-risk census tracts. Both census tract-level and case-level characteristics were tested for association with high-risk areas by multivariate logistic regression. A total of 2,037 eligible OHCAs occurred within the city limits during the study period. The mean incidence rate was 0.85 OHCAs/1,000 population/year. There were five significant geographic clusters with 76 high-risk census tracts out of the total of 245 census tracts. In the case-level analysis, being in a high-risk cluster was associated with a slightly younger age (-3 years, adjusted odds ratio [OR] 0.99, 95% confidence interval [CI] 0.99-1.00), not being white, non-Hispanic (OR 0.54, 95% CI 0.45-0.64), cardiac arrest occurring at home (OR 1.53, 95% CI 1.23-1.71), and not receiving bystander cardiopulmonary resuscitation (CPR) (OR 0.77, 95% CI 0.62-0.96), but with higher survival to hospital discharge (OR 1.78, 95% CI 1.30-2.46). In the census tract-level analysis, high-risk census tracts were also associated with a slightly lower average age (-0.1 years, OR 1.14, 95% CI 1.06-1.22) and a lower proportion of white, non-Hispanic patients (-0.298, OR 0.04, 95% CI 0.01-0.19), but also a lower proportion of high-school graduates (-0.184, OR 0.00, 95% CI 0.00-0.00). This analysis identified high-risk census tracts and associated census tract-level and case-level characteristics that can be used to

  12. Performance Based Clustering for Benchmarking of Container Ports: an Application of Dea and Cluster Analysis Technique

    Directory of Open Access Journals (Sweden)

    Jie Wu

    2010-12-01

    The operational performance of container ports has received increasing attention in both academic and practitioner circles, and the performance evaluation and process improvement of container ports have been the focus of several studies. In this paper, Data Envelopment Analysis (DEA), an effective tool for relative efficiency assessment, is used to measure the performance of, and to benchmark, 77 world container ports in 2007. The approach considers four inputs (Capacity of Cargo Handling Machines, Number of Berths, Terminal Area and Storage Capacity) and a single output (Container Throughput). The efficiency scores are analyzed, and a unique ordering of the ports based on average cross efficiency is provided. Cluster analysis is also used to select more appropriate targets for poorly performing ports to use as benchmarks.

  13. Pharmacokinetic analysis and k-means clustering of DCEMR images for radiotherapy outcome prediction of advanced cervical cancers.

    Science.gov (United States)

    Andersen, Erlend K F; Kristensen, Gunnar B; Lyng, Heidi; Malinen, Eirik

    2011-08-01

    Pharmacokinetic analysis of dynamic contrast enhanced magnetic resonance images (DCEMRI) allows for quantitative characterization of vascular properties of tumors. The aim of this study is twofold, first to determine if tumor regions with similar vascularization could be labeled by clustering methods, second to determine if the identified regions can be associated with local cancer relapse. Eighty-one patients with locally advanced cervical cancer treated with chemoradiotherapy underwent DCEMRI with Gd-DTPA prior to external beam radiotherapy. The median follow-up time after treatment was four years, in which nine patients had primary tumor relapse. By fitting a pharmacokinetic two-compartment model function to the temporal contrast enhancement in the tumor, two pharmacokinetic parameters, K^trans and v_e, were estimated voxel by voxel from the DCEMR-images. Intratumoral regions with similar vascularization were identified by k-means clustering of the two pharmacokinetic parameter estimates over all patients. The volume fraction of each cluster was used to evaluate the prognostic value of the clusters. Three clusters provided a sufficient reduction of the cluster variance to label different vascular properties within the tumors. The corresponding median volume fraction of each cluster was 38%, 46% and 10%. The second cluster was significantly associated with primary tumor control in a log-rank survival test (p-value: 0.042), showing a decreased risk of treatment failure for patients with high volume fraction of voxels. Intratumoral regions showing similar vascular properties could successfully be labeled in three distinct clusters and the volume fraction of one cluster region was associated with primary tumor control.
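
    The clustering step described here, k-means on two parameters per voxel followed by a per-cluster volume fraction, can be sketched as follows. This is a generic Lloyd's algorithm with hand-picked starting centers, not the authors' implementation, and the voxel values are invented purely for illustration.

```python
import math


def nearest(point, centers):
    """Index of the center closest to the point (Euclidean distance)."""
    return min(range(len(centers)), key=lambda k: math.dist(point, centers[k]))


def kmeans(points, initial_centers, max_iter=100):
    """Lloyd's algorithm: alternate assignment and center update until stable."""
    centers = [tuple(c) for c in initial_centers]
    for _ in range(max_iter):
        groups = [[] for _ in centers]
        for p in points:
            groups[nearest(p, centers)].append(p)
        # New center = mean of the members; an empty cluster keeps its center.
        updated = [
            tuple(sum(axis) / len(g) for axis in zip(*g)) if g else centers[k]
            for k, g in enumerate(groups)
        ]
        if updated == centers:
            break
        centers = updated
    labels = [nearest(p, centers) for p in points]
    return centers, labels


def volume_fractions(labels, k):
    """Share of voxels assigned to each of the k clusters."""
    return [labels.count(j) / len(labels) for j in range(k)]


# Hypothetical (K^trans, v_e) pairs for five voxels of one tumor.
voxels = [(0.10, 0.20), (0.12, 0.22), (0.50, 0.50), (0.52, 0.48), (0.90, 0.10)]
centers, labels = kmeans(voxels, [(0.1, 0.2), (0.5, 0.5), (0.9, 0.1)])
print(labels, volume_fractions(labels, 3))
```

    In practice the starting centers would be chosen randomly or by a seeding heuristic and the run repeated, since k-means only finds a local optimum.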

  14. Pharmacokinetic analysis and k-means clustering of DCEMR images for radiotherapy outcome prediction of advanced cervical cancers

    International Nuclear Information System (INIS)

    Andersen, Erlend K. F.; Kristensen, Gunnar B.; Lyng, Heidi; Malinen, Eirik

    2011-01-01

    Introduction. Pharmacokinetic analysis of dynamic contrast enhanced magnetic resonance images (DCEMRI) allows for quantitative characterization of vascular properties of tumors. The aim of this study is twofold, first to determine if tumor regions with similar vascularization could be labeled by clustering methods, second to determine if the identified regions can be associated with local cancer relapse. Materials and methods. Eighty-one patients with locally advanced cervical cancer treated with chemoradiotherapy underwent DCEMRI with Gd-DTPA prior to external beam radiotherapy. The median follow-up time after treatment was four years, in which nine patients had primary tumor relapse. By fitting a pharmacokinetic two-compartment model function to the temporal contrast enhancement in the tumor, two pharmacokinetic parameters, K^trans and v_e, were estimated voxel by voxel from the DCEMR-images. Intratumoral regions with similar vascularization were identified by k-means clustering of the two pharmacokinetic parameter estimates over all patients. The volume fraction of each cluster was used to evaluate the prognostic value of the clusters. Results. Three clusters provided a sufficient reduction of the cluster variance to label different vascular properties within the tumors. The corresponding median volume fraction of each cluster was 38%, 46% and 10%. The second cluster was significantly associated with primary tumor control in a log-rank survival test (p-value: 0.042), showing a decreased risk of treatment failure for patients with high volume fraction of voxels. Conclusions. Intratumoral regions showing similar vascular properties could successfully be labeled in three distinct clusters and the volume fraction of one cluster region was associated with primary tumor control

  15. Functional Principal Component Analysis and Randomized Sparse Clustering Algorithm for Medical Image Analysis

    Science.gov (United States)

    Lin, Nan; Jiang, Junhai; Guo, Shicheng; Xiong, Momiao

    2015-01-01

    Due to advances in sensor technology, the growing volume of medical image data makes it possible to visualize the anatomical changes in biological tissues. As a consequence, medical images have the potential to enhance the diagnosis of disease, the prediction of clinical outcomes and the characterization of disease progression. At the same time, the growing data dimensions pose great methodological and computational challenges for the representation and selection of features in image cluster analysis. To address these challenges, we first extend functional principal component analysis (FPCA) from one dimension to two dimensions to fully capture the spatial variation of the image signals. The image signals contain a large number of redundant features which provide no additional information for clustering analysis. The widely used methods for removing the irrelevant features are sparse clustering algorithms using a lasso-type penalty to select the features. However, the accuracy of clustering using a lasso-type penalty depends on the selection of the penalty parameters and the threshold value. In practice, they are difficult to determine. Recently, randomized algorithms have received a great deal of attention in big data analysis. This paper presents a randomized algorithm for accurate feature selection in image clustering analysis. The proposed method is applied to both the liver and kidney cancer histology image data from the TCGA database. The results demonstrate that the randomized feature selection method coupled with functional principal component analysis substantially outperforms the current sparse clustering algorithms in image cluster analysis. PMID:26196383

  16. Spatial and temporal analysis of dry spells in Greece

    Science.gov (United States)

    Anagnostopoulou, Chr.; Maheras, P.; Karacostas, T.; Vafiadis, M.

    A spatio-temporal analysis of the dry spells that occur in the Greek area is carried out for an extended period of 40 years (1958-1997). The dry spells can be defined as a number of consecutive days with no rain. The number of days defines the length of the dry spells. The longest spells are identified in central (Cyclades) and the south-east Aegean Sea whereas dry spells with the minimum length are shown over the north-west of the Greek area that reflects the significance of the latitude and the topography. Negative Binomial Distribution and Markov Chains of second order have been used to fit the duration of the dry spells of different lengths. The study of the seasonal and annual distribution of the frequency of occurrence of dry spells revealed that the dry spells in Greece depict a seasonal character, while medium and long sequences are associated with the duration and hazards of drought.
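
    The definition used in this abstract, a dry spell as a run of consecutive days with no rain whose length is the number of days, translates directly into a run-length count. The sketch below assumes a list of daily rainfall totals in millimetres and a hypothetical `threshold` parameter for what counts as "no rain"; neither is specified in the abstract.

```python
def dry_spell_lengths(daily_rain_mm, threshold=0.0):
    """Lengths of all runs of consecutive days with rainfall <= threshold."""
    lengths, run = [], 0
    for amount in daily_rain_mm:
        if amount <= threshold:
            run += 1                 # the current dry spell continues
        else:
            if run:
                lengths.append(run)  # a wet day closes the spell
            run = 0
    if run:
        lengths.append(run)          # spell still open at the end of the record
    return lengths


# Hypothetical eight-day record with wet days on day 3 and day 7.
print(dry_spell_lengths([0, 0, 1.2, 0, 0, 0, 3.0, 0]))
```

    The resulting list of lengths is the kind of sample to which a duration distribution, such as the negative binomial mentioned in the abstract, could then be fitted.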

  17. Second-order analysis of structured inhomogeneous spatio-temporal point processes

    DEFF Research Database (Denmark)

    Møller, Jesper; Ghorbani, Mohammad

    Statistical methodology for spatio-temporal point processes is in its infancy. We consider second-order analysis based on pair correlation functions and K-functions for first general inhomogeneous spatio-temporal point processes and second inhomogeneous spatio-temporal Cox processes. Assuming spatio-temporal separability of the intensity function, we clarify different meanings of second-order spatio-temporal separability. One is second-order spatio-temporal independence and relates e.g. to log-Gaussian Cox processes with an additive covariance structure of the underlying spatio-temporal Gaussian process. Another concerns shot-noise Cox processes with a separable spatio-temporal covariance density. We propose diagnostic procedures for checking hypotheses of second-order spatio-temporal separability, which we apply on simulated and real data (the UK 2001 epidemic foot and mouth disease data).

  18. Aspects of second-order analysis of structured inhomogeneous spatio-temporal processes

    DEFF Research Database (Denmark)

    Møller, Jesper; Ghorbani, Mohammad

    2012-01-01

    Statistical methodology for spatio-temporal point processes is in its infancy. We consider second-order analysis based on pair correlation functions and K-functions for general inhomogeneous spatio-temporal point processes and for inhomogeneous spatio-temporal Cox processes. Assuming spatio-temporal separability of the intensity function, we clarify different meanings of second-order spatio-temporal separability. One is second-order spatio-temporal independence and relates to log-Gaussian Cox processes with an additive covariance structure of the underlying spatio-temporal Gaussian process. Another concerns shot-noise Cox processes with a separable spatio-temporal covariance density. We propose diagnostic procedures for checking hypotheses of second-order spatio-temporal separability, which we apply on simulated and real data.

  19. Temporal features of word-initial /s/+stop clusters in bilingual Mandarin-English children and monolingual English children and adults.

    Science.gov (United States)

    Yang, Jing

    2018-03-01

    This study investigated the durational features of English word-initial /s/+stop clusters produced by bilingual Mandarin (L1)-English (L2) children and monolingual English children and adults. The participants included two groups of five- to six-year-old bilingual children: low proficiency in the L2 (Bi-low) and high proficiency in the L2 (Bi-high), one group of age-matched English children, and one group of English adults. Each participant produced a list of English words containing /sp, st, sk/ at the word-initial position followed by /a, i, u/, respectively. The absolute durations of the clusters and cluster elements and the durational proportions of elements to the overall cluster were measured. The results revealed that Bi-high children behaved similarly to the English monolinguals whereas Bi-low children used a different strategy of temporal organization to coordinate the cluster components in comparison to the English monolinguals and Bi-high children. The influence of language experience and continuing development of temporal features in children were discussed.

  20. Identification and characterization of earthquake clusters: a comparative analysis for selected sequences in Italy

    Science.gov (United States)

    Peresan, Antonella; Gentili, Stefania

    2017-04-01

    Identification and statistical characterization of seismic clusters may provide useful insights about the features of seismic energy release and their relation to physical properties of the crust within a given region. Moreover, a number of studies based on spatio-temporal analysis of main-shocks occurrence require preliminary declustering of the earthquake catalogs. Since various methods, relying on different physical/statistical assumptions, may lead to diverse classifications of earthquakes into main events and related events, we aim to investigate the classification differences among different declustering techniques. Accordingly, a formal selection and comparative analysis of earthquake clusters is carried out for the most relevant earthquakes in North-Eastern Italy, as reported in the local OGS-CRS bulletins, compiled at the National Institute of Oceanography and Experimental Geophysics since 1977. The comparison is then extended to selected earthquake sequences associated with a different seismotectonic setting, namely to events that occurred in the region struck by the recent Central Italy destructive earthquakes, making use of INGV data. Various techniques, ranging from classical space-time windows methods to ad hoc manual identification of aftershocks, are applied for detection of earthquake clusters. In particular, a statistical method based on nearest-neighbor distances of events in space-time-energy domain, is considered. Results from clusters identification by the nearest-neighbor method turn out quite robust with respect to the time span of the input catalogue, as well as to minimum magnitude cutoff. The identified clusters for the largest events reported in North-Eastern Italy since 1977 are well consistent with those reported in earlier studies, which were aimed at detailed manual aftershocks identification. 
    The study shows that the data-driven approach, based on the nearest-neighbor distances, can be satisfactorily applied to decompose the seismic

  1. Quantile regression and clustering analysis of standardized precipitation index in the Tarim River Basin, Xinjiang, China

    Science.gov (United States)

    Yang, Peng; Xia, Jun; Zhang, Yongyong; Han, Jian; Wu, Xia

    2017-11-01

    Because drought is a very common and widespread natural disaster, it has attracted a great deal of academic interest. Based on 12-month time scale standardized precipitation indices (SPI12) calculated from precipitation data recorded between 1960 and 2015 at 22 weather stations in the Tarim River Basin (TRB), this study aims to identify the trends of SPI and drought duration, severity, and frequency at various quantiles and to perform cluster analysis of drought events in the TRB. The results indicated that (1) both precipitation and temperature at most stations in the TRB exhibited significant positive trends during 1960-2015; (2) multiple scales of SPIs changed significantly around 1986; (3) based on quantile regression analysis of temporal drought changes, the positive SPI slopes indicated less severe and less frequent droughts at lower quantiles, but clear variation was detected in the drought frequency; and (4) significantly different trends were found in drought frequency probably between severe droughts and drought frequency.

  2. Fractal Segmentation and Clustering Analysis for Seismic Time Slices

    Science.gov (United States)

    Ronquillo, G.; Oleschko, K.; Korvin, G.; Arizabalo, R. D.

    2002-05-01

    Fractal analysis has become part of the standard approach for quantifying texture on gray-tone or colored images. In this research we introduce a multi-stage fractal procedure to segment, classify and measure the clustering patterns on seismic time slices from a 3-D seismic survey. Five fractal classifiers (c1)-(c5) were designed to yield standardized, unbiased and precise measures of the clustering of seismic signals. The classifiers were tested on seismic time slices from the AKAL field, Cantarell Oil Complex, Mexico. The generalized lacunarity (c1), fractal signature (c2), heterogeneity (c3), rugosity of boundaries (c4) and continuity resp. tortuosity (c5) of the clusters are shown to be efficient measures of the time-space variability of seismic signals. The Local Fractal Analysis (LFA) of time slices has proved to be a powerful edge detection filter to detect and enhance linear features, like faults or buried meandering rivers. The local fractal dimensions of the time slices were also compared with the self-affinity dimensions of the corresponding parts of porosity-logs. It is speculated that the spectral dimension of the negative-amplitude parts of the time-slice yields a measure of connectivity between the formation's high-porosity zones, and correlates with overall permeability.

  3. Cluster analysis for DNA methylation profiles having a detection threshold

    Directory of Open Access Journals (Sweden)

    Siegmund Kimberly D

    2006-07-01

    Background. DNA methylation, a molecular feature used to investigate tumor heterogeneity, can be measured on many genomic regions using the MethyLight technology. Due to the combination of the underlying biology of DNA methylation and the MethyLight technology, the measurements, while being generated on a continuous scale, have a large number of 0 values. This suggests that conventional clustering methodology may not perform well on this data. Results. We compare performance of existing methodology (such as k-means) with two novel methods that explicitly allow for the preponderance of values at 0. We also consider how the ability to successfully cluster such data depends upon the number of informative genes for which methylation is measured and the correlation structure of the methylation values for those genes. We show that when data is collected for a sufficient number of genes, our models do improve clustering performance compared to methods, such as k-means, that do not explicitly respect the supposed biological realities of the situation. Conclusion. The performance of analysis methods depends upon how well the assumptions of those methods reflect the properties of the data being analyzed. Differing technologies will lead to data with differing properties, and should therefore be analyzed differently. Consequently, it is prudent to give thought to what the properties of the data are likely to be, and which analysis method might therefore be likely to best capture those properties.

  4. Monitoring Customer Satisfaction in Service Industry: A Cluster Analysis Approach

    Directory of Open Access Journals (Sweden)

    Matúš Horváth

    2012-11-01

    Full Text Available One of the key performance indicators of an organization's quality management system is customer satisfaction. The process of monitoring customer satisfaction is therefore an important part of the measuring processes of the quality management system. This paper deals with new ways to analyse and monitor customer satisfaction using data on how customers use the organisation's services and on customer leaving rates. The article uses cluster analysis to segment customers, with the aim of increasing the accuracy of the results and of the decisions based on them. The application example was created as part of a bachelor thesis.

  5. Monitoring Customer Satisfaction in Service Industry: A Cluster Analysis Approach

    Directory of Open Access Journals (Sweden)

    Matúš Horváth

    2012-10-01

    Full Text Available One of the key performance indicators of an organization's quality management system is customer satisfaction. The process of monitoring customer satisfaction is therefore an important part of the measuring processes of the quality management system. This paper deals with new ways to analyse and monitor customer satisfaction using data on how customers use the organisation's services and on customer leaving rates. The article uses cluster analysis to segment customers, with the aim of increasing the accuracy of the results and of the decisions based on them. The application example was created as part of a bachelor thesis.

  6. Market analysis of Serbia's raspberry sector and cluster development initiatives

    Directory of Open Access Journals (Sweden)

    Paraušić Vesna

    2016-01-01

    Full Text Available The authors analyze the competitive strengths and weaknesses of raspberry producers in Serbia and propose the key prerequisites on which the development of a successful cluster initiative in the Serbian raspberry sector depends. The research results indicate that Serbian raspberry growers can develop a successful cluster and keep their leading position in the global raspberry market only if several conditions are met: (a) better-organized marketing channels through the vertical and horizontal integration of all actors in the sector; (b) strengthening of specialized cooperatives for raspberry production and associations of raspberry growers, and in the future the setting up of producer organizations and associations; (c) inclusion of producers of other berries and producers of processed berries; (d) introduction of innovations, scientific knowledge, and research and development in the production, processing, packing, logistics, export of raspberries, etc. The analysis is based on a case study in the Šumadija and Western Serbia region, which is the major raspberry-producing region in Serbia.

  7. Image Registration Algorithm Based on Parallax Constraint and Clustering Analysis

    Science.gov (United States)

    Wang, Zhe; Dong, Min; Mu, Xiaomin; Wang, Song

    2018-01-01

    To resolve the problem of slow computation speed and low matching accuracy in image registration, a new image registration algorithm based on parallax constraint and clustering analysis is proposed. Firstly, the Harris corner detection algorithm is used to extract the feature points of two images. Secondly, the Normalized Cross Correlation (NCC) function is used to perform the approximate matching of feature points, giving the initial feature pairs. Then, according to the parallax constraint condition, the initial feature pairs are preprocessed by the K-means clustering algorithm, which removes feature point pairs with obvious errors from the approximate matching process. Finally, the Random Sample Consensus (RANSAC) algorithm is adopted to optimize the feature points and obtain the final feature point matching result, realizing fast and accurate image registration. The experimental results show that the proposed image registration algorithm can improve the accuracy of image matching while ensuring the real-time performance of the algorithm.
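    The approximate-matching step in the record above relies on the Normalized Cross Correlation score. The following is a minimal sketch of that score only, not the authors' implementation: the patch representation (lists of rows of gray values) and the helper names `ncc` and `best_match` are assumptions made for illustration, and the Harris, K-means and RANSAC stages are not shown.

```python
import math


def ncc(patch_a, patch_b):
    """Normalized cross-correlation of two equally sized grayscale patches.

    Patches are lists of rows of numbers (an assumed representation).
    The score lies in [-1, 1]; 1 means identical up to gain and offset.
    """
    a = [v for row in patch_a for v in row]
    b = [v for row in patch_b for v in row]
    if not a or len(a) != len(b):
        raise ValueError("patches must be non-empty and the same size")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(
        sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b)
    )
    # A flat patch has zero variance; treat it as uncorrelated.
    return num / den if den > 0 else 0.0


def best_match(patch, candidates):
    """Return (index, score) of the candidate patch with the highest NCC."""
    scores = [ncc(patch, c) for c in candidates]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]
```

    Because the score subtracts the mean and divides by the spread of each patch, it is insensitive to uniform brightness and contrast changes between the two images, which is presumably why it suits a coarse first matching pass.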

  8. k-t PCA: temporally constrained k-t BLAST reconstruction using principal component analysis

    DEFF Research Database (Denmark)

    Pedersen, Henrik; Kozerke, Sebastian; Ringgaard, Steffen

    2009-01-01

    in applications exhibiting a broad range of temporal frequencies such as free-breathing myocardial perfusion imaging. We show that temporal basis functions calculated by subjecting the training data to principal component analysis (PCA) can be used to constrain the reconstruction such that the temporal resolution...... is improved. The presented method is called k-t PCA....

  9. Spatial-Temporal Analysis of the Economic and Environmental Coordination Development Degree in Liaoning Province

    Directory of Open Access Journals (Sweden)

    Hui Wang

    2013-01-01

    Full Text Available This study selects 20 indices of economic and environmental conditions over 15 years (1996–2010) for 14 cities in Liaoning province, China. We calculate the economic score and environmental score of each city by processing 4200 data points through SPSS 16.0 and establish synthesis functions between the economy and the environment. For the time dimension, we study the temporal evolution of the economic and environmental coordination development degree. Based on Exploratory Spatial Data Analysis (ESDA) techniques and using GeoDa, we calculate Moran's index of local spatial autocorrelation and explore the spatial distribution character of this degree in Liaoning province through a LISA cluster map. In the temporal dimension, the results show that the degree of the 14 cities has been rising for 15 years, increasing year by year, which indicates that the economic and environmental coordination development condition has been improving from disorder to highly coordinated. A smaller gap between economic strength and environmental carrying capacity in Liaoning province exists, which means that economic development and environmental protection remain synchronized. In the spatial dimension, the highly coordinated cities have changed from a scattering to a concentration in the middle-south region of Liaoning province. Poorly coordinated cities are scattered in the northwestern region of Liaoning province.
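    The local Moran statistic behind a LISA cluster map can be sketched as below. This is an illustration under stated assumptions, not the GeoDa computation itself: weights are row-standardized from a hypothetical neighbor list, the function names are invented for the example, and the permutation test that GeoDa uses to decide which locations are significant is omitted.

```python
def row_standardize(neighbors, n):
    """Build an n x n weight matrix in which each row's weights sum to 1.

    `neighbors` maps a location index to the list of its neighbor indices
    (a hypothetical adjacency structure chosen for this sketch).
    """
    w = [[0.0] * n for _ in range(n)]
    for i, js in neighbors.items():
        for j in js:
            w[i][j] = 1.0 / len(js)
    return w


def local_moran(values, weights):
    """Return (I_i, quadrant) for every location.

    I_i = (z_i / m2) * sum_j w_ij * z_j, with z the deviations from the
    mean and m2 their mean square. Quadrants follow the usual LISA naming.
    """
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    m2 = sum(d * d for d in z) / n
    if m2 == 0:
        raise ValueError("values have zero variance")
    out = []
    for i in range(n):
        lag = sum(weights[i][j] * z[j] for j in range(n))  # spatial lag of z
        own = "High" if z[i] > 0 else "Low"
        nbr = "High" if lag > 0 else "Low"
        out.append((z[i] * lag / m2, own + "-" + nbr))
    return out
```

    With row-standardized weights the mean of the local values equals the global Moran's I, so the local statistics can be read as each city's contribution to the overall spatial autocorrelation.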

  10. Steady state subchannel analysis of AHWR fuel cluster

    International Nuclear Information System (INIS)

    Dasgupta, A.; Chandraker, D.K.; Vijayan, P.K.; Saha, D.

    2006-09-01

    Subchannel analysis is a technique used to predict the thermal hydraulic behavior of reactor fuel assemblies. The rod cluster is subdivided into a number of parallel interacting flow subchannels. The conservation equations are solved for each of these subchannels, taking into account subchannel interactions. Subchannel analysis of AHWR D-5 fuel cluster has been carried out to determine the variations in thermal hydraulic conditions of coolant and fuel temperatures along the length of the fuel bundle. The hottest regions within the AHWR fuel bundle have been identified. The effect of creep on the fuel performance has also been studied. MCHFR has been calculated using Jansen-Levy correlation. The calculations have been backed by sensitivity analysis for parameters whose values are not known accurately. The sensitivity analysis showed the calculations to have a very low sensitivity to these parameters. Apart from the analysis, the report also includes a brief introduction of a few subchannel codes. A brief description of the equations and solution methodology used in COBRA-IIIC and COBRA-IV-I is also given. (author)

  11. A new method of spatio-temporal topographic mapping by correlation coefficient of K-means cluster.

    Science.gov (United States)

    Li, Ling; Yao, Dezhong

    2007-01-01

    It would be of the utmost interest to map correlated sources in the working human brain by Event-Related Potentials (ERPs). This work develops a new method to map correlated neural sources based on the time courses of the scalp ERP waveforms. The ERP data are classified first by k-means cluster analysis, and then the Correlation Coefficients (CC) between the original data of each electrode channel and the time course of each cluster centroid are calculated and utilized as the mapping variable on the scalp surface. With a normalized 4-concentric-sphere head model with radius 1, the performance of the method is evaluated by simulated data. The CC between the four simulated sources (s1-s4) and the estimated cluster centroids (c1-c4), and the distances (Ds) between the scalp projection points of s1-s4 and those of c1-c4, are utilized as the evaluation indexes. Applied to four sources with two of them partially correlated (with maximum mutual CC = 0.4892), CC (Ds) between s1-s4 and c1-c4 are larger (smaller) than 0.893 (0.108) for noise levels NSRclusters located at left, right occipital and frontal. The estimated vectors of the contra-occipital area demonstrate that attention to the stimulus location produces increased amplitude of the P1 and N1 components over the contra-occipital scalp. The estimated vector in the frontal area displays two large processing negativity waves around 100 ms and 250 ms when subjects are attentive, and there is a small negative wave around 140 ms and a P300 when subjects are inattentive. The results of simulations and real Visual Evoked Potentials (VEPs) data demonstrate the validity of the method in mapping correlated sources. This method may be an objective, heuristic and important tool to study the properties of cerebral neural networks in cognitive and clinical neurosciences.
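    The two-stage idea in the record above (k-means first, then a correlation coefficient per channel and centroid) can be sketched as follows. This is a rough illustration, not the published method: it assumes that the objects being clustered are the channel time courses, uses a plain Lloyd-style k-means with a deterministic start, and all function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den if den > 0 else 0.0


def kmeans(series, k, iters=50):
    """Plain k-means on time courses.

    Assumption for reproducibility: the first k series seed the centroids.
    """
    centroids = [list(s) for s in series[:k]]
    labels = [0] * len(series)
    for _ in range(iters):
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(s, centroids[c])))
            for s in series
        ]
        updated = []
        for c in range(k):
            members = [s for s, lab in zip(series, labels) if lab == c]
            if members:
                updated.append([sum(col) / len(members) for col in zip(*members)])
            else:
                updated.append(centroids[c])  # keep an empty cluster's centroid
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


def correlation_map(series, centroids):
    """cc[i][c] is the correlation of channel i with centroid time course c."""
    return [[pearson(s, c) for c in centroids] for s in series]
```

    Each column of the resulting matrix holds one value per electrode, so it can be drawn as a scalp map for the corresponding cluster.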

  12. [The hierarchical clustering analysis of hyperspectral image based on probabilistic latent semantic analysis].

    Science.gov (United States)

    Yi, Wen-Bin; Shen, Li; Qi, Yin-Feng; Tang, Hong

    2011-09-01

    The paper introduces Probabilistic Latent Semantic Analysis (PLSA) to image clustering and proposes an effective clustering algorithm for hyperspectral images that uses the semantic information from PLSA. Firstly, the ISODATA algorithm is used to obtain the initial clustering result of the hyperspectral image, and the clusters of the initial clustering result are considered as the visual words of the PLSA. Secondly, an object-oriented image segmentation algorithm is used to partition the hyperspectral image, and segments with relatively pure pixels are regarded as documents in PLSA. Thirdly, a variety of identification methods that can estimate the best number of cluster centers are combined to get the number of latent semantic topics. Then the conditional distributions of visual words in topics and the mixtures of topics in different documents are estimated by using PLSA. Finally, the conditional probabilities of the latent semantic topics are distinguished using a statistical pattern recognition method, the topic type for each visual word in each document is given, and the clustering result of the hyperspectral image is then obtained. Experimental results show that the clusters of the proposed algorithm are better than those of K-MEANS and ISODATA in terms of object-oriented property and that the clustering result is closer to the real spatial distribution of the surface.

  13. Segmentation of Residential Gas Consumers Using Clustering Analysis

    Directory of Open Access Journals (Sweden)

    Marta P. Fernandes

    2017-12-01

    Full Text Available The growing environmental concerns and liberalization of energy markets have resulted in an increased competition between utilities and a strong focus on efficiency. To develop new energy efficiency measures and optimize operations, utilities seek new market-related insights and customer engagement strategies. This paper proposes a clustering-based methodology to define the segmentation of residential gas consumers. The segments of gas consumers are obtained through a detailed clustering analysis using smart metering data. Insights are derived from the segmentation, where the segments result from the clustering process and are characterized based on the consumption profiles, as well as according to information regarding consumers’ socio-economic and household key features. The study is based on a sample of approximately one thousand households over one year. The representative load profiles of consumers are essentially characterized by two evident consumption peaks, one in the morning and the other in the evening, and an off-peak consumption. Significant insights can be derived from this methodology regarding typical consumption curves of the different segments of consumers in the population. This knowledge can assist energy utilities and policy makers in the development of consumer engagement strategies, demand forecasting tools and in the design of more sophisticated tariff systems.

  14. IGSA: Individual Gene Sets Analysis, including Enrichment and Clustering.

    Science.gov (United States)

    Wu, Lingxiang; Chen, Xiujie; Zhang, Denan; Zhang, Wubing; Liu, Lei; Ma, Hongzhe; Yang, Jingbo; Xie, Hongbo; Liu, Bo; Jin, Qing

    2016-01-01

    Analysis of gene sets has been widely applied in various high-throughput biological studies. One weakness of the traditional methods is that they neglect the heterogeneity of gene expression across samples, which may lead to the omission of some specific and important gene sets. It is also difficult for them to reflect the severity of disease and provide expression profiles of gene sets for individuals. We developed an application software called IGSA that leverages a powerful analytical capacity in gene set enrichment and sample clustering. IGSA calculates gene set expression scores for each sample and takes an accumulating clustering strategy to let the samples gather into the set according to the progress of disease from mild to severe. We focus on gastric, pancreatic and ovarian cancer data sets for the performance of IGSA. We also compared the results of IGSA in KEGG pathway enrichment with David, GSEA, SPIA and ssGSEA, and analyzed the results of IGSA clustering and different similarity measurement methods. Notably, IGSA proved to be more sensitive and specific in finding significant pathways, and can indicate related changes in pathways with the severity of disease. In addition, IGSA provides a significant gene set profile for each sample.

  15. Analysis of Learning Development With Sugeno Fuzzy Logic And Clustering

    Directory of Open Access Journals (Sweden)

    Maulana Erwin Saputra

    2017-06-01

    Full Text Available This paper analyzes the factors that affect student achievement, which vary from school to school. Students are central to the goals of a successful educational organization, and their emotions and behavior influence their learning performance. Fuzzy logic can be applied in many fields, as can clustering for grouping, including the analysis of learning development. Students are assessed on the basis of the symptoms they exhibit. This research uses fuzzy logic and clustering. Fuzzy logic deals with uncertainty; its advantage is that it can reason in linguistic terms, so its design does not require complicated mathematical equations. The clustering method is K-means, in which the data are partitioned into k groups (k = 1, 2, 3, ...), in order to find the optimal number of performance groups. In the study, questionnaire responses entered into MATLAB produce values that are used to generate graphs. This makes it easier for the school to review student performance in the learning process against defined criteria, and the results obtained from the system support the decisions the school needs to make.

  16. Visualizing dynamical neural assemblies with a fuzzy synchronization clustering analysis.

    Science.gov (United States)

    Zhou, Shu; Wu, Yan; Dos Santos, Claudia C

    2009-12-01

    Phase synchrony has been proposed as a possible communication mechanism between cerebral regions. The participation index method (PIM) may be used to investigate integrating structures within an oscillatory network, based on the eigenvalue decomposition of matrix of bivariate synchronization indices. However, eigenvector orthogonality between clusters may result in categorization difficulties for hub oscillators and pseudoclustering phenomenon. Here, we propose a method of fuzzy synchronization clustering analysis (FSCA) to avoid the constraint of orthogonality by combining the fuzzy c-means algorithm with the phase-locking value. Following mathematical derivation, we cross-validated the FSCA and the PIM using the same multichannel phase time series of event-related EEG from a subject performing a working memory task. Both clustering methods produced consistent findings for the qualitatively salient configuration of the original network-illustrated here by a visualization technique. In contrast to PIM, use of common virtual oscillatory centroids enabled the FSCA to reveal multiple dynamical neural assemblies as well as the unitary phase information within each assembly.
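    The phase-locking value that the record above combines with fuzzy c-means is a standard bivariate synchronization index: the magnitude of the mean unit phasor of the phase difference. A minimal sketch follows, assuming the instantaneous phases (in radians) have already been extracted for each channel; the function names are illustrative, and the fuzzy clustering stage itself is not shown.

```python
import cmath


def phase_locking_value(phases_a, phases_b):
    """PLV of two phase series: |mean(exp(i * (phi_a - phi_b)))|.

    Returns a value in [0, 1]; 1 means a constant phase difference.
    """
    if not phases_a or len(phases_a) != len(phases_b):
        raise ValueError("phase series must be non-empty and equally long")
    total = sum(cmath.exp(1j * (a - b)) for a, b in zip(phases_a, phases_b))
    return abs(total) / len(phases_a)


def plv_matrix(channels):
    """Symmetric matrix of pairwise PLVs for a list of phase series."""
    n = len(channels)
    m = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            m[i][j] = m[j][i] = phase_locking_value(channels[i], channels[j])
    return m
```

    A matrix of this kind is the sort of bivariate synchronization input that both the participation index method and a fuzzy clustering approach would start from.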

  17. Clustering the lexicon in the brain: a meta-analysis of the neurofunctional evidence on noun and verb processing.

    Science.gov (United States)

    Crepaldi, Davide; Berlingeri, Manuela; Cattinelli, Isabella; Borghese, Nunzio A; Luzzatti, Claudio; Paulesu, Eraldo

    2013-01-01

    Although it is widely accepted that nouns and verbs are functionally independent linguistic entities, it is less clear whether their processing recruits different brain areas. This issue is particularly relevant for those theories of lexical semantics (and, more in general, of cognition) that suggest the embodiment of abstract concepts, i.e., based strongly on perceptual and motoric representations. This paper presents a formal meta-analysis of the neuroimaging evidence on noun and verb processing in order to address this dichotomy more effectively at the anatomical level. We used a hierarchical clustering algorithm that grouped fMRI/PET activation peaks solely on the basis of spatial proximity. Cluster specificity for grammatical class was then tested on the basis of the noun-verb distribution of the activation peaks included in each cluster. Thirty-two clusters were identified: three were associated with nouns across different tasks (in the right inferior temporal gyrus, the left angular gyrus, and the left inferior parietal gyrus); one with verbs across different tasks (in the posterior part of the right middle temporal gyrus); and three showed verb specificity in some tasks and noun specificity in others (in the left and right inferior frontal gyrus and the left insula). These results do not support the popular tenets that verb processing is predominantly based in the left frontal cortex and noun processing relies specifically on temporal regions; nor do they support the idea that verb lexical-semantic representations are heavily based on embodied motoric information. Our findings suggest instead that the cerebral circuits deputed to noun and verb processing lie in close spatial proximity in a wide network including frontal, parietal, and temporal regions. The data also indicate a predominant-but not exclusive-left lateralization of the network.

  18. Persistent solar signatures in cloud cover: spatial and temporal analysis

    International Nuclear Information System (INIS)

    Voiculescu, M; Usoskin, I

    2012-01-01

    A consensus regarding the impact of solar variability on cloud cover is far from being reached. Moreover, the impact of cloud cover on climate is among the least understood of all climate components. This motivated us to analyze the persistence of solar signals in cloud cover for the time interval 1984–2009, covering two full solar cycles. A spatial and temporal investigation of the response of low, middle and high cloud data to cosmic ray induced ionization (CRII) and UV irradiance (UVI) is performed in terms of coherence analysis of the two signals. For some key geographical regions the response of clouds to UVI and CRII is persistent over the entire time interval, indicating a real link. In other regions, however, the relation is not consistent, being intermittent or out of phase, suggesting that some correlations are spurious. The constant in-phase or anti-phase relationship between clouds and solar proxies over some regions, especially for low clouds with UVI and CRII, middle clouds with UVI and high clouds with CRII, definitely requires more study. Our results show that solar signatures in cloud cover persist in some key climate-defining regions for the entire time period and support the idea that, if existing, solar effects are not visible at the global level and any analysis of solar effects on cloud cover (and, consequently, on climate) should be done at the regional level. (letter)

  19. Using cluster analysis to examine dietary patterns: nutrient intakes, gender, and weight status differ across food pattern clusters.

    Science.gov (United States)

    Wirfält, A K; Jeffery, R W

    1997-03-01

    This study explored the usefulness of cluster analysis in identifying food choice patterns of three groups of adults in relation to their energy intake. Food frequency data were converted to percentage of total energy from 38 food groups and entered into a cluster analysis procedure. Subjects in the emerging food group patterns were compared in terms of weight status, demographics, and the nutrition composition of their usual diet. Data were collected as part of three studies in two US metropolitan areas using identical protocols. Participants were university employees (103 women and 99 men) who volunteered for a reliability study of health behavior questionnaires and moderately obese volunteers (223 women and 101 men) to two weight-loss studies who were recruited by newspaper advertisements. Subjects were clustered according to food energy sources using the FASTCLUS procedure in the Statistical Analysis System. One-way analysis of variance and chi-square analysis were then performed to compare the weight status, nutrient intakes, and demographics of the food patterns. Six food pattern clusters were identified. Subjects in the two clusters associated with high consumption of pastry and meat had significantly higher fat intakes (P = .0001). Subjects in two other clusters, those associated with high intake of skim milk and a broad distribution of energy sources, had significantly higher micronutrient levels (P = .0001). Body mass index and the distribution of gender were also significantly different across clusters. The success of cluster analysis in identifying dietary exposure categories with unique demographic and nutritional correlates suggests that the approach may be useful in epidemiologic studies that examine conditions such as obesity, and in the design of nutrition interventions.

  20. Feasibility Study of Parallel Finite Element Analysis on Cluster-of-Clusters

    Science.gov (United States)

    Muraoka, Masae; Okuda, Hiroshi

    With the rapid growth of WAN infrastructure and the development of Grid middleware, connecting cluster machines over a wide-area network has become a realistic and attractive way to execute computation-demanding applications. Many existing parallel finite element (FE) applications have, however, been designed and developed with a single computing resource in mind, since such applications require frequent synchronization and communication among processes. There have been few FE applications that can exploit the distributed environment so far. In this study, we explore the feasibility of FE applications on the cluster-of-clusters. First, we classify FE applications into two types, tightly coupled applications (TCA) and loosely coupled applications (LCA), based on their communication pattern. A prototype of each application is implemented on the cluster-of-clusters. We perform numerical experiments executing TCA and LCA on both the cluster-of-clusters and a single cluster. Through these experiments, by comparing the performance and communication cost in each case, we evaluate the feasibility of FEA on the cluster-of-clusters.

  1. Compositional Temporal Analysis Model for Incremental Hard Real-Time System Design

    NARCIS (Netherlands)

    Hausmans, J.P.H.M.; Geuns, S.J.; Wiggers, M.H.; Bekooij, Marco Jan Gerrit

    2012-01-01

    The incremental design and analysis of parallel hard real-time stream processing applications is hampered by the lack of an intuitive compositional temporal analysis model that supports arbitrary cyclic dependencies between tasks. This paper introduces a temporal analysis model for hard real-time

  2. Temporal Expectation and Information Processing: A Model-Based Analysis

    Science.gov (United States)

    Jepma, Marieke; Wagenmakers, Eric-Jan; Nieuwenhuis, Sander

    2012-01-01

    People are able to use temporal cues to anticipate the timing of an event, enabling them to process that event more efficiently. We conducted two experiments, using the fixed-foreperiod paradigm (Experiment 1) and the temporal-cueing paradigm (Experiment 2), to assess which components of information processing are speeded when subjects use such…

  3. SPATIO-TEMPORAL CLUSTERING OF MOVEMENT DATA: AN APPLICATION TO TRAJECTORIES GENERATED BY HUMAN-COMPUTER INTERACTION

    Directory of Open Access Journals (Sweden)

    G. McArdle

    2012-07-01

    Full Text Available Advances in ubiquitous positioning technologies and their increasing availability in mobile devices have generated large volumes of movement data. Analysing these datasets is challenging. While data mining techniques can be applied to this data, knowledge of the underlying spatial region can assist in interpreting the data. We have developed a geovisual analysis tool for studying movement data. In addition to interactive visualisations, the tool has features for analysing movement trajectories in terms of their spatial and temporal similarity. The focus in this paper is on mouse trajectories of users interacting with web maps. The results obtained from a user trial can be used as a starting point to determine which parts of a mouse trajectory can assist personalisation of spatial web maps.

  4. Minimum Information Loss Cluster Analysis for Cathegorical Data

    Czech Academy of Sciences Publication Activity Database

    Grim, Jiří; Hora, Jan

    2007-01-01

    Vol. 2007, No. 4571 (2007), pp. 233-247. ISSN 0302-9743. [International Conference on Machine Learning and Data Mining MLDM 2007 /5./. Leipzig, 18.07.2007-20.07.2007]. R&D Projects: GA MŠk 1M0572; GA ČR GA102/07/1594. Grant - others: GA MŠk(CZ) 2C06019. Institutional research plan: CEZ:AV0Z10750506. Keywords: Cluster Analysis * Cathegorical Data * EM algorithm. Subject RIV: BD - Theory of Information. Impact factor: 0.402, year: 2005

  5. Analysis of clusterization and networking processes in developing intermodal transportation

    Directory of Open Access Journals (Sweden)

    Sinkevičius Gintaras

    2016-06-01

    Full Text Available Analysis of the processes of clusterization and networking draws attention to the necessity of integrating railway transport into the intermodal or multimodal transport chain. One of the most widespread methods of combined transport is the interoperability of railway and road transport. The objective is to create an uninterrupted transport chain by combining several modes of transport. The aim is to save energy resources and to form a transport system that is effective, competitive, attractive to the client, safe and environmentally friendly.

  6. A cluster analysis on road traffic accidents using genetic algorithms

    Science.gov (United States)

    Saharan, Sabariah; Baragona, Roberto

    2017-04-01

    The analysis of road traffic accidents is increasingly important because of the cost of accidents and public road safety. The availability of large data sets makes the study of factors that affect the frequency and severity of accidents viable. However, the data are often highly unbalanced and overlapped. We deal with the data set of road traffic accidents recorded in Christchurch, New Zealand, from 2000 to 2009, with a total of 26440 accidents. The data are binary, and there are 50 road traffic accident factors with four levels of severity. We used a genetic algorithm for the analysis because we are in the presence of a large unbalanced data set and standard clustering such as the k-means algorithm may not be suitable for the task. The genetic algorithm-based clustering for unknown K (GCUK) has been used to identify the factors associated with accidents of different levels of severity. The results provided us with an interesting insight into the relationship between factors and accident severity level and suggest that the two main factors that contribute to fatal accidents are "Speed greater than 60 km h" and "Did not see other people until it was too late". A comparison with the k-means algorithm and independent component analysis is performed to validate the results.

  7. Transcriptomic analysis of the temporal host response to skin infestation with the ectoparasitic mite Psoroptes ovis

    Directory of Open Access Journals (Sweden)

    McNeilly Tom N

    2010-11-01

    Full Text Available Abstract Background Infestation of ovine skin with the ectoparasitic mite Psoroptes ovis results in a rapid cutaneous immune response, leading to the crusted skin lesions characteristic of sheep scab. Little is known regarding the mechanisms by which such a profound inflammatory response is instigated, and to identify novel vaccine and drug targets a better understanding of the host-parasite relationship is essential. The main objective of this study was to perform a combined network and pathway analysis of the in vivo skin response to infestation with P. ovis to gain a clearer understanding of the mechanisms and signalling pathways involved. Results Infestation with P. ovis resulted in differential expression of 1,552 genes over a 24 hour time course. Clustering by peak gene expression enabled classification of genes into temporally related groupings. Network and pathway analysis of clusters identified key signalling pathways involved in the host response to infestation. The analysis implicated a number of genes with roles in allergy and inflammation, including pro-inflammatory cytokines (IL1A, IL1B, IL6, IL8 and TNF) and factors involved in immune cell activation and recruitment (SELE, SELL, SELP, ICAM1, CSF2, CSF3, CCL2 and CXCL2). The analysis also highlighted the influence of the transcription factors NF-kB and AP-1 in the early pro-inflammatory response, and demonstrated a bias towards a Th2 type immune response. Conclusions This study has provided novel insights into the signalling mechanisms leading to the development of a pro-inflammatory response in sheep scab, whilst providing crucial information regarding the nature of mite factors that may trigger this response. It has enabled the elucidation of the temporal patterns by which the immune system is regulated following exposure to P. ovis, providing novel insights into the mechanisms underlying lesion development. This study has improved our existing knowledge of the host response to P
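    Grouping genes by the time of their peak expression, as described in the record above, can be illustrated with a short sketch. The details here are assumptions rather than the study's procedure: profiles are taken to be log fold changes sampled at a shared list of time points, the peak is the largest value in the profile, and the gene names and numbers in the usage lines are placeholders, not data from the paper.

```python
from collections import defaultdict


def cluster_by_peak(expression, time_points):
    """Group genes by the time point at which their profile is highest.

    `expression` maps a gene name to a list of values, one per time point.
    Returns {time_point: sorted list of genes peaking at that time}.
    """
    clusters = defaultdict(list)
    for gene, profile in expression.items():
        if len(profile) != len(time_points):
            raise ValueError("profile length mismatch for " + gene)
        # Index of the largest value; ties resolve to the earliest time point.
        peak_index = max(range(len(profile)), key=profile.__getitem__)
        clusters[time_points[peak_index]].append(gene)
    return {t: sorted(genes) for t, genes in sorted(clusters.items())}


# Placeholder profiles at hypothetical sampling times (hours).
hours = [1, 3, 6, 12, 24]
profiles = {
    "geneA": [2.1, 0.9, 0.4, 0.2, 0.1],
    "geneB": [0.2, 0.5, 1.8, 0.7, 0.3],
    "geneC": [1.9, 1.0, 0.3, 0.1, 0.0],
}
groups = cluster_by_peak(profiles, hours)
```

    Each resulting group can then be passed to network or pathway analysis separately, which matches the order of steps the abstract describes.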

  8. A Spatial and Temporal Analysis of Upward Triggered Lightning

    Science.gov (United States)

    Ballweber, A. J.

    2013-12-01

    Alana Ballweber, John H. Helsdon Jr., and Tom A. Warner, South Dakota School of Mines and Technology. Ten tall communication towers lining the ridge in Rapid City, South Dakota provide a unique opportunity to study the phenomenon of lightning-triggered upward lightning. The Upward Lightning Triggering Study (UPLIGHTS) seeks to determine if upward positive leaders are triggered from these towers by: (1) the approach of horizontally propagating negative stepped leaders associated with either intracloud development or following a positive cloud-to-ground (+CG) return stroke, and/or (2) a +CG return stroke as it propagates through a previously formed leader network near the towers. As part of the UPLIGHTS research, two separate lightning mapping devices were used to aid in a 3D re-creation of the triggering flash: a 3D digital interferometer and a Lightning Mapping Array. Through the use of these two devices, we present findings founded on the analysis of data collected from these assets during the 2013 storm season. Specifically, we quantify the spatial and temporal relationship of the triggering flash leader activity relative to the tall objects when upward leaders develop and when upward leaders fail to develop. Furthermore, the lightning mapping devices were correlated with high-speed optical and electrical field observations to provide further insight as to why certain flashes trigger upward lightning from tall structures and others do not.

  9. Spatio-Temporal Human Grip Force Analysis via Sensor Arrays

    Directory of Open Access Journals (Sweden)

    Florian P. Kolb

    2009-08-01

    Full Text Available This study describes a technique for measuring human grip forces exerted on a cylindrical object via a sensor array. Standardised resistor-based pressure sensor arrays for industrial and medical applications have been available for some time. We used a special 20 mm diameter grip rod that subjects could either move actively with their fingers in the horizontal direction or exert reactive forces against opposing forces generated in the rod by a linear motor. The sensor array film was attached to the rod by adhesive tape and covered approximately 45 cm² of the rod surface. The sensor density was 4/cm², with each sensor having a force resolution of 0.1 N. A scan across all sensors resulted in a corresponding frame containing force values at a frame repetition rate of 150/s. The force value of a given sensor was interpreted as a pixel value, resulting in a false-colour image. Based on remote-sensing image analysis, an algorithm was developed to distinguish significant force-representing pixels from those affected by noise. This allowed tracking of the position of identified fingers in subsequent frames such that spatio-temporal grip force profiles for individual fingers could be derived. Moreover, the algorithm allowed simultaneous measurement of forces exerted without any constraints on the number of fingers or on the position of the fingers. The system is thus well suited for basic and clinical research in human physiology as well as for studies in psychophysics.

  10. Multivariate analysis of spatial-temporal scales in melanoma prevalence.

    Science.gov (United States)

    Valachovic, Edward; Zurbenko, Igor

    2017-07-01

    Melanoma is a particularly deadly form of skin cancer arising from diverse biological and physical origins, making the characterization and quantification of relationships with recognized risk factors very complex. Melanoma has known associations with ultraviolet light exposure. Natural variations in solar electromagnetic irradiation, length of exposure, and intensity operate on different and therefore uncorrelated time scale frequencies. It is necessary to separate and investigate the principal components, such as the annual and solar cycle components, free from confounding influences. Kolmogorov-Zurbenko spatial filters applied to melanoma prevalence and environmental factors affecting solar irradiation exposure are able to identify and separate the independent space and time scale components of melanoma. Multidimensional analysis in space and time produces significantly improved model fit of what is in effect a linear regression of maps, or motion picture, in different time scales between melanoma rates and prominent factors. The resulting multivariate model coefficients of influence for each unique spatial-temporal melanoma component help quantify the relationships and are valuable to future research and prevention.
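
    The record above relies on Kolmogorov-Zurbenko (KZ) filters to separate time-scale components. As a rough illustration only, the sketch below shows the one-dimensional KZ idea: a centred moving average applied repeatedly. The function names, the edge handling (averaging over whatever samples are available) and the omission of the spatial dimension are assumptions made for this example, not details taken from the paper.

```python
def moving_average(values, window):
    """Centred moving average; near the edges, average the samples that exist."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        segment = values[lo:hi]
        smoothed.append(sum(segment) / len(segment))
    return smoothed


def kz_filter(values, window, iterations):
    """KZ(window, iterations): apply the moving average `iterations` times."""
    result = list(values)
    for _ in range(iterations):
        result = moving_average(result, window)
    return result
```

    A longer window or more iterations suppresses shorter time scales more strongly, so subtracting a smoothed series from the original is one plausible way to isolate a faster component such as an annual cycle.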

  11. MULTI-TEMPORAL ANALYSIS OF WWII RECONNAISSANCE PHOTOS

    Directory of Open Access Journals (Sweden)

    P. Meixner

    2016-06-01

    Full Text Available There are millions of aerial photographs from the period of the Second World War available in the Allied archives, obtained by aerial photo reconnaissance, covering most of today’s European countries. They span the time from 1938 until the end of the war and even beyond. Photo reconnaissance provided intelligence information for the Allied headquarters and accompanied the bombing offensive against the German homeland and the occupied territories. One of the initial principal targets in Bohemia was the synthetic fuel works STW AG (Sudetenländische Treibstoffwerke AG) in Zaluzi (formerly Maltheuern) near Most (formerly Brüx), Czech Republic. The STW AG synthetic fuel plant was not only subject to bombing raids, but also to quite intensive photo reconnaissance, long before the start of the bombing campaign. With a multi-temporal analysis of the available imagery from international archives we will demonstrate the factory build-up during 1942 and 1943, the effects of the bombing raids in 1944 and the struggle to keep the plant working in the last year of the war. Furthermore, we would like to show the impact the bombings have today, in the form of potential unexploded ordnance in the adjacent area of the open cast mines.

  12. An Efficient Data Compression Model Based on Spatial Clustering and Principal Component Analysis in Wireless Sensor Networks

    Directory of Open Access Journals (Sweden)

    Yihang Yin

    2015-08-01

    Full Text Available Wireless sensor networks (WSNs) have been widely used to monitor the environment, and sensors in WSNs are usually power constrained. Because inter-node communication consumes most of the power, efficient data compression schemes are needed to reduce data transmission and prolong the lifetime of WSNs. In this paper, we propose an efficient data compression model to aggregate data, which is based on spatial clustering and principal component analysis (PCA). First, sensors with a strong temporal-spatial correlation are grouped into one cluster for further processing with a novel similarity measure metric. Next, sensor data in one cluster are aggregated in the cluster head sensor node, and an efficient adaptive strategy is proposed for the selection of the cluster head to conserve energy. Finally, the proposed model applies principal component analysis with an error bound guarantee to compress the data and retain the definite variance at the same time. Computer simulations show that the proposed model can greatly reduce communication and obtain a lower mean square error than other PCA-based algorithms.
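
    One common way to tie PCA compression to a retained-variance target is to keep the smallest number of leading components whose eigenvalues account for a chosen share of the total variance; the discarded eigenvalues then sum to the variance lost in reconstruction. The sketch below illustrates only that selection step. It assumes the covariance eigenvalues have already been computed elsewhere, and the function name and threshold are hypothetical; it is not the paper's algorithm.

```python
def components_for_variance(eigenvalues, target_fraction):
    """Smallest number of leading components whose variance share reaches the target."""
    if not 0.0 < target_fraction <= 1.0:
        raise ValueError("target_fraction must be in (0, 1]")
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    if total <= 0:
        raise ValueError("eigenvalues must have a positive sum")
    cumulative = 0.0
    for count, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total >= target_fraction:
            return count
    return len(ordered)
```

    In a cluster-head setting, only the scores on the selected components would need to be transmitted, which is where the communication saving would come from.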

  13. Systematic analysis of whistler-mode emissions below the lower hybrid frequency based on the data of the Cluster project.

    Science.gov (United States)

    Nemec, F.; Santolik, O.; Gereova, K.; Macusova, E.; Cornilleau-Wehrlin, N.

    2003-12-01

    We report results of a systematic analysis of equatorial noise below the local lower hybrid frequency. Our analysis is based on the entire data set collected by the STAFF-SA instruments on board the Cluster spacecraft during the first two years of operation (2001 - 2002). We compare intensities of equatorial noise with other whistler-mode emissions, for example with chorus or hiss. The results indicate that these emissions can play a significant role in the dynamics of the inner magnetosphere. Using the multipoint measurement we show considerable spatio-temporal variations of the wave intensity.

  14. Physicochemical properties of different corn varieties by principal components analysis and cluster analysis

    International Nuclear Information System (INIS)

    Zeng, J.; Li, G.; Sun, J.

    2013-01-01

    Principal components analysis and cluster analysis were used to investigate the properties of different corn varieties. The chemical compositions and some properties of corn flour processed by dry milling were determined. The results showed that the chemical compositions and physicochemical properties were significantly different among twenty-six corn varieties. The quality of corn flour was related to five principal components from principal component analysis, and the contribution rate of starch pasting properties was important, accounting for 48.90%. The twenty-six corn varieties could be classified into four groups by cluster analysis. The consistency between principal components analysis and cluster analysis indicated that multivariate analyses were feasible in the study of corn variety properties. (author)

  15. Eating or meeting? Cluster analysis reveals intricacies of white shark (Carcharodon carcharias) migration and offshore behavior.

    Directory of Open Access Journals (Sweden)

    Salvador J Jorgensen

    Full Text Available Elucidating how mobile ocean predators utilize the pelagic environment is vital to understanding the dynamics of oceanic species and ecosystems. Pop-up archival transmitting (PAT) tags have emerged as an important tool to describe animal migrations in oceanic environments where direct observation is not feasible. Available PAT tag data, however, are for the most part limited to geographic position, swimming depth and environmental temperature, making effective behavioral observation challenging. However, novel analysis approaches have the potential to extend the interpretive power of these limited observations. Here we developed an approach based on clustering analysis of PAT daily time-at-depth histogram records to distinguish behavioral modes in white sharks (Carcharodon carcharias). We found four dominant and distinctive behavioral clusters matching previously described behavioral patterns, including two distinctive offshore diving modes. Once validated, we mapped behavior mode occurrence in space and time. Our results demonstrate spatial, temporal and sex-based structure in the diving behavior of white sharks in the northeastern Pacific previously unrecognized, including behavioral and migratory patterns resembling those of species with lek mating systems. We discuss our findings, in combination with available life history and environmental data, and propose specific testable hypotheses to distinguish between mating and foraging in northeastern Pacific white sharks that can provide a framework for future work. Our methodology can be applied to similar datasets from other species to further define behaviors during unobservable phases.

  16. Shape Analysis of HII Regions - I. Statistical Clustering

    Science.gov (United States)

    Campbell-White, Justyn; Froebrich, Dirk; Kume, Alfred

    2018-04-01

    We present here our shape analysis method for a sample of 76 Galactic HII regions from MAGPIS 1.4 GHz data. The main goal is to determine whether physical properties and initial conditions of massive star cluster formation are linked to the shape of the regions. We outline a systematic procedure for extracting region shapes and perform hierarchical clustering on the shape data. We identified six groups that categorise HII regions by common morphologies. We confirmed the validity of these groupings by bootstrap re-sampling and the ordination technique of multidimensional scaling. We then investigated associations between physical parameters and the assigned groups. Location is mostly independent of group, with a small preference for regions of similar longitudes to share common morphologies. The shapes are homogeneously distributed across Galactocentric distance and latitude. One group contains regions that are all younger than 0.5 Myr and ionised by low- to intermediate-mass sources. Those in another group are all driven by intermediate- to high-mass sources. One group was distinctly separated from the other five and contained regions at the surface brightness detection limit for the survey. We find that our hierarchical procedure is most sensitive to the spatial sampling resolution used, which is determined for each region from its distance. We discuss how these errors can be further quantified and reduced in future work by utilising synthetic observations from numerical simulations of HII regions. We also outline how this shape analysis has further applications to other diffuse astronomical objects.

  17. Time series clustering analysis of health-promoting behavior

    Science.gov (United States)

    Yang, Chi-Ta; Hung, Yu-Shiang; Deng, Guang-Feng

    2013-10-01

    Health promotion must be emphasized to achieve the World Health Organization goal of health for all. Since the global population is aging rapidly, the ComCare elder health-promoting service was developed by the Taiwan Institute for Information Industry in 2011. Based on the Pender health promotion model, the ComCare service offers five categories of health-promoting functions to address the everyday needs of seniors: nutrition management, social support, exercise management, health responsibility, and stress management. To assess the overall ComCare service and to improve understanding of the health-promoting behavior of elders, this study analyzed health-promoting behavioral data automatically collected by the ComCare monitoring system. In the 30,638 session records collected for 249 elders from January 2012 to March 2013, behavior patterns were identified by a fuzzy c-means time series clustering algorithm combined with autocorrelation-based representation schemes. The analysis showed that time series data for elder health-promoting behavior can be classified into four different clusters. Each type reveals different health-promoting needs, frequencies, function numbers and behaviors. The data analysis result can assist policymakers, health-care providers, and experts in medicine, public health, nursing and psychology, and has been provided to the Taiwan National Health Insurance Administration to assess elder health-promoting behavior.
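
    To make the combination of an autocorrelation-based representation with fuzzy c-means concrete, here is a minimal sketch. It assumes each time series is summarised by its sample autocorrelations at lags 1..max_lag and that the resulting feature vectors are clustered with the standard fuzzy c-means updates. The function names, the fuzziness value of 2.0, the fixed iteration count and the deterministic initialisation (first points used as starting centres) are all assumptions for illustration; the study's actual settings are not given in the abstract.

```python
import math


def autocorrelation(series, max_lag):
    """Sample autocorrelation of `series` at lags 1..max_lag (max_lag < len(series))."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    if denom == 0:
        return [0.0] * max_lag  # constant series: no correlation structure
    return [
        sum((series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag)) / denom
        for lag in range(1, max_lag + 1)
    ]


def fuzzy_c_means(points, n_clusters, fuzziness=2.0, iterations=50):
    """Standard fuzzy c-means; returns (centers, memberships)."""
    # Assumption: deterministic start from the first n_clusters points.
    centers = [list(p) for p in points[:n_clusters]]
    exponent = 2.0 / (fuzziness - 1.0)
    dim = len(points[0])
    memberships = []
    for _ in range(iterations):
        # Membership update: closer centres receive larger degrees, rows sum to 1.
        memberships = []
        for p in points:
            dists = [math.dist(p, c) for c in centers]
            if any(d == 0.0 for d in dists):
                row = [1.0 if d == 0.0 else 0.0 for d in dists]
                total = sum(row)
                row = [r / total for r in row]
            else:
                row = [1.0 / sum((d_i / d_j) ** exponent for d_j in dists) for d_i in dists]
            memberships.append(row)
        # Centre update: mean of the points weighted by membership ** fuzziness.
        centers = []
        for k in range(n_clusters):
            weights = [row[k] ** fuzziness for row in memberships]
            total = sum(weights)
            centers.append(
                [sum(w * p[d] for w, p in zip(weights, points)) / total for d in range(dim)]
            )
    return centers, memberships
```

    Each series would first be mapped to a feature vector with `autocorrelation`, and the list of vectors passed to `fuzzy_c_means`; a series is then described by its degrees of membership in every cluster rather than by a single hard label.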

  18. Year clustering analysis for modelling olive flowering phenology

    Science.gov (United States)

    Oteros, J.; García-Mozo, H.; Hervás-Martínez, C.; Galán, C.

    2013-07-01

    It is now widely accepted that weather conditions occurring several months prior to the onset of flowering have a major influence on various aspects of olive reproductive phenology, including flowering intensity. Given the variable characteristics of the Mediterranean climate, we analyse its influence on the registered variations in olive flowering intensity in southern Spain, and relate them to previous climatic parameters using a year-clustering approach, as a first step towards an olive flowering phenology model adapted to different year categories. Phenological data from Cordoba province (Southern Spain) for a 30-year period (1982-2011) were analysed. Meteorological and phenological data were first subjected to both hierarchical and "K-means" clustering analysis, which yielded four year-categories. For this classification purpose, three different models were tested: (1) discriminant analysis; (2) decision-tree analysis; and (3) neural network analysis. Comparison of the results showed that the neural-networks model was the most effective, classifying four different year categories with clearly distinct weather features. Flowering-intensity models were constructed for each year category using the partial least squares regression method. These category-specific models proved to be more effective than general models. They are better suited to the variability of the Mediterranean climate, due to the different response of plants to the same environmental stimuli depending on the previous weather conditions in any given year. The present detailed analysis of the influence of weather patterns of different years on olive phenology will help us to understand the short-term effects of climate change on olive crop in the Mediterranean area that is highly affected by it.

  19. Cluster analysis of autoantibodies in 852 patients with systemic lupus erythematosus from a single center.

    Science.gov (United States)

    Artim-Esen, Bahar; Çene, Erhan; Şahinkaya, Yasemin; Ertan, Semra; Pehlivan, Özlem; Kamali, Sevil; Gül, Ahmet; Öcal, Lale; Aral, Orhan; Inanç, Murat

    2014-07-01

    Associations between autoantibodies and clinical features have been described in systemic lupus erythematosus (SLE). Herein, we aimed to define autoantibody clusters and their clinical correlations in a large cohort of patients with SLE. We analyzed 852 patients with SLE who attended our clinic. Seven autoantibodies were selected for cluster analysis: anti-DNA, anti-Sm, anti-RNP, anticardiolipin (aCL) immunoglobulin (Ig)G or IgM, lupus anticoagulant (LAC), anti-Ro, and anti-La. Two-step clustering and Kaplan-Meier survival analyses were used. Five clusters were identified. A cluster consisted of patients with only anti-dsDNA antibodies, a cluster of anti-Sm and anti-RNP, a cluster of aCL IgG/M and LAC, and a cluster of anti-Ro and anti-La antibodies. Analysis revealed 1 more cluster that consisted of patients who did not belong to any of the clusters formed by antibodies chosen for cluster analysis. Sm/RNP cluster had significantly higher incidence of pulmonary hypertension and Raynaud phenomenon. DsDNA cluster had the highest incidence of renal involvement. In the aCL/LAC cluster, there were significantly more patients with neuropsychiatric involvement, antiphospholipid syndrome, autoimmune hemolytic anemia, and thrombocytopenia. According to the Systemic Lupus International Collaborating Clinics damage index, the highest frequency of damage was in the aCL/LAC cluster. Comparison of 10 and 20 years survival showed reduced survival in the aCL/LAC cluster. This study supports the existence of autoantibody clusters with distinct clinical features in SLE and shows that forming clinical subsets according to autoantibody clusters may be useful in predicting the outcome of the disease. Autoantibody clusters in SLE may exhibit differences according to the clinical setting or population.

  20. [Typologies of Madrid's citizens (Spain) at the end-of-life: cluster analysis].

    Science.gov (United States)

    Ortiz-Gonçalves, Belén; Perea-Pérez, Bernardo; Labajo González, Elena; Albarrán Juan, Elena; Santiago-Sáez, Andrés

    2018-03-06

    To establish typologies within Madrid's citizens (Spain) with regard to end-of-life by cluster analysis. The SPAD 8 programme was implemented in a sample from a health care centre in the autonomous region of Madrid (Spain). A multiple correspondence analysis technique was used, followed by a cluster analysis to create a dendrogram. A cross-sectional study was made beforehand with the results of the questionnaire. Five clusters stand out. Cluster 1: a group who preferred not to answer numerous questions (5%). Cluster 2: in favour of receiving palliative care and euthanasia (40%). Cluster 3: would oppose assisted suicide and would not ask for spiritual assistance (15%). Cluster 4: would like to receive palliative care and assisted suicide (16%). Cluster 5: would oppose assisted suicide and would ask for spiritual assistance (24%). The following four clusters stood out. Clusters 2 and 4 would like to receive palliative care, euthanasia (2) and assisted suicide (4). Clusters 4 and 5 regularly practiced their faith and their family members did not receive palliative care. Clusters 3 and 5 would be opposed to euthanasia and assisted suicide in particular. Clusters 2, 4 and 5 had not completed an advance directive document. Clusters 2 and 3 seldom practiced their faith. This study could be taken into consideration to improve the quality of end-of-life care choices. Copyright © 2017 SESPAS. Published by Elsevier España, S.L.U. All rights reserved.

  1. A spatio-temporal nonparametric Bayesian variable selection model of fMRI data for clustering correlated time courses.

    Science.gov (United States)

    Zhang, Linlin; Guindani, Michele; Versace, Francesco; Vannucci, Marina

    2014-07-15

    In this paper we present a novel wavelet-based Bayesian nonparametric regression model for the analysis of functional magnetic resonance imaging (fMRI) data. Our goal is to provide a joint analytical framework that allows the detection of regions of the brain which exhibit neuronal activity in response to a stimulus and, simultaneously, the inference of the association, or clustering, of spatially remote voxels that exhibit fMRI time series with similar characteristics. We start by modeling the data with a hemodynamic response function (HRF) with a voxel-dependent shape parameter. We detect regions of the brain activated in response to a given stimulus by using mixture priors with a spike at zero on the coefficients of the regression model. We account for the complex spatial correlation structure of the brain by using a Markov random field (MRF) prior on the parameters guiding the selection of the activated voxels, therefore capturing correlation among nearby voxels. In order to infer association of the voxel time courses, we assume correlated errors, in particular long memory, and exploit the whitening properties of discrete wavelet transforms. Furthermore, we achieve clustering of the voxels by imposing a Dirichlet process (DP) prior on the parameters of the long memory process. For inference, we use Markov Chain Monte Carlo (MCMC) sampling techniques that combine Metropolis-Hastings schemes employed in Bayesian variable selection with sampling algorithms for nonparametric DP models. We explore the performance of the proposed model on simulated data, with both block- and event-related design, and on real fMRI data. Copyright © 2014 Elsevier Inc. All rights reserved.
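
    For readers unfamiliar with the terminology, a mixture prior with a spike at zero and an MRF prior on the selection indicators are often written in the generic form below. The notation is an assumption introduced here for illustration ($\beta_v$ the regression coefficient of voxel $v$, $\gamma_v \in \{0,1\}$ its activation indicator, $\delta_0$ a point mass at zero, $N_v$ the neighbours of $v$, and $\tau^2$, $d$, $e$ hyperparameters); the Gaussian slab in particular is a common choice, not necessarily the one used in the paper.

```latex
% Spike-and-slab prior: the coefficient is exactly zero unless the voxel is selected.
\beta_v \mid \gamma_v \;\sim\; \gamma_v \,\mathcal{N}(0, \tau^2) \;+\; (1 - \gamma_v)\,\delta_0

% Ising-type MRF prior: a voxel is more likely to be selected if its neighbours are.
P(\gamma_v \mid \gamma_k,\, k \in N_v) \;\propto\;
  \exp\!\Big( \gamma_v \Big( d + e \sum_{k \in N_v} \gamma_k \Big) \Big)
```

    Under this form, $d$ would control the overall sparsity of activation and $e > 0$ the strength with which neighbouring voxels encourage each other's selection.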

  2. A systematic analysis of sentence update detection for temporal summarization

    NARCIS (Netherlands)

    Gârbacea, C.; Kanoulas, E.; Jose, J.M.; Hauff, C.; Altıngovde, I.S.; Song, D.; Albakour, D.; Watt, S.; Tait, J.

    2017-01-01

    Temporal summarization algorithms filter large volumes of streaming documents and emit sentences that constitute salient event updates. Systems developed typically combine in an ad-hoc fashion traditional retrieval and document summarization algorithms to filter sentences inside documents. Retrieval

  3. Sensitization trajectories in childhood revealed by using a cluster analysis

    DEFF Research Database (Denmark)

    Schoos, Ann-Marie M.; Chawes, Bo L.; Melen, Erik

    2017-01-01

    BACKGROUND: Assessment of sensitization at a single time point during childhood provides limited clinical information. We hypothesized that sensitization develops as specific patterns with respect to age at debut, development over time, and involved allergens and that such patterns might be more...... biologically and clinically relevant. OBJECTIVE: We sought to explore latent patterns of sensitization during the first 6 years of life and investigate whether such patterns associate with the development of asthma, rhinitis, and eczema. METHODS: We investigated 398 children from the at-risk Copenhagen...... Prospective Studies on Asthma in Childhood 2000 (COPSAC2000) birth cohort with specific IgE against 13 common food and inhalant allergens at the ages of ½, 1½, 4, and 6 years. An unsupervised cluster analysis for 3-dimensional data (nonnegative sparse parallel factor analysis) was used to extract latent...

  4. Phenotypes Determined by Cluster Analysis in Moderate to Severe Bronchial Asthma.

    Science.gov (United States)

    Youroukova, Vania M; Dimitrova, Denitsa G; Valerieva, Anna D; Lesichkova, Spaska S; Velikova, Tsvetelina V; Ivanova-Todorova, Ekaterina I; Tumangelova-Yuzeir, Kalina D

    2017-06-01

    Bronchial asthma is a heterogeneous disease that includes various subtypes. They may share similar clinical characteristics, but probably have different pathological mechanisms. To identify phenotypes using cluster analysis in moderate to severe bronchial asthma and to compare differences in clinical, physiological, immunological and inflammatory data between the clusters. Forty adult patients with moderate to severe bronchial asthma out of exacerbation were included. All underwent clinical assessment, anthropometric measurements, skin prick testing, standard spirometry and measurement of the fraction of exhaled nitric oxide. Blood eosinophil count, serum total IgE and periostin levels were determined. A two-step cluster approach, a hierarchical clustering method and k-means analysis were used for identification of the clusters. We have identified four clusters. Cluster 1 (n=14) - late-onset, non-atopic asthma with impaired lung function, Cluster 2 (n=13) - late-onset, atopic asthma, Cluster 3 (n=6) - late-onset, aspirin sensitivity, eosinophilic asthma, and Cluster 4 (n=7) - early-onset, atopic asthma. Our study is the first in Bulgaria in which cluster analysis is applied to asthmatic patients. We identified four clusters. The variables with the greatest power for differentiation in our study were: age of asthma onset, duration of disease, atopy, smoking, blood eosinophils, nonsteroidal anti-inflammatory drug hypersensitivity, baseline FEV1/FVC and symptom severity. Our results support the concept of heterogeneity of bronchial asthma and demonstrate that cluster analysis can be a useful tool for phenotyping of the disease and a personalized approach to the treatment of patients.

  5. Determining wood chip size: image analysis and clustering methods

    Directory of Open Access Journals (Sweden)

    Paolo Febbi

    2013-09-01

    Full Text Available One of the standard methods for the determination of the size distribution of wood chips is the oscillating screen method (EN 15149-1:2010). Recent literature demonstrated how image analysis could return highly accurate measures of the dimensions defined for each individual particle, and could promote a new method depending on the geometrical shape to determine the chip size in a more accurate way. A sample of wood chips (8 litres) was sieved through horizontally oscillating sieves, using five different screen hole diameters (3.15, 8, 16, 45, 63 mm); the wood chips were sorted in decreasing size classes and the mass of all fractions was used to determine the size distribution of the particles. Since the chip shape and size influence the sieving results, Wang’s theory, which concerns the geometric forms, was considered. A cluster analysis on the shape descriptors (Fourier descriptors) and size descriptors (area, perimeter, Feret diameters, eccentricity) was applied to observe the chip distribution. The UPGMA algorithm was applied to Euclidean distances. The obtained dendrogram shows a group separation in accordance with the original three sieving fractions. A comparison has been made between the traditional sieve and clustering results. This preliminary result shows that the image analysis-based method has a high potential for the characterization of wood chip size distribution and could be further investigated. Moreover, this method could be implemented in an online detection machine for chip size characterization. An improvement of the results is expected by using supervised multivariate methods that utilize known class memberships. The main objective of future activities will be to shift the analysis from a 2-dimensional method to a 3-dimensional acquisition process.
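
    UPGMA is average-linkage agglomerative clustering: at each step the two clusters with the smallest mean pairwise distance between their members are merged, and the sequence of merges defines the dendrogram. The sketch below is a deliberately simple (cubic-time) illustration on Euclidean distances. The function name and the return format are assumptions, and real descriptor vectors would normally be standardised first.

```python
import math


def upgma(points):
    """Average-linkage clustering; returns merges as (members_a, members_b, distance)."""
    clusters = {i: [i] for i in range(len(points))}  # cluster id -> member indices
    merges = []
    next_id = len(points)

    def average_distance(a, b):
        # Unweighted mean of all pairwise Euclidean distances between two clusters.
        return sum(math.dist(points[i], points[j]) for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        ids = sorted(clusters)
        # Pick the closest pair of clusters; ties resolve to the lowest ids.
        dist, a, b = min(
            (average_distance(clusters[a], clusters[b]), a, b)
            for idx, a in enumerate(ids)
            for b in ids[idx + 1:]
        )
        merges.append((sorted(clusters[a]), sorted(clusters[b]), dist))
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        next_id += 1
    return merges
```

    Reading the merge list from the end backwards gives the top-level splits of the dendrogram, which is how a separation into a small number of groups (such as sieving fractions) could be inspected.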

  6. Integrating PROOF Analysis in Cloud and Batch Clusters

    International Nuclear Information System (INIS)

    Rodríguez-Marrero, Ana Y; Fernández-del-Castillo, Enol; López García, Álvaro; Marco de Lucas, Jesús; Matorras Weinig, Francisco; González Caballero, Isidro; Cuesta Noriega, Alberto

    2012-01-01

    High Energy Physics (HEP) analyses are becoming more complex and demanding due to the large amount of data collected by the current experiments. The Parallel ROOT Facility (PROOF) provides researchers with an interactive tool to speed up the analysis of huge volumes of data by exploiting parallel processing on both multicore machines and computing clusters. The typical PROOF deployment scenario is a permanent set of cores configured to run the PROOF daemons. However, this approach is incapable of adapting to the dynamic nature of interactive usage. Several initiatives seek to improve the use of computing resources by integrating PROOF with a batch system, such as Proof on Demand (PoD) or PROOF Cluster. These solutions are currently in production at Universidad de Oviedo and IFCA and are positively evaluated by users. Although they are able to adapt to the computing needs of users, they must comply with the specific configuration, OS and software installed at the batch nodes. Furthermore, they share the machines with other workloads, which may cause disruptions in the interactive service for users. These limitations make PROOF a typical use-case for cloud computing. In this work we take advantage of the Cloud Infrastructure at IFCA in order to provide a dynamic PROOF environment where users can control the software configuration of the machines. The Proof Analysis Framework (PAF) facilitates the development of new analyses and offers transparent access to PROOF resources. Several performance measurements are presented for the different scenarios (PoD, SGE and Cloud), showing a speed improvement closely correlated with the number of cores used.

  7. CLUSTER ANALYSIS OF NATURAL DISASTER LOSSES IN POLISH AGRICULTURE

    Directory of Open Access Journals (Sweden)

    Grzegorz STRUPCZEWSKI

    2015-04-01

    Full Text Available Agricultural production risk is distinctive because of the large number of hazards, the relative weakness of production entities on the market, and a degree of uncertainty greater than in industrial production. Natural disasters occur very frequently and, because only a low percentage of farmers are insured, cause damage on a scale that forces the state to organise immediate financial aid (for instance in the form of preferential natural disaster loans). This aid is usually not sufficient. On the other hand, regional diversity of the risk level does not favour the development of insurance. From the perspective of insurance companies and policymakers it is therefore highly important to investigate the spatial structure of losses in agriculture caused by natural disasters. The purpose of the research is to classify the 16 Polish voivodeships into clusters in order to show differences between them according to the level of damage to agricultural farms caused by natural disasters. The cluster analysis demonstrated that 11 voivodeships form a fairly homogeneous group in terms of the size of damage in agriculture (the value of damage to crops and the acreage of destroyed crops are the two most important factors determining cluster membership); however, the loss profile in each of the other five voivodeships is highly individual and requires separate handling in the actuarial sense. It was also shown that a high absolute value of agricultural losses in a given voivodeship does not necessarily mean high vulnerability of its agricultural farms to natural risks.

  8. The Analysis of a Simple k-Means Clustering Algorithm

    National Research Council Canada - National Science Library

    Kanungo, T; Mount, D. M; Netanyahu, N. S; Piatko, C; Silverman, R; Wu, A. Y

    2000-01-01

    .... A popular heuristic for k-means clustering is Lloyd's algorithm. In this paper, we present a simple and efficient implementation of Lloyd's k-means clustering algorithm, which we call the filtering algorithm...
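
    For orientation, plain Lloyd's algorithm alternates two steps until the centres stop moving: assign every point to its nearest centre, then move each centre to the mean of its assigned points. The sketch below is this textbook version only; it does not implement the paper's filtering algorithm. The function name, the requirement that the caller supplies the initial centres, and the choice to leave an empty cluster's centre unchanged are assumptions for the example.

```python
import math


def lloyd_kmeans(points, initial_centers, max_iterations=100):
    """Textbook Lloyd's iteration; returns (centers, assignment)."""
    centers = [tuple(c) for c in initial_centers]
    assignment = []
    for _ in range(max_iterations):
        # Assignment step: index of the nearest centre for every point.
        assignment = [
            min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
            for p in points
        ]
        # Update step: each centre moves to the mean of its members.
        new_centers = []
        for k, old in enumerate(centers):
            members = [p for p, a in zip(points, assignment) if a == k]
            if members:
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(old)  # empty cluster: keep the previous centre
        if new_centers == centers:
            break  # converged: no centre moved
        centers = new_centers
    return centers, assignment
```

    The assignment step compares every point with every centre, which is the cost that acceleration schemes for k-means aim to reduce.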

  9. A particle swarm optimized kernel-based clustering method for crop mapping from multi-temporal polarimetric L-band SAR observations

    Science.gov (United States)

    Tamiminia, Haifa; Homayouni, Saeid; McNairn, Heather; Safari, Abdoreza

    2017-06-01

    Polarimetric Synthetic Aperture Radar (PolSAR) data, thanks to their specific characteristics such as high resolution and weather and daylight independence, have become a valuable source of information for environment monitoring and management. The discrimination capability of observations acquired by these sensors can be used for land cover classification and mapping. The aim of this paper is to propose an optimized kernel-based C-means clustering algorithm for agricultural crop mapping from multi-temporal PolSAR data. Firstly, several polarimetric features are extracted from preprocessed data. These features are linear polarization intensities, and several statistical and physically based decompositions such as the Cloude-Pottier, Freeman-Durden and Yamaguchi techniques. Then, the kernelized versions of the hard and fuzzy C-means clustering algorithms are applied to these polarimetric features in order to identify crop types. The kernel function, unlike the conventional partitioning clustering algorithms, simplifies non-spherical and non-linear patterns in the data structure so that they can be clustered easily. In addition, in order to enhance the results, the Particle Swarm Optimization (PSO) algorithm is used to tune the kernel parameters and cluster centers and to optimize feature selection. The efficiency of this method was evaluated by using multi-temporal UAVSAR L-band images acquired over an agricultural area near Winnipeg, Manitoba, Canada, during June and July in 2012. The results demonstrate more accurate crop maps using the proposed method when compared to the classical approaches (e.g. 12% improvement in general). In addition, when the optimization technique is used, greater improvement is observed in crop classification, e.g. 5% overall. Furthermore, a strong relationship between the Freeman-Durden volume scattering component, which is related to canopy structure, and phenological growth stages is observed.

  10. Cluster analysis of BI-RADS descriptions of biopsy-proven breast lesions

    Science.gov (United States)

    Markey, Mia K.; Lo, Joseph Y.; Tourassi, Georgia D.; Floyd, Carey E., Jr.

    2002-05-01

    The purpose of this study was to identify and characterize clusters in a heterogeneous breast cancer computer-aided diagnosis database. Identification of subgroups within the database could help elucidate clinical trends and facilitate future model building. Agglomerative hierarchical clustering and k-means clustering were used to identify clusters in a large, heterogeneous computer-aided diagnosis database based on mammographic findings (BI-RADS) and patient age. The clusters were examined in terms of their feature distributions. The clusters showed logical separation of distinct clinical subtypes such as architectural distortions, masses, and calcifications. Moreover, the common subtypes of masses and calcifications were stratified into clusters based on age groupings. The percent of the cases that were malignant was notably different among the clusters. Cluster analysis can provide a powerful tool in discerning the subgroups present in a large, heterogeneous computer-aided diagnosis database.

  11. CHOOSING A HEALTH INSTITUTION WITH MULTIPLE CORRESPONDENCE ANALYSIS AND CLUSTER ANALYSIS IN A POPULATION BASED STUDY

    Directory of Open Access Journals (Sweden)

    ASLI SUNER

    2013-06-01

    Full Text Available Multiple correspondence analysis is a method that makes it easy to interpret the categorical variables given in contingency tables by showing the similarities, associations and divergences among these variables graphically in a lower-dimensional space. Clustering methods help to classify grouped data according to their similarities and to obtain useful summaries from them. In this study, interpretations of multiple correspondence analysis are supported by cluster analysis; factors affecting the choice of health institution, such as age, disease group and health insurance, are examined, and the aim is to compare the results of the two methods.

  12. MMPI profiles of males accused of severe crimes: a cluster analysis

    NARCIS (Netherlands)

    Spaans, M.; Barendregt, M.; Muller, E.; Beurs, E. de; Nijman, H.L.I.; Rinne, T.

    2009-01-01

    In studies attempting to classify criminal offenders by cluster analysis of Minnesota Multiphasic Personality Inventory-2 (MMPI-2) data, the number of clusters found varied between 10 (the Megargee System) and two (one cluster indicating no psychopathology and one exhibiting serious

  13. Leishmaniasis, conflict, and political terror: A spatio-temporal analysis.

    Science.gov (United States)

    Berry, Isha; Berrang-Ford, Lea

    2016-10-01

    Leishmaniasis has been estimated to cause the ninth largest burden amongst global infectious diseases. Occurrence of the disease has been anecdotally associated with periods of conflict, leading to its description as a disease of 'guerrilla warfare.' Despite this, few studies have quantitatively investigated the extent to which leishmaniasis coincides with conflict or political terror. This study employed a longitudinal approach to empirically test for an association between cutaneous and visceral leishmaniasis incidence and the occurrence of conflict and political terror at the national level, annually for 15 years (1995-2010). Leishmaniasis incidence data were collected for 54 countries and combined with UCDP/PRIO Armed Conflict and Amnesty International political terror datasets. Mixed effects negative binomial regression models clustered at the country level were constructed to evaluate the incidence rate ratios against the predictors, while controlling for wealth. Additionally, to understand how and why conflict-terror may be associated with leishmaniasis incidence, we conducted a historical analysis. We identified and discussed causal mechanisms posited in the literature, and critically assessed pathways by which leishmaniasis might occur in places and times of conflict-terror. There was a significant dose-response relationship for disease incidence based on increasing levels of conflict and terror. Country-years experiencing very high levels of conflict-terror were associated with a 2.38 times higher [95% CI: 1.40-4.05] and 6.02 times higher [95% CI: 2.39-15.15] incidence of cutaneous and visceral leishmaniasis, respectively. Historical analysis indicated that conflict and terror contribute to, or coincide with, leishmaniasis incidence through processes of population displacement and health system deterioration. This research highlights the potentially increased risks for cutaneous and visceral leishmaniasis incidence in areas of high conflict

  14. Cluster Method Analysis of K. S. C. Image

    Science.gov (United States)

    Rodriguez, Joe, Jr.; Desai, M.

    1997-01-01

    Information obtained from satellite-based systems has moved to the forefront as a method for identifying many land cover types. Identification of different land features through remote sensing is an effective tool for regional and global assessment of geometric characteristics. Classification data acquired from remote sensing images have a wide variety of applications. In particular, analysis of remote sensing images has special applications in the classification of various types of vegetation. Results obtained from classification studies of a particular area or region serve towards a greater understanding of what parameters (ecological, temporal, etc.) affect the region being analyzed. In this paper, we distinguish between the two types of classification approaches, although the focus is on the unsupervised classification method, using 1987 Thematic Mapper (TM) images of Kennedy Space Center.

  15. Independent component analysis to detect clustered microcalcification breast cancers.

    Science.gov (United States)

    Gallardo-Caballero, R; García-Orellana, C J; García-Manso, A; González-Velasco, H M; Macías-Macías, M

    2012-01-01

    The presence of clustered microcalcifications is one of the earliest signs in breast cancer detection. Although there exist many studies broaching this problem, most of them are nonreproducible due to the use of proprietary image datasets. We use a known subset of the currently largest publicly available mammography database, the Digital Database for Screening Mammography (DDSM), to develop a computer-aided detection system that outperforms the current reproducible studies on the same mammogram set. This proposal is mainly based on the use of extracted image features obtained by independent component analysis, but we also study the inclusion of the patient's age as a nonimage feature which requires no human expertise. Our system achieves an average of 2.55 false positives per image at a sensitivity of 81.8% and 4.45 at a sensitivity of 91.8% in diagnosing the BCRP_CALC_1 subset of DDSM.

  16. Higgs Pair Production: Choosing Benchmarks With Cluster Analysis

    CERN Document Server

    Carvalho, Alexandra; Dorigo, Tommaso; Goertz, Florian; Gottardo, Carlo A.; Tosi, Mia

    2016-01-01

    New physics theories often depend on a large number of free parameters. The precise values of those parameters in some cases drastically affect the resulting phenomenology of fundamental physics processes, while in others finite variations can leave it basically invariant at the level of detail experimentally accessible. When designing a strategy for the analysis of experimental data in the search for a signal predicted by a new physics model, it appears advantageous to categorize the parameter space describing the model according to the corresponding kinematical features of the final state. A multi-dimensional test statistic can be used to gauge the degree of similarity in the kinematics of different models; a clustering algorithm using that metric may then allow the division of the space into homogeneous regions, each of which can be successfully represented by a benchmark point. Searches targeting those benchmark points are then guaranteed to be sensitive to a large area of the parameter space. In this doc...

  17. Entropy-rate clustering: cluster analysis via maximizing a submodular function subject to a matroid constraint.

    Science.gov (United States)

    Liu, Ming-Yu; Tuzel, Oncel; Ramalingam, Srikumar; Chellappa, Rama

    2014-01-01

    We propose a new objective function for clustering. This objective function consists of two components: the entropy rate of a random walk on a graph and a balancing term. The entropy rate favors formation of compact and homogeneous clusters, while the balancing function encourages clusters with similar sizes and penalizes larger clusters that aggressively group samples. We present a novel graph construction for the graph associated with the data and show that this construction induces a matroid--a combinatorial structure that generalizes the concept of linear independence in vector spaces. The clustering result is given by the graph topology that maximizes the objective function under the matroid constraint. By exploiting the submodular and monotonic properties of the objective function, we develop an efficient greedy algorithm. Furthermore, we prove an approximation bound of (1/2) for the optimality of the greedy solution. We validate the proposed algorithm on various benchmarks and show its competitive performances with respect to popular clustering algorithms. We further apply it for the task of superpixel segmentation. Experiments on the Berkeley segmentation data set reveal its superior performances over the state-of-the-art superpixel segmentation algorithms in all the standard evaluation metrics.

  18. The relationship between supplier networks and industrial clusters: an analysis based on the cluster mapping method

    Directory of Open Access Journals (Sweden)

    Ichiro IWASAKI

    2010-06-01

    Full Text Available Michael Porter’s concept of competitive advantages emphasizes the importance of regional cooperation of various actors in order to gain competitiveness on globalized markets. Foreign investors may play an important role in forming such cooperation networks. Their local suppliers tend to concentrate regionally. They can form, together with local institutions of education, research, financial and other services, development agencies, the nucleus of cooperative clusters. This paper deals with the relationship between supplier networks and clusters. Two main issues are discussed in more detail: the interest of multinational companies in entering regional clusters and the spillover effects that may stem from their participation. After the discussion on the theoretical background, the paper introduces a relatively new analytical method: “cluster mapping” - a method that can spot regional hot spots of specific economic activities with cluster building potential. Experience with the method was gathered in the US and in the European Union. After the discussion on the existing empirical evidence, the authors introduce their own cluster mapping results, which they obtained by using a refined version of the original methodology.

  19. Analysis of Spatio-Temporal Traffic Patterns Based on Pedestrian Trajectories

    Science.gov (United States)

    Busch, S.; Schindler, T.; Klinger, T.; Brenner, C.

    2016-06-01

    For driver assistance and autonomous driving systems, it is essential to predict the behaviour of other traffic participants. Standard filter approaches are usually used to this end; however, in many cases these are not sufficient. For example, pedestrians are able to change their speed or direction instantly. Also, there may not be enough observation data to determine the state of an object reliably, e.g. in case of occlusions. In those cases, it is very useful if a prior model exists which suggests certain outcomes. For example, it is useful to know that pedestrians usually cross the road at a certain location and at certain times. This information can be stored in a map, which can then be used as a prior in scene analysis or, in practical terms, to reduce the speed of a vehicle in advance in order to minimize critical situations. In this paper, we present an approach to derive such a spatio-temporal map automatically from the observed behaviour of traffic participants in everyday traffic situations. In our experiments, we use one stationary camera to observe a complex junction where cars, public transportation and pedestrians interact. We concentrate on the pedestrians' trajectories to map traffic patterns. In the first step, we extract trajectory segments from the video data. These segments are then clustered in order to derive a spatial model of the scene in terms of a spatially embedded graph. In the second step, we analyse the temporal patterns of pedestrian movement on this graph. To evaluate our approach, we used a 4-hour video sequence. We show that we are able to derive traffic light sequences as well as the timetables of nearby public transportation.

  20. ANALYSIS OF SPATIO-TEMPORAL TRAFFIC PATTERNS BASED ON PEDESTRIAN TRAJECTORIES

    Directory of Open Access Journals (Sweden)

    S. Busch

    2016-06-01

    Full Text Available For driver assistance and autonomous driving systems, it is essential to predict the behaviour of other traffic participants. Standard filter approaches are usually used to this end; however, in many cases these are not sufficient. For example, pedestrians are able to change their speed or direction instantly. Also, there may not be enough observation data to determine the state of an object reliably, e.g. in case of occlusions. In those cases, it is very useful if a prior model exists which suggests certain outcomes. For example, it is useful to know that pedestrians usually cross the road at a certain location and at certain times. This information can be stored in a map, which can then be used as a prior in scene analysis or, in practical terms, to reduce the speed of a vehicle in advance in order to minimize critical situations. In this paper, we present an approach to derive such a spatio-temporal map automatically from the observed behaviour of traffic participants in everyday traffic situations. In our experiments, we use one stationary camera to observe a complex junction where cars, public transportation and pedestrians interact. We concentrate on the pedestrians' trajectories to map traffic patterns. In the first step, we extract trajectory segments from the video data. These segments are then clustered in order to derive a spatial model of the scene in terms of a spatially embedded graph. In the second step, we analyse the temporal patterns of pedestrian movement on this graph. To evaluate our approach, we used a 4-hour video sequence. We show that we are able to derive traffic light sequences as well as the timetables of nearby public transportation.

  1. Cluster Analysis in Nursing Research: An Introduction, Historical Perspective, and Future Directions.

    Science.gov (United States)

    Dunn, Heather; Quinn, Laurie; Corbridge, Susan J; Eldeirawi, Kamal; Kapella, Mary; Collins, Eileen G

    2017-05-01

    The use of cluster analysis in the nursing literature is limited to the creation of classifications of homogeneous groups and the discovery of new relationships. As such, it is important to provide clarity regarding its use and potential. The purpose of this article is to provide an introduction to distance-based, partitioning-based, and model-based cluster analysis methods commonly utilized in the nursing literature, provide a brief historical overview on the use of cluster analysis in nursing literature, and provide suggestions for future research. An electronic search included three bibliographic databases, PubMed, CINAHL and Web of Science. Key terms were cluster analysis and nursing. The use of cluster analysis in the nursing literature is increasing and expanding. The increased use of cluster analysis in the nursing literature is positioning this statistical method to result in insights that have the potential to change clinical practice.

  2. Evaluation of bitterness in white wine applying descriptive analysis, time-intensity analysis, and temporal dominance of sensations analysis.

    Science.gov (United States)

    Sokolowsky, Martina; Fischer, Ulrich

    2012-06-30

    Bitterness in wine, especially in white wine, is a complex and sensitive topic, as it is a persistent sensation with a negative connotation for consumers. However, the molecular basis for bitter taste in white wines is still largely unknown. At the same time, studies dealing with bitterness have to cope with the temporal dynamics of bitter perception. The most common method of describing bitter taste is a static measurement alongside other attributes during a descriptive analysis. A less frequently applied method, time-intensity analysis, evaluates the temporal gustatory changes, focusing on bitterness alone. The most recently developed multidimensional approach, the temporal dominance of sensations method, reveals the temporal dominance of bitter taste in relation to other attributes. In order to compare the results obtained with these different sensory methodologies, 13 commercial white wines were evaluated by the same panel. To facilitate a statistical comparison, parameters were extracted from bitterness curves obtained from time-intensity and temporal dominance of sensations analysis and were compared to bitter intensity as well as bitter persistency based on descriptive analysis. Analysis of variance significantly differentiated the wines on all measured bitterness parameters obtained from the three sensory techniques. When the information from all sensory parameters was compared by multiple factor analysis and correlation, each technique provided additional valuable information regarding the complex bitter perception in white wine.

  3. fMR-adaptation indicates selectivity to audiovisual content congruency in distributed clusters in human superior temporal cortex

    NARCIS (Netherlands)

    van Atteveldt, Nienke M; Blau, Vera C; Blomert, Leo; Goebel, Rainer

    2010-01-01

    BACKGROUND: Efficient multisensory integration is of vital importance for adequate interaction with the environment. In addition to basic binding cues like temporal and spatial coherence, meaningful multisensory information is also bound together by content-based associations. Many functional

  4. Statistical analysis of long term spatial and temporal trends of ...

    Indian Academy of Sciences (India)

    ). It is not uniform either spatially or temporally over the Himalayan region. A positive relationship between altitude and warming rate has been observed over the Greater. Keywords. Temperature; CGCM3; HadCM3; modified Mann–Kendall test ...

  5. and remote sensing for multi-temporal analysis of sand ...

    African Journals Online (AJOL)

    dalel

    nation of “sand” and photo-interpretation for identification of sandy soils is useful to assess wind erosion dynamics through the mapping of the spatial and temporal evolution of sandy soil. According to these results, Oglet Merteba is more affected by sand accumulation over the last decade. The more aggressive period is ...

  6. Cubic map algebra functions for spatio-temporal analysis

    Science.gov (United States)

    Mennis, J.; Viger, R.; Tomlin, C.D.

    2005-01-01

    We propose an extension of map algebra to three dimensions for spatio-temporal data handling. This approach yields a new class of map algebra functions that we call "cube functions." Whereas conventional map algebra functions operate on data layers representing two-dimensional space, cube functions operate on data cubes representing two-dimensional space over a third-dimensional period of time. We describe the prototype implementation of a spatio-temporal data structure and selected cube function versions of conventional local, focal, and zonal map algebra functions. The utility of cube functions is demonstrated through a case study analyzing the spatio-temporal variability of remotely sensed, southeastern U.S. vegetation character over various land covers and during different El Niño/Southern Oscillation (ENSO) phases. As with conventional map algebra, the application of cube functions may demand significant data preprocessing when integrating diverse data sets, and is subject to limitations related to data storage and algorithm performance. Solutions to these issues include extending data compression and computing strategies for calculations on very large data volumes to spatio-temporal data handling.
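
    The local, focal and zonal cube functions named in this abstract can be sketched on a small data cube indexed as cube[t][y][x]. This is a minimal illustration under assumed conventions (nested lists, a cubic neighbourhood clipped at the edges, mean as the summary statistic); the function names are hypothetical and are not taken from the prototype described in the record.

```python
def local_sum(cube_a, cube_b):
    """Local function: combine two cubes cell by cell."""
    return [[[a + b for a, b in zip(row_a, row_b)]
             for row_a, row_b in zip(layer_a, layer_b)]
            for layer_a, layer_b in zip(cube_a, cube_b)]

def focal_mean(cube, radius=1):
    """Focal function: mean over a (2r+1)^3 space-time neighbourhood,
    clipped at the cube boundaries."""
    nt, ny, nx = len(cube), len(cube[0]), len(cube[0][0])
    out = [[[0.0] * nx for _ in range(ny)] for _ in range(nt)]
    for t in range(nt):
        for y in range(ny):
            for x in range(nx):
                vals = [cube[tt][yy][xx]
                        for tt in range(max(0, t - radius), min(nt, t + radius + 1))
                        for yy in range(max(0, y - radius), min(ny, y + radius + 1))
                        for xx in range(max(0, x - radius), min(nx, x + radius + 1))]
                out[t][y][x] = sum(vals) / len(vals)
    return out

def zonal_mean(value_cube, zone_cube):
    """Zonal function: mean of the value cube within each zone id."""
    sums, counts = {}, {}
    for v_layer, z_layer in zip(value_cube, zone_cube):
        for v_row, z_row in zip(v_layer, z_layer):
            for v, z in zip(v_row, z_row):
                sums[z] = sums.get(z, 0.0) + v
                counts[z] = counts.get(z, 0) + 1
    return {z: sums[z] / counts[z] for z in sums}

# Toy 2 x 2 x 2 cube (two time steps of a 2 x 2 raster); values are illustrative.
values = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
zones = [[[0, 0], [1, 1]], [[0, 0], [1, 1]]]
doubled = local_sum(values, values)
smoothed = focal_mean(values, radius=1)
by_zone = zonal_mean(values, zones)
```

    Treating time as a third array axis is what lets the focal neighbourhood and the zones extend across time steps as well as across space.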

  7. A novel analysis of spring phenological patterns over Europe based on co-clustering

    NARCIS (Netherlands)

    Wu, X.; Zurita-Milla, R.; Kraak, M.J.

    2016-01-01

    The study of phenological patterns and their dynamics provides insights into the impacts of climate change on terrestrial ecosystems. Here we present a novel analytical workflow, based on co-clustering, that enables the concurrent study of spatio-temporal patterns in spring phenology. The workflow

  8. Performance analysis of clustering techniques over microarray data: A case study

    Science.gov (United States)

    Dash, Rasmita; Misra, Bijan Bihari

    2018-03-01

    Handling big data is one of the major issues in the field of statistical data analysis. In such investigations, cluster analysis plays a vital role in dealing with large-scale data. There are many clustering techniques, each with a different approach to cluster analysis, but it is difficult to predict which approach suits a particular dataset. To deal with this problem, a grading approach is introduced over many clustering techniques to identify a stable technique. However, the grading approach depends on the characteristics of the dataset as well as on the validity indices, so a two-stage grading approach is implemented. In this study, the grading approach is applied to five clustering techniques: hybrid swarm based clustering (HSC), k-means, partitioning around medoids (PAM), vector quantization (VQ) and agglomerative nesting (AGNES). The experiments are conducted on five microarray datasets with seven validity indices. The finding of the grading approach that a clustering technique is significant is also confirmed by the Nemenyi post-hoc hypothesis test.

  9. Depth data research of GIS based on clustering analysis algorithm

    Science.gov (United States)

    Xiong, Yan; Xu, Wenli

    2018-03-01

    GIS data are spatially distributed. Geographic data have both spatial and attribute characteristics and also change with time, so the amount of data is very large. Nowadays, many industries and departments use GIS. However, without a proper data analysis and mining scheme, GIS will not reach its maximum effectiveness and much of the data will be wasted. In this paper, we take the geographic information demand of a national security department as the experimental object and, combining the characteristics of GIS data and taking into account time, space, attributes and so on, apply a cluster analysis algorithm. We further study the mining scheme for depth data and obtain the algorithm model. This algorithm can automatically classify sample data and then carry out exploratory analysis. The research shows that the algorithm model and the information mining scheme can quickly find hidden depth information in the surface data of GIS, thus improving the efficiency of the security department. This algorithm can also be extended to other fields.

  10. A Resting-State Brain Functional Network Study in MDD Based on Minimum Spanning Tree Analysis and the Hierarchical Clustering

    Directory of Open Access Journals (Sweden)

    Xiaowei Li

    2017-01-01

    Full Text Available A large number of studies have demonstrated that major depressive disorder (MDD) is characterized by alterations in brain functional connections, which are also identifiable during the brain’s “resting-state.” However, in existing studies, the approach of constructing functional connectivity is often biased by the choice of threshold. Besides, more attention has been paid to the number and length of links in brain networks, while the clustering partitioning of nodes has remained unclear. Therefore, minimum spanning tree (MST) analysis and hierarchical clustering were used for the first time for depression in this study. Resting-state electroencephalogram (EEG) sources were assessed from 15 healthy and 23 major depressive subjects. Then the coherence, MST, and hierarchical clustering were obtained. In the theta band, coherence analysis showed that the EEG coherence of the MDD patients was significantly higher than that of the healthy controls, especially in the left temporal region. The MST results indicated a higher leaf fraction in the depressed group. Compared with the normal group, the major depressive patients lost clustering in frontal regions. Our findings suggested that there was a stronger brain interaction in the MDD group and a left-right functional imbalance in the frontal regions for MDD controls.
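
    The MST step and the leaf fraction mentioned in this abstract can be illustrated as follows. The sketch assumes that strong coherence should map to a short edge (distance = 1 - coherence) and that leaf fraction is the number of degree-1 nodes divided by the number of tree edges (N - 1); both are common conventions but are assumptions here, as are the function names and the toy coherence values.

```python
def minimum_spanning_tree(weights):
    """Prim's algorithm on a dense symmetric weight matrix.

    Returns the tree as a list of (parent, child) edges. No threshold is
    needed: the tree always has exactly N - 1 edges.
    """
    n = len(weights)
    in_tree = [False] * n
    in_tree[0] = True
    # best[k] = (cheapest known cost to attach node k, node it attaches to)
    best = [(weights[0][k], 0) for k in range(n)]
    edges = []
    for _ in range(n - 1):
        j = min((k for k in range(n) if not in_tree[k]), key=lambda k: best[k][0])
        edges.append((best[j][1], j))
        in_tree[j] = True
        for k in range(n):
            if not in_tree[k] and weights[j][k] < best[k][0]:
                best[k] = (weights[j][k], j)
    return edges

def leaf_fraction(edges, n_nodes):
    """Share of leaves (degree-1 nodes), normalised by the N - 1 tree edges."""
    degree = [0] * n_nodes
    for i, j in edges:
        degree[i] += 1
        degree[j] += 1
    leaves = sum(1 for d in degree if d == 1)
    return leaves / (n_nodes - 1)

# Toy 4-channel coherence matrix (illustrative): channel 0 is a hub.
coherence = [
    [1.0, 0.9, 0.9, 0.9],
    [0.9, 1.0, 0.2, 0.2],
    [0.9, 0.2, 1.0, 0.2],
    [0.9, 0.2, 0.2, 1.0],
]
distance = [[1.0 - c for c in row] for row in coherence]
tree = minimum_spanning_tree(distance)
lf = leaf_fraction(tree, n_nodes=4)
```

    A star-shaped tree like this one gives the maximum leaf fraction under this convention, while a chain gives the minimum, which is why the measure is read as an indicator of how centralised the network is.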

  11. A model of photon cell killing based on the spatio-temporal clustering of DNA damage in higher order chromatin structures.

    Directory of Open Access Journals (Sweden)

    Lisa Herr

    Full Text Available We present a new approach to model dose rate effects on cell killing after photon radiation based on the spatio-temporal clustering of DNA double strand breaks (DSBs) within higher order chromatin structures of approximately 1-2 Mbp size, so-called giant loops. The main concept of this approach consists of a distinction of two classes of lesions, isolated and clustered DSBs, characterized by the number of double strand breaks induced in a giant loop. We assume a low lethality and fast component of repair for isolated DSBs and a high lethality and slow component of repair for clustered DSBs. With appropriate rates, the temporal transition between the different lesion classes is expressed in terms of five differential equations. These allow formulating the dynamics involved in the competition of damage induction and repair for arbitrary dose rates and fractionation schemes. Final cell survival probabilities are computable with a cell line specific set of three parameters: the lethality for isolated DSBs, the lethality for clustered DSBs and the half-life of isolated DSBs. By comparison with larger sets of published experimental data, it is demonstrated that the model describes well the cell line dependent response to treatments using either continuous irradiation at a constant dose rate or split dose irradiation. Furthermore, an analytic investigation of the formulation concerning single fraction treatments with constant dose rates in the limiting cases of extremely high or low dose rates is presented. The approach is consistent with the Linear-Quadratic model extended by the Lea-Catcheside factor up to the second moment in dose. Finally, it is shown that the model correctly predicts empirical findings about the dose rate dependence of incidence probabilities for deterministic radiation effects like pneumonitis and the bone marrow syndrome. These findings further support the general concepts on which the approach is based.
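
    For orientation, the Linear-Quadratic model with the Lea-Catcheside factor that this abstract uses as a point of comparison can be sketched numerically. This is the standard textbook form for irradiation at a constant dose rate with mono-exponential repair, not the five-equation lesion model of the record; the parameter values below are illustrative assumptions only.

```python
import math

def lea_catcheside_factor(repair_rate, duration):
    """Dose-protraction factor G for a constant dose rate delivered over
    `duration`, with mono-exponential repair at `repair_rate`:
        G = 2 * (x - 1 + exp(-x)) / x**2,  x = repair_rate * duration.
    G tends to 1 for acute exposure and to 0 for very long exposure.
    """
    x = repair_rate * duration
    if x < 1e-6:
        return 1.0 - x / 3.0  # leading terms of the series near x = 0
    return 2.0 * (x - 1.0 + math.exp(-x)) / (x * x)

def lq_survival(dose, alpha, beta, repair_rate, duration):
    """Surviving fraction S = exp(-alpha * D - G * beta * D**2)."""
    g = lea_catcheside_factor(repair_rate, duration)
    return math.exp(-alpha * dose - g * beta * dose ** 2)

# Illustrative values (assumed): D in Gy, alpha in 1/Gy, beta in 1/Gy^2,
# repair rate in 1/h, irradiation time in h.
acute = lq_survival(dose=2.0, alpha=0.2, beta=0.05, repair_rate=1.0, duration=0.0)
protracted = lq_survival(dose=2.0, alpha=0.2, beta=0.05, repair_rate=1.0, duration=10.0)
```

    Only the quadratic term is scaled by G, so stretching the same dose over a longer time raises survival while leaving the linear component untouched.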

  12. Characterizing Heterogeneity within Head and Neck Lesions Using Cluster Analysis of Multi-Parametric MRI Data.

    Directory of Open Access Journals (Sweden)

    Marco Borri

    Full Text Available To describe a methodology, based on cluster analysis, to partition multi-parametric functional imaging data into groups (or clusters) of similar functional characteristics, with the aim of characterizing functional heterogeneity within head and neck tumour volumes. To evaluate the performance of the proposed approach on a set of longitudinal MRI data, analysing the evolution of the obtained sub-sets with treatment. The cluster analysis workflow was applied to a combination of dynamic contrast-enhanced and diffusion-weighted imaging MRI data from a cohort of squamous cell carcinoma of the head and neck patients. Cumulative distributions of voxels, containing pre and post-treatment data and including both primary tumours and lymph nodes, were partitioned into k clusters (k = 2, 3 or 4). Principal component analysis and cluster validation were employed to investigate data composition and to independently determine the optimal number of clusters. The evolution of the resulting sub-regions with induction chemotherapy treatment was assessed relative to the number of clusters. The clustering algorithm was able to separate clusters which significantly reduced in voxel number following induction chemotherapy from clusters with a non-significant reduction. Partitioning with the optimal number of clusters (k = 4, determined with cluster validation) produced the best separation between reducing and non-reducing clusters. The proposed methodology was able to identify tumour sub-regions with distinct functional properties, independently separating clusters which were affected differently by treatment. This work demonstrates that unsupervised cluster analysis, with no prior knowledge of the data, can be employed to provide a multi-parametric characterization of functional heterogeneity within tumour volumes.

  13. Analysis of the dynamical cluster approximation for the Hubbard model

    OpenAIRE

    Aryanpour, K.; Hettler, M. H.; Jarrell, M.

    2002-01-01

    We examine a central approximation of the recently introduced Dynamical Cluster Approximation (DCA) by example of the Hubbard model. By both analytical and numerical means we study non-compact and compact contributions to the thermodynamic potential. We show that approximating non-compact diagrams by their cluster analogs results in a larger systematic error as compared to the compact diagrams. Consequently, only the compact contributions should be taken from the cluster, whereas non-compact ...

  14. Spatio-temporal analysis of malaria vectors in national malaria surveillance sites in China.

    Science.gov (United States)

    Huang, Ji-Xia; Xia, Zhi-Gui; Zhou, Shui-Sen; Pu, Xiao-Jun; Hu, Mao-Gui; Huang, Da-Cang; Ren, Zhou-Peng; Zhang, Shao-Sen; Yang, Man-Ni; Wang, Duo-Quan; Wang, Jin-Feng

    2015-03-07

    To reveal the spatio-temporal distribution of malaria vectors in the national malaria surveillance sites from 2005 to 2010 and provide a reference for the current National Malaria Elimination Programme (NMEP) in China. A 6-year longitudinal surveillance of the density of malaria vectors was carried out in the 62 national malaria surveillance sites. The spatial and temporal analyses of the distribution of the four primary vectors were conducted using kernel k-means, and the cluster distribution of the most widely distributed vector, An. sinensis, was identified using empirical mode decomposition (EMD). In total, 4 species of Anopheles mosquitoes, An. sinensis, An. lesteri, An. dirus and An. minimus, were captured, with significant differences in distribution as well as density. An. sinensis was the most widely distributed, accounting for 96.25% of all collections, and its distribution was divided into three different clusters, with a significant increase in density observed in the second cluster, which was located mostly in the central parts of China. This study is the first to describe the spatio-temporal distribution of malaria vectors based on nationwide surveillance during 2005-2010, and it serves as a baseline for the ongoing national malaria elimination program.

  15. Spectral clustering for water body spectral types analysis

    Science.gov (United States)

    Huang, Leping; Li, Shijin; Wang, Lingli; Chen, Deqing

    2017-11-01

    To study the spectral types of water bodies across the whole country, the key issue in reservoir research is to obtain and analyze information on reservoir water bodies quantitatively and accurately. A new type of weight matrix is constructed by comprehensively utilizing the spectral and spatial features of the spectra from GF-1 remote sensing images. An improved spectral clustering algorithm based on this weight matrix is then proposed to cluster representative reservoirs in China. According to an internal clustering validity index, the Davies-Bouldin (DB) index, the best number of clusters is 7. Compared with two other clustering algorithms, the spectral clustering algorithm based only on spectral features and the K-means algorithm based on spectral and spatial features, simulation results demonstrate that the proposed spectral clustering algorithm based on spectral and spatial features has a higher clustering accuracy and better reflects the spatial clustering characteristics of representative reservoirs in various provinces in China: similar spectral properties and adjacent geographical locations.
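
    The Davies-Bouldin index named in this abstract can be sketched in a few lines. The code below is only an illustration under stated assumptions: it does not reproduce the authors' weight matrix or spectral clustering, the function names are hypothetical, and the points are invented toy data. It scores a partition as the average, over clusters, of the worst ratio of combined within-cluster scatter to centroid separation; lower values indicate more compact, better separated clusters.

```python
import math


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def davies_bouldin(clusters):
    """Davies-Bouldin index of a partition given as a list of point lists.

    Assumes at least two clusters with distinct centroids.
    """
    centers = [centroid(c) for c in clusters]
    # Scatter: mean distance from each member to its cluster centroid.
    scatter = [
        sum(math.dist(p, ctr) for p in c) / len(c)
        for c, ctr in zip(clusters, centers)
    ]
    k = len(clusters)
    total = 0.0
    for i in range(k):
        # Worst-case similarity of cluster i to any other cluster.
        total += max(
            (scatter[i] + scatter[j]) / math.dist(centers[i], centers[j])
            for j in range(k)
            if j != i
        )
    return total / k


# Invented toy data: the same four points partitioned two ways.
compact_split = [[(0, 0), (0, 2)], [(10, 0), (10, 2)]]
stretched_split = [[(0, 0), (10, 0)], [(0, 2), (10, 2)]]

if __name__ == "__main__":
    print(davies_bouldin(compact_split), davies_bouldin(stretched_split))
```

    To choose a cluster number in this way, one would run the clustering for each candidate number and keep the partition with the smallest index.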

  16. X-Ray Morphological Analysis of the Planck ESZ Clusters

    Energy Technology Data Exchange (ETDEWEB)

    Lovisari, Lorenzo; Forman, William R.; Jones, Christine; Andrade-Santos, Felipe; Randall, Scott; Kraft, Ralph [Harvard-Smithsonian Center for Astrophysics, 60 Garden Street, Cambridge, MA 02138 (United States); Ettori, Stefano [INAF, Osservatorio Astronomico di Bologna, via Ranzani 1, I-40127 Bologna (Italy); Arnaud, Monique; Démoclès, Jessica; Pratt, Gabriel W. [Laboratoire AIM, IRFU/Service d’Astrophysique—CEA/DRF—CNRS—Université Paris Diderot, Bât. 709, CEA-Saclay, F-91191 Gif-sur-Yvette Cedex (France)

    2017-09-01

    X-ray observations show that galaxy clusters have a very large range of morphologies. The most disturbed systems, which are well suited to studying how clusters form and grow and to testing physical models, may potentially complicate cosmological studies because the cluster mass determination becomes more challenging. Thus, we need to understand the cluster properties of our samples to reduce possible biases. This is complicated by the fact that different experiments may detect different cluster populations. For example, Sunyaev–Zeldovich (SZ) selected cluster samples have been found to include a greater fraction of disturbed systems than X-ray selected samples. In this paper we determine eight morphological parameters for the Planck Early Sunyaev–Zeldovich (ESZ) objects observed with XMM-Newton. We found that two parameters, concentration and centroid shift, are the best to distinguish between relaxed and disturbed systems. For each parameter we provide the values that allow selecting the most relaxed or most disturbed objects from a sample. We found that there is no mass dependence on the cluster dynamical state. By comparing our results with what was obtained with REXCESS clusters, we also confirm that the ESZ clusters indeed tend to be more disturbed, as found by previous studies.

  17. Fuzzy and hard clustering analysis for thyroid disease.

    Science.gov (United States)

    Azar, Ahmad Taher; El-Said, Shaimaa Ahmed; Hassanien, Aboul Ella

    2013-07-01

    Thyroid hormones produced by the thyroid gland help regulate the body's metabolism. A variety of methods have been proposed in the literature for thyroid disease classification. As far as we know, clustering techniques have not so far been applied to thyroid disease data sets. This paper proposes a comparison between hard and fuzzy clustering algorithms for a thyroid disease data set in order to find the optimal number of clusters. Different scalar validity measures are used in comparing the performances of the proposed clustering systems. To demonstrate the performance of each algorithm, the feature values that represent thyroid disease are used as input for the system. Several runs are carried out and recorded with a different number of clusters being specified for each run (between 2 and 11), so as to establish the optimum number of clusters. To find the optimal number of clusters, the so-called elbow criterion is applied. The experimental results revealed that for all algorithms, the elbow was located at c=3. The clustering results for all algorithms are then visualized by the Sammon mapping method to find a low-dimensional (normally 2D or 3D) representation of a set of points distributed in a high dimensional pattern space. At the end of this study, some recommendations are formulated to improve the determination of the actual number of clusters present in the data set. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
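
    The elbow criterion mentioned in this abstract is often applied by eye; one way to automate it is sketched below. This is a sketch under assumptions, not the authors' procedure: the abstract does not say how the elbow was located, the function name is hypothetical, and the cost values are invented for illustration. The heuristic picks the cluster count where the validity-measure curve bends most sharply, measured by the largest discrete second difference.

```python
def elbow_by_second_difference(cluster_counts, costs):
    """Return the cluster count at the sharpest bend of a decreasing cost curve.

    Assumes evenly spaced cluster counts and at least three (count, cost) pairs.
    The bend at position i is costs[i-1] - 2*costs[i] + costs[i+1].
    """
    if len(cluster_counts) != len(costs) or len(costs) < 3:
        raise ValueError("need at least three matching (count, cost) pairs")
    best_count, best_bend = None, float("-inf")
    for i in range(1, len(costs) - 1):
        bend = costs[i - 1] - 2 * costs[i] + costs[i + 1]
        if bend > best_bend:
            best_count, best_bend = cluster_counts[i], bend
    return best_count


# Invented within-cluster cost values for c = 2..6 (not data from the study).
example_counts = [2, 3, 4, 5, 6]
example_costs = [100.0, 40.0, 35.0, 31.0, 28.0]

if __name__ == "__main__":
    print(elbow_by_second_difference(example_counts, example_costs))
```

    Because the first and last candidates have no neighbour on one side, this heuristic can never return them; a wider range of cluster counts avoids missing an elbow at the boundary.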

  18. Frailty phenotypes in the elderly based on cluster analysis

    DEFF Research Database (Denmark)

    Dato, Serena; Montesanto, Alberto; Lagani, Vincenzo

    2012-01-01

    Frailty is a physiological state characterized by the deregulation of multiple physiologic systems of an aging organism determining the loss of homeostatic capacity, which exposes the elderly to disability, diseases, and finally death. An operative definition of frailty, useful for the classifica...... genetic background on the frailty status is still questioned. We investigated the applicability of a cluster analysis approach based on specific geriatric parameters, previously set up and validated in a southern Italian population, to two large longitudinal Danish samples. In both cohorts, we identified...... groups of subjects homogeneous for their frailty status and characterized by different survival patterns. A subsequent survival analysis availing of Accelerated Failure Time models allowed us to formulate an operative index able to correlate classification variables with survival probability. From...

  19. Comprehensive analysis of tornado statistics in comparison to earthquakes: intensity and temporal behaviour

    Directory of Open Access Journals (Sweden)

    L. Schielicke

    2013-01-01

    Tornadoes and earthquakes are characterised by a high variability in their properties concerning intensity, geometric properties and temporal behaviour. Earthquakes are known for power-law behaviour in their intensity (Gutenberg–Richter law) and temporal statistics (e.g. Omori law and interevent waiting times). The observed similarity in the high variability of these two phenomena motivated us to compare the statistical behaviour of tornadoes using seismological methods and to search for power-law behaviour. In general, the statistics of tornadoes show power-law behaviour partly coextensive with characteristic scales when the temporal resolution is high (10 to 60 min). These characteristic scales match the typical diurnal behaviour of tornadoes, which is characterised by a maximum of tornado occurrences in the late afternoon hours. Furthermore, the distributions support the observation that tornadoes cluster in time. Finally, we briefly discuss a possible similar underlying structure composed of heterogeneous, coupled, interactive threshold oscillators that possibly explains the observed behaviour.
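
    The search for power-law behaviour described in this abstract can be illustrated with a small estimator. The sketch below is not the authors' method: it assumes a continuous power law p(x) ∝ x^(-alpha) for x >= x_min and uses the standard maximum-likelihood estimate alpha = 1 + n / sum(ln(x_i / x_min)). The function names, the threshold and the synthetic waiting times are all hypothetical.

```python
import math
import random


def powerlaw_alpha_mle(values, x_min):
    """Maximum-likelihood exponent of a continuous power law above x_min."""
    tail = [x for x in values if x >= x_min]
    if not tail:
        raise ValueError("no observations at or above x_min")
    log_sum = sum(math.log(x / x_min) for x in tail)
    if log_sum == 0:
        raise ValueError("all observations equal x_min; exponent is undefined")
    return 1.0 + len(tail) / log_sum


def sample_powerlaw(n, alpha, x_min, seed=0):
    """Draw n synthetic values by inverse-transform sampling (alpha > 1)."""
    rng = random.Random(seed)
    # 1 - rng.random() lies in (0, 1], so every sample is >= x_min.
    return [
        x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0))
        for _ in range(n)
    ]


if __name__ == "__main__":
    # Synthetic interevent waiting times in minutes (invented, not observed).
    waits = sample_powerlaw(20000, alpha=2.5, x_min=10.0, seed=1)
    print(powerlaw_alpha_mle(waits, x_min=10.0))
```

    The estimate depends strongly on the choice of x_min; in practice the threshold would itself be selected from the data rather than fixed in advance as it is here.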

  20. Geovisualization Approaches for Spatio-temporal Crime Scene Analysis - Towards 4D Crime Mapping

    Science.gov (United States)

    Wolff, Markus; Asche, Hartmut

    This paper presents a set of methods and techniques for analysis and multidimensional visualisation of crime scenes in a German city. As a first step the approach involves spatio-temporal analysis of crime scenes. Against this background a GIS-based application is developed that facilitates discovering initial trends in spatio-temporal crime scene distributions even for users without GIS training. Based on these results further spatio-temporal analysis is conducted to detect variations of certain hotspots in space and time. In a next step these findings of crime scene analysis are integrated into a geovirtual environment. In this context the concept of the space-time cube is adopted to allow for visual analysis of repeat burglary victimisation. Since these procedures require incorporating temporal elements into virtual 3D environments, basic methods for 4D crime scene visualisation are outlined in this paper.

  1. Adapting Spectral Co-clustering to Documents and Terms Using Latent Semantic Analysis

    Science.gov (United States)

    Park, Laurence A. F.; Leckie, Christopher A.; Ramamohanarao, Kotagiri; Bezdek, James C.

    Spectral co-clustering is a generic method of computing co-clusters of relational data, such as sets of documents and their terms. Latent semantic analysis is a method of document and term smoothing that can assist in the information retrieval process. In this article we examine the process behind spectral clustering for documents and terms, and compare it to Latent Semantic Analysis. We show that both spectral co-clustering and LSA follow the same process, using different normalisation schemes and metrics. By combining the properties of the two co-clustering methods, we obtain an improved co-clustering method for document-term relational data that provides an increase in the cluster quality of 33.0%.

  2. Cluster Analysis: Unsupervised Learning via Supervised Learning with a Non-convex Penalty.

    Science.gov (United States)

    Pan, Wei; Shen, Xiaotong; Liu, Binghui

    2013-07-01

    Clustering analysis is widely used in many fields. Traditionally clustering is regarded as unsupervised learning for its lack of a class label or a quantitative response variable, which in contrast is present in supervised learning such as classification and regression. Here we formulate clustering as penalized regression with grouping pursuit. In addition to the novel use of a non-convex group penalty and its associated unique operating characteristics in the proposed clustering method, a main advantage of this formulation is its allowing borrowing some well established results in classification and regression, such as model selection criteria to select the number of clusters, a difficult problem in clustering analysis. In particular, we propose using the generalized cross-validation (GCV) based on generalized degrees of freedom (GDF) to select the number of clusters. We use a few simple numerical examples to compare our proposed method with some existing approaches, demonstrating our method's promising performance.

  3. Identification and validation of asthma phenotypes in Chinese population using cluster analysis.

    Science.gov (United States)

    Wang, Lei; Liang, Rui; Zhou, Ting; Zheng, Jing; Liang, Bing Miao; Zhang, Hong Ping; Luo, Feng Ming; Gibson, Peter G; Wang, Gang

    2017-10-01

    Asthma is a heterogeneous airway disease, so it is crucial to clearly identify clinical phenotypes to achieve better asthma management. To identify and prospectively validate asthma clusters in a Chinese population. Two hundred eighty-four patients were consecutively recruited and 18 sociodemographic and clinical variables were collected. Hierarchical cluster analysis was performed by the Ward method followed by k-means cluster analysis. Then, a prospective 12-month cohort study was used to validate the identified clusters. Five clusters were successfully identified. Clusters 1 (n = 71) and 3 (n = 81) were mild asthma phenotypes with slight airway obstruction and low exacerbation risk, but with a sex differential. Cluster 2 (n = 65) described an "allergic" phenotype, cluster 4 (n = 33) featured a "fixed airflow limitation" phenotype with smoking, and cluster 5 (n = 34) was a "low socioeconomic status" phenotype. Patients in clusters 2, 4, and 5 had distinctly lower socioeconomic status and more psychological symptoms. Cluster 2 had a significantly increased risk of exacerbations (risk ratio [RR] 1.13, 95% confidence interval [CI] 1.03-1.25), unplanned visits for asthma (RR 1.98, 95% CI 1.07-3.66), and emergency visits for asthma (RR 7.17, 95% CI 1.26-40.80). Cluster 4 had an increased risk of unplanned visits (RR 2.22, 95% CI 1.02-4.81), and cluster 5 had increased emergency visits (RR 12.72, 95% CI 1.95-69.78). Kaplan-Meier analysis confirmed that cluster grouping was predictive of time to the first asthma exacerbation, unplanned visit, emergency visit, and hospital admission (P clusters as "allergic asthma," "fixed airflow limitation," and "low socioeconomic status" phenotypes that are at high risk of severe asthma exacerbations and that have management implications for clinical practice in developing countries. Copyright © 2017 American College of Allergy, Asthma & Immunology. Published by Elsevier Inc. All rights reserved.

  4. An effective fuzzy kernel clustering analysis approach for gene expression data.

    Science.gov (United States)

    Sun, Lin; Xu, Jiucheng; Yin, Jiaojiao

    2015-01-01

    Fuzzy clustering is an important tool for analyzing microarray data. A major problem in applying fuzzy clustering methods to microarray gene expression data is the choice of parameters, namely the cluster number and centers. This paper proposes a new approach to fuzzy kernel clustering analysis (FKCA) that identifies the desired cluster number and obtains more stable results for gene expression data. First of all, to optimize characteristic differences and estimate the optimal cluster number, a Gaussian kernel function is introduced to improve the spectrum analysis method (SAM). By combining subtractive clustering with the max-min distance mean, a maximum distance method (MDM) is proposed to determine cluster centers. Then, the corresponding steps of the improved SAM (ISAM) and MDM are given respectively, and their superiority and stability are illustrated through experimental comparisons on gene expression data. Finally, by introducing ISAM and MDM into FKCA, an effective improved FKCA algorithm is proposed. Experimental results from public gene expression data and the UCI database show that the proposed algorithms are feasible for cluster analysis, and that the clustering accuracy is higher than that of the other related clustering algorithms.

  5. ANALYSIS OF DEVELOPING BATIK INDUSTRY CLUSTER IN BAKARAN VILLAGE CENTRAL JAVA PROVINCE

    Directory of Open Access Journals (Sweden)

    Hermanto Hermanto

    2017-06-01

    SMEs grow in a cluster in a certain geographical area. The entrepreneurs grow and thrive through the business cluster. Central Java Province has many business clusters that improve the regional economy, one of which is the batik industry cluster. Pati Regency is one of the regencies/cities in Central Java with the lowest turnover. The batik industry cluster in Pati is developing quite well, as can be seen from the increasing number of batik businesses incorporated in the cluster. This research examines the strategy of developing the batik industry cluster in Pati Regency. The purpose of this research is to determine the proper strategy for developing the batik industry cluster in Pati. The method of research is quantitative. The analysis tool of this research is the Strengths, Weaknesses, Opportunities, Threats (SWOT) analysis. The result of the SWOT analysis in this research shows that the proper strategy for developing the batik industry cluster in Pati is optimizing the management of the batik business cluster in Bakaran Village; having the local government provide information on business capital loan facilities; utilizing labor from Bakaran Village while improving its quality through training; and marketing Bakaran batik to broader markets while maintaining the quality of the batik. Advice that can be given from this research is that the parties who have a role in batik industry cluster development in Bakaran Village, Pati Regency, such as the Local Government.

  6. MULTI-TEMPORAL ANALYSIS OF LANDSCAPES AND URBAN AREAS

    Directory of Open Access Journals (Sweden)

    E. Nocerino

    2012-07-01

    This article presents a 4D modelling approach that employs multi-temporal and historical aerial images to derive spatio-temporal information for scenes and landscapes. Such imagery represents a unique data source, which, combined with photo interpretation and reality-based 3D reconstruction techniques, can offer a more complete modelling procedure because it adds the fourth dimension of time to 3D geometrical representation and thus allows urban planners, historians, and others to identify, describe, and analyse changes in individual scenes and buildings as well as across landscapes. Particularly important to this approach are historical aerial photos, which provide data about the past that can be collected, processed, and then integrated as a database. The proposed methodology employs both historical (1945) and more recent (1973 and 2000s) aerial images from the Trentino region in North-eastern Italy in order to create a multi-temporal database of information to assist researchers in many disciplines such as topographic mapping, geology, geography, architecture, and archaeology as they work to reconstruct building phases and to understand landscape transformations (Fig. 1).

  7. Analysis of genetic association using hierarchical clustering and cluster validation indices.

    Science.gov (United States)

    Pagnuco, Inti A; Pastore, Juan I; Abras, Guillermo; Brun, Marcel; Ballarin, Virginia L

    2017-10-01

    It is usually assumed that co-expressed genes suggest co-regulation in the underlying regulatory network. Determining sets of co-expressed genes is an important task, based on some criteria of similarity. This task is usually performed by clustering algorithms, where the genes are clustered into meaningful groups based on their expression values in a set of experiment. In this work, we propose a method to find sets of co-expressed genes, based on cluster validation indices as a measure of similarity for individual gene groups, and a combination of variants of hierarchical clustering to generate the candidate groups. We evaluated its ability to retrieve significant sets on simulated correlated and real genomics data, where the performance is measured based on its detection ability of co-regulated sets against a full search. Additionally, we analyzed the quality of the best ranked groups using an online bioinformatics tool that provides network information for the selected genes. Copyright © 2017 Elsevier Inc. All rights reserved.

  8. WebGimm: An integrated web-based platform for cluster analysis, functional analysis, and interactive visualization of results

    Directory of Open Access Journals (Sweden)

    Medvedovic Mario

    2011-01-01

    Cluster analysis methods have been extensively researched, but the adoption of new methods is often hindered by technical barriers in their implementation and use. WebGimm is a free cluster analysis web-service, and an open source general purpose clustering web-server infrastructure designed to facilitate easy deployment of integrated cluster analysis servers based on clustering and functional annotation algorithms implemented in R. Integrated functional analyses and interactive browsing of both, clustering structure and functional annotations provides a complete analytical environment for cluster analysis and interpretation of results. The Java Web Start client-based interface is modeled after the familiar cluster/treeview packages making its use intuitive to a wide array of biomedical researchers. For biomedical researchers, WebGimm provides an avenue to access state of the art clustering procedures. For Bioinformatics methods developers, WebGimm offers a convenient avenue to deploy their newly developed clustering methods. WebGimm server, software and manuals can be freely accessed at http://ClusterAnalysis.org/.

  9. WebGimm: An integrated web-based platform for cluster analysis, functional analysis, and interactive visualization of results.

    Science.gov (United States)

    Joshi, Vineet K; Freudenberg, Johannes M; Hu, Zhen; Medvedovic, Mario

    2011-01-17

    Cluster analysis methods have been extensively researched, but the adoption of new methods is often hindered by technical barriers in their implementation and use. WebGimm is a free cluster analysis web-service, and an open source general purpose clustering web-server infrastructure designed to facilitate easy deployment of integrated cluster analysis servers based on clustering and functional annotation algorithms implemented in R. Integrated functional analyses and interactive browsing of both, clustering structure and functional annotations provides a complete analytical environment for cluster analysis and interpretation of results. The Java Web Start client-based interface is modeled after the familiar cluster/treeview packages making its use intuitive to a wide array of biomedical researchers. For biomedical researchers, WebGimm provides an avenue to access state of the art clustering procedures. For Bioinformatics methods developers, WebGimm offers a convenient avenue to deploy their newly developed clustering methods. WebGimm server, software and manuals can be freely accessed at http://ClusterAnalysis.org/.

  10. Classification and volumetric analysis of temporal bone pneumatization using cone beam computed tomography.

    Science.gov (United States)

    Jadhav, Aniket B; Fellows, Douglas; Hand, Arthur R; Tadinada, Aditya; Lurie, Alan G

    2014-03-01

    This study performed volumetric analysis and classified different repeated patterns of temporal bone pneumatization in adults using cone beam computed tomography (CBCT) scans. A total of 155 temporal bones were retrospectively evaluated from 78 patients with no radiographic evidence of pathology. Two reference structures were used to classify temporal bone pneumatization into 3 groups. Volumetric analysis of the pneumatization was performed using a window thresholding procedure on multiplanar CBCT images. Correlation between direct communication of peritubal cells with the eustachian tube and the degree of pneumatization was also assessed. Using 2 reference structures, pneumatization pattern in the temporal bone can be classified into 3 groups. Statistically significant differences were present in their mean volumes between 3 groups. Statistically significant correlation was found between degree of pneumatization and presence of peritubal cells associated with ET. This study showed that CBCT can be effectively used for imaging temporal bone air cavities and for volumetric assessment. Copyright © 2014 Elsevier Inc. All rights reserved.

  11. Participant intimacy: A cluster analysis of the intranuclear cascade

    International Nuclear Information System (INIS)

    Cugnon, J.; Knoll, J.; Randrup, J.

    1981-01-01

    The intranuclear cascade for relativistic nuclear collisions is analyzed in terms of clusters consisting of groups of nucleons which are dynamically linked to each other by violent interactions. The formation cross sections for the different cluster types as well as their intrinsic dynamics are studied and compared with the predictions of the linear cascade model (rows-on-rows). (orig.)

  12. Participant intimacy: A cluster analysis of the intranuclear cascade

    Science.gov (United States)

    Cugnon, J.; Knoll, J.; Randrup, J.

    1981-05-01

    The intranuclear cascade for relativistic nuclear collisions is analyzed in terms of "clusters" consisting of groups of nucleons which are dynamically linked to each other by violent interactions. The formation cross sections for the different cluster types as well as their intrinsic dynamics are studied and compared with the predictions of the linear cascade model ("rows-on-rows").

  13. An evaluation of centrality measures used in cluster analysis

    Science.gov (United States)

    Engström, Christopher; Silvestrov, Sergei

    2014-12-01

    Clustering of data into groups of similar objects plays an important part when analysing many types of data, especially when the datasets are large, as they often are in, for example, bioinformatics, social networks and computational linguistics. Many clustering algorithms such as K-means and some types of hierarchical clustering need a number of centroids representing the 'center' of the clusters. The choice of centroids for the initial clusters often plays an important role in the quality of the clusters. Since a data point with a high centrality supposedly lies close to the 'center' of some cluster, this can be used to assign centroids rather than through some other method such as picking them at random. Some work has been done to evaluate the use of centrality measures such as degree, betweenness and eigenvector centrality in clustering algorithms. The aim of this article is to compare and evaluate the usefulness of a number of common centrality measures such as those mentioned above and others such as PageRank and related measures.
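
    The idea of using centrality to pick initial centroids can be sketched with the simplest measure, degree centrality. The code below is an illustration under assumptions, not the procedure evaluated in the article: the neighbourhood graph (points linked when they lie within a fixed radius), the greedy rule that skips neighbours of already chosen seeds, the function name and the toy points are all hypothetical.

```python
import math


def degree_centrality_seeds(points, k, radius):
    """Pick up to k initial centroids as high-degree points of a radius graph.

    Two points are linked when their Euclidean distance is <= radius.
    Candidates are visited in order of decreasing degree; a candidate is
    skipped if it is a neighbour of a seed that was already chosen, so the
    seeds do not all come from the same dense region.
    """
    n = len(points)
    neighbours = [
        {
            j
            for j in range(n)
            if j != i and math.dist(points[i], points[j]) <= radius
        }
        for i in range(n)
    ]
    # Highest degree first; ties broken by original index for determinism.
    order = sorted(range(n), key=lambda i: (-len(neighbours[i]), i))
    seeds, blocked = [], set()
    for i in order:
        if i in blocked:
            continue
        seeds.append(points[i])
        blocked.add(i)
        blocked |= neighbours[i]
        if len(seeds) == k:
            break
    return seeds


# Invented toy data: two small groups centred on (0, 0) and (10, 10).
toy_points = [
    (0, 0), (1, 0), (-1, 0), (0, 1), (0, -1),
    (10, 10), (11, 10), (9, 10), (10, 11),
]

if __name__ == "__main__":
    print(degree_centrality_seeds(toy_points, k=2, radius=1.0))
```

    Betweenness, eigenvector centrality or PageRank could be substituted for the degree count by changing only the sort key.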

  14. Cluster analysis of HZE particle tracks as applied to space radiobiology problems

    International Nuclear Information System (INIS)

    Batmunkh, M.; Bayarchimeg, L.; Lkhagva, O.; Belov, O.

    2013-01-01

    A cluster analysis is performed of ionizations in tracks produced by the most abundant nuclei in the charge and energy spectra of the galactic cosmic rays. The frequency distribution of clusters is estimated for cluster sizes comparable to the DNA molecule at different packaging levels. For this purpose, an improved K-mean-based algorithm is suggested. This technique allows processing particle tracks containing a large number of ionization events without setting the number of clusters as an input parameter. Using this method, the ionization distribution pattern is analyzed depending on the cluster size and particle's linear energy transfer

  15. Application of cluster analysis and unsupervised learning to multivariate tissue characterization

    International Nuclear Information System (INIS)

    Momenan, R.; Insana, M.F.; Wagner, R.F.; Garra, B.S.; Loew, M.H.

    1987-01-01

    This paper describes a procedure for classifying tissue types from unlabeled acoustic measurements (data type unknown) using unsupervised cluster analysis. These techniques are being applied to unsupervised ultrasonic image segmentation and tissue characterization. The performance of a new clustering technique is measured and compared with supervised methods, such as a linear Bayes classifier. In these comparisons two objectives are sought: a) How well does the clustering method group the data?; b) Do the clusters correspond to known tissue classes? The first question is investigated by a measure of cluster similarity and dispersion. The second question involves a comparison with a supervised technique using labeled data

  16. Spatio-temporal analysis of the national parks in Nigeria using ...

    African Journals Online (AJOL)

    Spatio-temporal analysis of the national parks in Nigeria using geographic information system. S O Mohammed, E N Gajere, E O Eguaroje, H Shaba, J O Ogbole, Y S Mangut, N D Onyeuwaoma, I S Kolawole ...

  17. Spatio-temporal analysis of sub-hourly rainfall over Mumbai, India: Is ...

    Indian Academy of Sciences (India)

    temporal analysis of sub-hourly rainfall over Mumbai, India: Is statistical forecasting futile? Jitendra Singh Sheeba Sekharan Subhankar Karmakar Subimal Ghosh P E Zope T I Eldho. Volume 126 Issue 3 April 2017 Article ID 38 ...

  18. A comparison of heuristic and model-based clustering methods for dietary pattern analysis.

    Science.gov (United States)

    Greve, Benjamin; Pigeot, Iris; Huybrechts, Inge; Pala, Valeria; Börnhorst, Claudia

    2016-02-01

    Cluster analysis is widely applied to identify dietary patterns. A new method based on Gaussian mixture models (GMM) seems to be more flexible compared with the commonly applied k-means and Ward's method. In the present paper, these clustering approaches are compared to find the most appropriate one for clustering dietary data. The clustering methods were applied to simulated data sets with different cluster structures to compare their performance knowing the true cluster membership of observations. Furthermore, the three methods were applied to FFQ data assessed in 1791 children participating in the IDEFICS (Identification and Prevention of Dietary- and Lifestyle-Induced Health Effects in Children and Infants) Study to explore their performance in practice. The GMM outperformed the other methods in the simulation study in 72 % up to 100 % of cases, depending on the simulated cluster structure. Comparing the computationally less complex k-means and Ward's methods, the performance of k-means was better in 64-100 % of cases. Applied to real data, all methods identified three similar dietary patterns which may be roughly characterized as a 'non-processed' cluster with a high consumption of fruits, vegetables and wholemeal bread, a 'balanced' cluster with only slight preferences of single foods and a 'junk food' cluster. The simulation study suggests that clustering via GMM should be preferred due to its higher flexibility regarding cluster volume, shape and orientation. The k-means seems to be a good alternative, being easier to use while giving similar results when applied to real data.

  19. Multidimensional cluster stability analysis from a Brazilian Bradyrhizobium sp. RFLP/PCR data set

    Science.gov (United States)

    Milagre, S. T.; Maciel, C. D.; Shinoda, A. A.; Hungria, M.; Almeida, J. R. B.

    2009-05-01

    The taxonomy of the N2-fixing bacteria belonging to the genus Bradyrhizobium is still poorly refined, mainly due to conflicting results obtained by the analysis of the phenotypic and genotypic properties. This paper presents an application of a method aiming at the identification of possible new clusters within a Brazilian collection of 119 Bradyrhizobium strains showing phenotypic characteristics of B. japonicum and B. elkanii. The stability was studied as a function of the number of restriction enzymes used in the RFLP-PCR analysis of three ribosomal regions with three restriction enzymes per region. The method proposed here uses clustering algorithms with distances calculated by average-linkage clustering. The stability analysis is performed by introducing perturbations through sub-sampling techniques. The method was effective in grouping the species B. japonicum and B. elkanii. Furthermore, two new clusters were clearly defined, indicating possible new species, as well as sub-clusters within each detected cluster.

  20. Spatio-temporal Analysis of African Swine Fever in Sardinia (2012-2014): Trends in Domestic Pigs and Wild Boar.

    Science.gov (United States)

    Iglesias, I; Rodríguez, A; Feliziani, F; Rolesu, S; de la Torre, A

    2017-04-01

    African swine fever (ASF) is a notifiable viral disease affecting domestic pigs and wild boars that has been endemic in Sardinia since 1978. Several risk factors complicate the control of ASF in Sardinia: generally poor level of biosecurity, traditional breeding practices, illegal behaviour in movements and feeding of pigs, and sporadic occurrence of long-term carriers. A previous study describes the disease in Sardinia during 1978-2013. The aim of this study was to gain more in-depth knowledge of the spatio-temporal pattern of ASF in Sardinia during 2012 to May 2014, comparing patterns of occurrence in domestic pigs and wild boar and identifying areas of local transmission. African swine fever notifications were studied considering seasonality, spatial autocorrelation, spatial point pattern and spatio-temporal clusters. Results showed differences in the temporal and spatial patterns of wild boar and domestic pig notifications. The peak in wild boar notifications (October 2013 to February 2014) occurred six months later than in domestic pigs (May to early summer 2013). Notifications of cases in both host species tended to be clustered, with a maximum significant distance of spatial association of 15 and 25 km in domestic pigs and wild boars, respectively. Five clusters for local ASF transmission were identified for domestic pigs, with a mean radius and duration of 4 km (3-9 km) and 38 days (6-55 days), respectively. No wild boar clusters were found. The apparently secondary role of wild boar in ASF spread in Sardinia could be explained by certain socio-economic factors (illegal free-range pig breeding or the mingling of herds). The lack of effectiveness of previous surveillance and control programmes reveals the necessity of employing a new approach. The results presented here provide better knowledge of the dynamics of ASF in Sardinia, which could be used in a more comprehensive risk analysis necessary to introduce a new approach in the eradication strategy. © 2015

  1. Performance Analysis of Cluster Formation in Wireless Sensor Networks

    Directory of Open Access Journals (Sweden)

    Edgar Romo Montiel

    2017-12-01

    Cluster-based wireless sensor networks have been extensively used in the literature in order to achieve considerable energy consumption reductions. However, two aspects of such systems have been largely overlooked, namely the transmission probability used during the cluster formation phase and the way in which cluster heads are selected. Both of these issues have an important impact on the performance of the system. For the former, it is common to consider that sensor nodes in a cluster-based Wireless Sensor Network (WSN) use a fixed transmission probability to send control data in order to build the clusters. However, due to the highly variable conditions experienced by these networks, a fixed transmission probability may lead to extra energy consumption. In view of this, three different transmission probability strategies are studied: optimal, fixed and adaptive. In this context, we also investigate cluster head selection schemes; specifically, we consider two intelligent schemes based on the fuzzy C-means and k-medoids algorithms and a random selection with no intelligence. We show that the use of intelligent schemes greatly improves the performance of the system, but their use entails higher complexity and selection delay. The main performance metrics considered in this work are energy consumption, successful transmission probability and cluster formation latency. As an additional feature of this work, we study the effect of errors in the wireless channel and the impact on the performance of the system under the different transmission probability schemes.

  2. Performance Analysis of Cluster Formation in Wireless Sensor Networks.

    Science.gov (United States)

    Montiel, Edgar Romo; Rivero-Angeles, Mario E; Rubino, Gerardo; Molina-Lozano, Heron; Menchaca-Mendez, Rolando; Menchaca-Mendez, Ricardo

    2017-12-13

    Cluster-based wireless sensor networks have been extensively used in the literature in order to achieve considerable energy consumption reductions. However, two aspects of such systems have been largely overlooked, namely the transmission probability used during the cluster formation phase and the way in which cluster heads are selected. Both of these issues have an important impact on the performance of the system. For the former, it is common to consider that sensor nodes in a cluster-based Wireless Sensor Network (WSN) use a fixed transmission probability to send control data in order to build the clusters. However, due to the highly variable conditions experienced by these networks, a fixed transmission probability may lead to extra energy consumption. In view of this, three different transmission probability strategies are studied: optimal, fixed and adaptive. In this context, we also investigate cluster head selection schemes; specifically, we consider two intelligent schemes based on the fuzzy C-means and k-medoids algorithms and a random selection with no intelligence. We show that the use of intelligent schemes greatly improves the performance of the system, but their use entails higher complexity and selection delay. The main performance metrics considered in this work are energy consumption, successful transmission probability and cluster formation latency. As an additional feature of this work, we study the effect of errors in the wireless channel and the impact on the performance of the system under the different transmission probability schemes.
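
    As a rough illustration of why a fixed transmission probability can be wasteful, the sketch below assumes a textbook slotted-contention model that the abstract itself does not specify: each of `n` contending nodes transmits independently with probability `p` in a slot, and the slot succeeds only when exactly one node transmits. The function names and the model are assumptions made for illustration, not the authors' formulation.

```python
def success_probability(n: int, p: float) -> float:
    """Probability that exactly one of n nodes transmits in a slot.

    Assumed model: independent transmissions, each with probability p.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    return n * p * (1.0 - p) ** (n - 1)


def optimal_probability(n: int) -> float:
    """Value of p that maximises success_probability under the assumed model."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1.0 / n


if __name__ == "__main__":
    # A probability tuned for 4 nodes performs poorly once 20 nodes contend.
    fixed_p = optimal_probability(4)
    for nodes in (4, 20):
        print(nodes,
              success_probability(nodes, fixed_p),
              success_probability(nodes, optimal_probability(nodes)))
```

    Under this assumed model, a probability chosen for one network size gives more collisions as the number of contending nodes grows, which is one way to read the abstract's motivation for comparing fixed and adaptive strategies.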

  3. Tracking Undergraduate Student Achievement in a First-Year Physiology Course Using a Cluster Analysis Approach

    Science.gov (United States)

    Brown, S. J.; White, S.; Power, N.

    2015-01-01

    A cluster analysis data classification technique was used on assessment scores from 157 undergraduate nursing students who passed 2 successive compulsory courses in human anatomy and physiology. Student scores in five summative assessment tasks, taken in each of the courses, were used as inputs for a cluster analysis procedure. We aimed to group…

  4. Cluster Computing For Real Time Seismic Array Analysis.

    Science.gov (United States)

    Martini, M.; Giudicepietro, F.

    A seismic array is an instrument composed of a dense distribution of seismic sensors that allows measurement of the directional properties of the wavefield (slowness or wavenumber vector) radiated by a seismic source. Over recent years arrays have been widely used in different fields of seismological research. In particular, they are applied in the investigation of seismic sources on volcanoes, where they can be successfully used for studying volcanic microtremor and long-period events, which are critical for obtaining information on the evolution of volcanic systems. For this reason arrays could be usefully employed for volcano monitoring; however, the huge amount of data produced by this type of instrument and the quite time-consuming processing techniques have limited their potential for this application. In order to favor a direct application of array techniques to continuous volcano monitoring, we designed and built a small PC cluster able to compute in near real time the kinematic properties of the wavefield (slowness or wavenumber vector) produced by local seismic sources. The cluster is composed of 8 dual-processor Intel Pentium-III PCs working at 550 MHz, and has 4 Gigabytes of RAM. It runs under the Linux operating system. The developed analysis software package is based on the MUltiple SIgnal Classification (MUSIC) algorithm and is written in Fortran. The message-passing part is based upon the LAM programming environment package, an open-source implementation of the Message Passing Interface (MPI). The developed software system includes modules devoted to receiving data via the internet and graphical applications for the continuous display of the processing results. The system has been tested with a data set collected during a seismic experiment conducted on Etna in 1999, when two dense seismic arrays were deployed on the northeast and the southeast flanks of the volcano. A real time continuous acquisition system has been simulated by

  5. Phenotypic clustering: a novel method for microglial morphology analysis.

    Science.gov (United States)

    Verdonk, Franck; Roux, Pascal; Flamant, Patricia; Fiette, Laurence; Bozza, Fernando A; Simard, Sébastien; Lemaire, Marc; Plaud, Benoit; Shorte, Spencer L; Sharshar, Tarek; Chrétien, Fabrice; Danckaert, Anne

    2016-06-17

    Microglial cells are tissue-resident macrophages of the central nervous system. They are extremely dynamic, sensitive to their microenvironment and present a characteristic complex and heterogeneous morphology and distribution within the brain tissue. Many experimental clues highlight a strong link between their morphology and their function in response to aggression. However, because of the complex "dendritic-like" aspect of the major pool of murine microglial cells and their dense network, precise and powerful morphological studies are not easy to carry out, which complicates correlation with molecular or clinical parameters. Using the knock-in mouse model CX3CR1(GFP/+), we developed a 3D automated confocal tissue imaging system coupled with morphological modelling of many thousands of microglial cells, providing precise and quantitative assessment of major cell features: cell density, cell body area, cytoplasm area and number of primary, secondary and tertiary processes. We defined two morphological criteria, the complexity index (CI) and the covered environment area (CEA), which allow an innovative approach consisting of (i) an accurate and objective study of morphological changes in healthy or pathological conditions, (ii) an in situ mapping of the microglial distribution in different neuroanatomical regions and (iii) a study of the clustering of numerous cells, allowing us to discriminate different sub-populations. Our results on more than 20,000 cells per condition confirm at baseline a regional heterogeneity of the microglial distribution and phenotype that persists after induction of neuroinflammation by systemic injection of lipopolysaccharide (LPS). Using clustering analysis, we highlight that, at resting state, microglial cells are distributed in four microglial sub-populations defined by their CI and CEA with a regional pattern and a specific behaviour after challenge. Our results counteract the classical view of a homogenous regional resting

  6. Global classification of human facial healthy skin using PLS discriminant analysis and clustering analysis.

    Science.gov (United States)

    Guinot, C; Latreille, J; Tenenhaus, M; Malvy, D J

    2001-04-01

    Today's classifications of healthy skin are predominantly based on a very limited number of skin characteristics, such as skin oiliness or susceptibility to sun exposure. The aim of the present analysis was to set up a global classification of healthy facial skin, using mathematical models. This classification is based on clinical, biophysical skin characteristics and self-reported information related to the skin, as well as the results of a theoretical skin classification assessed separately for the frontal and the malar zones of the face. In order to maximize the predictive power of the models with a minimum of variables, the Partial Least Squares (PLS) discriminant analysis method was used. The resulting PLS components were subjected to clustering analyses to identify the plausible number of clusters and to group the individuals according to their proximities. Using this approach, four PLS components could be constructed and six clusters were found relevant. Thus, from the 36 hypothetical combinations of the theoretical skin type classification, we arrived at a more robust six-class proposal. Our data suggest that the association of the PLS discriminant analysis and the clustering methods leads to a valid and simple way to classify healthy human skin and represents a potentially useful tool for cosmetic and dermatological research.

  7. Clustering analysis of malware behavior using Self Organizing Map

    DEFF Research Database (Denmark)

    Pirscoveanu, Radu-Stefan; Stevanovic, Matija; Pedersen, Jens Myrup

    2016-01-01

    For the time being, malware behavioral classification is performed by means of Anti-Virus (AV) generated labels. The paper investigates the inconsistencies associated with current practices by evaluating the identified differences between current vendors. In this paper we rely on Self Organizing...... Map, an unsupervised machine learning algorithm, for generating clusters that capture the similarities between malware behavior. A data set of approximately 270,000 samples was used to generate the behavioral profile of malicious types in order to compare the outcome of the proposed clustering...... accurate results based on the clusters created by competitive and cooperative algorithms like Self Organizing Map that better describe the behavioral profile of malware....

  8. Automated classification of mouse pup isolation syllables: from cluster analysis to an Excel based ‘mouse pup syllable classification calculator’

    Directory of Open Access Journals (Sweden)

    Jasmine Grimsley

    2013-01-01

    Mouse pups vocalize at high rates when they are cold or isolated from the nest. The proportions of each syllable type produced carry information about disease state and are being used as behavioral markers for the internal state of animals. Manual classifications of these vocalizations identified ten syllable types based on their spectro-temporal features. However, manual classification of mouse syllables is time consuming and vulnerable to experimenter bias. This study uses an automated cluster analysis to identify acoustically distinct syllable types produced by CBA/CaJ mouse pups, and then compares the results to prior manual classification methods. The cluster analysis identified two syllable types, based on their frequency bands, that have continuous frequency-time structure, and two syllable types featuring abrupt frequency transitions. Although cluster analysis computed fewer syllable types than manual classification, the clusters represented well the probability distributions of the acoustic features within syllables. These probability distributions indicate that some of the manually classified syllable types are not statistically distinct. The characteristics of the four classified clusters were used to generate a Microsoft Excel-based mouse syllable classifier that rapidly categorizes syllables, with a match of over 90%, into the syllable types determined by cluster analysis.

  9. Application and research of fuzzy clustering analysis algorithm under “micro-lecture” English teaching mode

    Directory of Open Access Journals (Sweden)

    Shi Ying

    2016-01-01

    The fuzzy clustering algorithm classifies data or indicators with a high degree of similarity, on the principle that individuals of the same type are more similar to one another while individuals of different types differ. It establishes clear category boundaries, forms relationship clusters of any shape during the solving process, and accepts the research indicators as input in random order, so that the significance of the indicators in the algorithm can be analyzed accurately. The evaluation value of the clustering analysis can be obtained by establishing a fuzzy factor set based on membership analysis, and the evaluation result can be analyzed with reference to the evaluation indicators of the fuzzy clustering analysis. With the fuzzy clustering analysis algorithm, the “micro-lecture” English teaching mode can be evaluated and the analysis indicators can be established rationally, and the algorithm shows good applicability.

  10. Real-Time EEG Signal Enhancement Using Canonical Correlation Analysis and Gaussian Mixture Clustering

    Directory of Open Access Journals (Sweden)

    Chin-Teng Lin

    2018-01-01

    Electroencephalogram (EEG) signals are usually contaminated with various artifacts of noncerebral origin, such as signals associated with muscle activity, eye movement, and body motion. The amplitude of such artifacts is larger than that of the electrical activity of the brain, so they mask the cortical signals of interest, resulting in biased analysis and interpretation. Several blind source separation methods have been developed to remove artifacts from EEG recordings. However, the iterative process for measuring separation within multichannel recordings is computationally intractable. Moreover, manually excluding the artifact components requires a time-consuming offline process. This work proposes a real-time artifact removal algorithm that is based on canonical correlation analysis (CCA), feature extraction, and the Gaussian mixture model (GMM) to improve the quality of EEG signals. CCA was used to decompose EEG signals into components, followed by feature extraction to obtain representative features and a GMM to cluster these features into groups in order to recognize and remove artifacts. The feasibility of the proposed algorithm was demonstrated by effectively removing artifacts caused by blinks, head/body movement, and chewing from EEG recordings while preserving the temporal and spectral characteristics of the signals that are important to cognitive research.

  11. Spatial-Temporal Analysis on Spring Festival Travel Rush in China Based on Multisource Big Data

    Directory of Open Access Journals (Sweden)

    Jiwei Li

    2016-11-01

    The Spring Festival travel rush is a phenomenon in China in which population travel surges intensively over a short period around the Chinese Spring Festival. This phenomenon, which is particular to the urbanization process of China, brings a large traffic burden and various kinds of social problems, thereby causing widespread public concern. This study investigates the spatial-temporal characteristics of the Spring Festival travel rush in 2015 through time series analysis and complex network analysis based on multisource big travel data derived from Baidu, Tencent, and Qihoo. The main results are as follows: First, the big travel data of Baidu and Tencent obtained from location-based services might be more accurate and scientific than those of Qihoo. Second, two travel peaks appeared at five days before and six days after the Spring Festival, respectively, and the travel valley appeared on the Spring Festival. The Spring Festival travel network at the provincial scale did not have small-world and scale-free characteristics. Instead, the travel network showed a multicenter characteristic and a significant geographic clustering characteristic. Moreover, some travel path chains played a leading role in the network. Third, economic and social factors had more influence on the travel network than geographical location factors. The problem of the Spring Festival travel rush will not be effectively alleviated in a short time because of the unbalanced urban-rural development and the unbalanced regional development. However, the development of the modern high-speed transport system and modern information and communication technology can alleviate the problems brought by the Spring Festival travel rush. We suggest that a unified real-time traffic platform for the Spring Festival travel rush should be established through the government's integration of mobile big data and the official authority data of the transportation department.

  12. Principal component analysis based unsupervised feature extraction applied to budding yeast temporally periodic gene expression.

    Science.gov (United States)

    Taguchi, Y-H

    2016-01-01

    The recently proposed principal component analysis (PCA) based unsupervised feature extraction (FE) has successfully been applied to various bioinformatics problems ranging from biomarker identification to the screening of disease causing genes using gene expression/epigenetic profiles. However, the conditions required for its successful use and the mechanisms by which it outperforms other supervised methods are unknown, because PCA based unsupervised FE has only been applied to challenging (i.e. not well known) problems. In this study, PCA based unsupervised FE was applied to an extensively studied organism, i.e., budding yeast. When applied to two gene expression profiles expected to be temporally periodic, yeast metabolic cycle (YMC) and yeast cell division cycle (YCDC), PCA based unsupervised FE outperformed simple but powerful conventional methods, such as sinusoidal fitting, in several respects: (i) feasible biological term enrichment without assuming periodicity for YMC; (ii) identification of periodic profiles whose period was half as long as the cell division cycle for YMC; and (iii) the identification of no more than 37 genes associated with the enrichment of biological terms related to cell division cycle for the integrated analysis of seven YCDC profiles, for which sinusoidal fittings failed. The explanation for the differences between the methods used and the necessary conditions required were determined by comparing PCA based unsupervised FE with fittings to various periodic (artificial, thus pre-defined) profiles. Furthermore, four popular unsupervised clustering algorithms applied to YMC were not as successful as PCA based unsupervised FE. PCA based unsupervised FE is a useful and effective unsupervised method to investigate YMC and YCDC. This study identified why the unsupervised method without pre-judged criteria outperformed supervised methods requiring human defined criteria.

  13. Visual cluster analysis and pattern recognition template and methods

    Science.gov (United States)

    Osbourn, Gordon Cecil; Martinez, Rubel Francisco

    1999-01-01

    A method of clustering using a novel template to define a region of influence. Using neighboring approximation methods, computation times can be significantly reduced. The template and method are applicable to, and improve, pattern recognition techniques.

  14. Improving hierarchical clustering of genotypic data via principal component analysis

    NARCIS (Netherlands)

    Odong, T.L.; Heerwaarden, van J.; Hintum, van T.J.L.; Eeuwijk, van F.A.; Jansen, J.

    2013-01-01

    Understanding the genetic structure of germplasm collections is a prerequisite for effective and efficient use of crop genetic resources in genebanks. Currently, hierarchical clustering techniques are most popular for describing genetic structure in germplasm collections. Traditionally performed

  15. Visual cluster analysis and pattern recognition template and methods

    Energy Technology Data Exchange (ETDEWEB)

    Osbourn, G.C.; Martinez, R.F.

    1993-12-31

    This invention comprises a method of clustering using a novel template to define a region of influence. Using neighboring approximation methods, computation times can be significantly reduced. The template and method are applicable to, and improve, pattern recognition techniques.

  16. Cluster decay analysis and related structure effects of fissionable ...

    Indian Academy of Sciences (India)

    2015-08-01

    Aug 1, 2015 ... Keywords. Collective clusterization; deformations and orientations; fission; heavy and superheavy nuclei. ... Author Affiliations. Manoj K Sharma1 Gurvinder Kaur1. School of Physics and Materials Science, Thapar University, Patiala 147 004, India ...

  17. The quantitative analysis of silicon carbide surface smoothing by Ar and Xe cluster ions

    Science.gov (United States)

    Ieshkin, A. E.; Kireev, D. S.; Ermakov, Yu. A.; Trifonov, A. S.; Presnov, D. E.; Garshev, A. V.; Anufriev, Yu. V.; Prokhorova, I. G.; Krupenin, V. A.; Chernysh, V. S.

    2018-04-01

    The gas cluster ion beam (GCIB) technique was used to smooth the surface of silicon carbide crystals. The effect of processing by cluster ions of two inert gases, argon and xenon, was quantitatively compared. While argon is a standard element for GCIB, results for xenon clusters have not yet been reported. Scanning probe microscopy and high resolution transmission electron microscopy techniques were used for the analysis of the surface roughness and the quality of the surface crystal layer. Gas cluster ion beam processing results in surface relief smoothing down to an average roughness of about 1 nm for both elements. It was shown that xenon is more effective as the working gas: the sputtering rate for xenon clusters is 2.5 times higher than for argon at the same beam energy. High resolution transmission electron microscopy analysis of the surface defect layer gives values of 7 ± 2 nm and 8 ± 2 nm for treatment with argon and xenon clusters, respectively.

  18. Elemental analysis using temporal gating of a pulsed neutron generator

    Science.gov (United States)

    Mitra, Sudeep

    2018-02-20

    Technologies related to determining elemental composition of a sample that comprises fissile material are described herein. In a general embodiment, a pulsed neutron generator periodically emits bursts of neutrons, and is synchronized with an analyzer circuit. The bursts of neutrons are used to interrogate the sample, and the sample outputs gamma rays based upon the neutrons impacting the sample. A detector outputs pulses based upon the gamma rays impinging upon the material of the detector, and the analyzer circuit assigns the pulses to temporally-based bins based upon the analyzer circuit being synchronized with the pulsed neutron generator. A computing device outputs data that is indicative of elemental composition of the sample based upon the binned pulses.
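
    The temporal binning step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented analyzer circuit: pulse arrival times are assumed to be integer microseconds measured from a burst, the generator is assumed to fire at a fixed period, and the name `bin_pulses` and its parameters are hypothetical.

```python
from collections import Counter


def bin_pulses(pulse_times_us, period_us, bin_width_us):
    """Count detector pulses per time bin, measured from the latest burst.

    pulse_times_us: integer arrival times in microseconds (assumed input).
    period_us:      assumed fixed interval between neutron bursts.
    bin_width_us:   width of each temporal bin within one period.
    """
    if period_us <= 0 or bin_width_us <= 0:
        raise ValueError("period and bin width must be positive")
    counts = Counter()
    for t in pulse_times_us:
        phase = t % period_us                  # time since the most recent burst
        counts[phase // bin_width_us] += 1     # index of the bin this pulse falls in
    n_bins = -(-period_us // bin_width_us)     # ceiling division
    return [counts.get(i, 0) for i in range(n_bins)]


if __name__ == "__main__":
    # Pulses from several burst periods are folded into one set of bins.
    print(bin_pulses([5, 105, 130, 260, 399], period_us=100, bin_width_us=25))
```

    Folding every pulse onto its phase within the burst period is what lets early bins and late bins be compared across many bursts; which gamma-ray processes dominate each bin is a matter for the source document, not this sketch.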

  19. A spatio-temporal analysis of fires in South Africa

    Directory of Open Access Journals (Sweden)

    Sheldon Strydom

    2016-11-01

    The prevalence and history of fires in Africa have led to the continent being named "the fire continent". Fires are common on the continent and lead to a high number of annual fire disasters which result in many human fatalities and considerable financial loss. Increased population growth and concentrated settlement planning increase the probability of fire disasters and the associated loss of human life and financial loss when disasters occur. In order to better understand the spatial and temporal variations and characteristics of fires in South Africa, an 11-year data set of MODIS-derived Active Fire Hotspots was analysed using an open source geographic information system. The study included the mapping of national fire frequency over the 11-year period. Results indicate that the highest fire frequency occurred in the northeastern regions of South Africa, in particular the mountainous regions of KwaZulu-Natal and Mpumalanga, and in the Western Cape. Increasing trends in provincial fire frequency were observed in eight of the nine provinces of South Africa, with Mpumalanga the only province for which a decrease in annual fire frequency was observed. Temporally, fires were observed in all months for all provinces, although distinct fire seasons were observed and were largely driven by rainfall seasons. The southwestern regions of South Africa (winter-rainfall regions) experienced higher fire frequencies during the summer months and the rest of the country (summer-rainfall regions) during the winter months. Certain regions, namely those which experienced bimodal rainfall seasons, did not display distinct fire seasons because of the complex wet and dry seasons. Investigation into the likely effects of climate change on South African fire frequency revealed that increased air temperatures and events such as La Niña have a marked effect on fire activity.

  20. Cluster analysis of fruit and vegetable-related perceptions: an alternative approach of consumer segmentation.

    Science.gov (United States)

    Simunaniemi, A-M; Nydahl, M; Andersson, A

    2013-02-01

    Audience segmentation optimises health communication aimed to promote healthy dietary habits, such as fruit and vegetable (F&V) consumption. The present study aimed to segment respondents into clusters based on F&V-related perceptions, and to describe these clusters with respect to F&V consumption and sex. The cross-sectional study was conducted using a semi-structured questionnaire. The respondents were randomly selected among Swedish adults (n = 1304; response rate 51%; 56% women). A two-step cluster analysis was conducted followed by a binary logistic regression with cluster membership as a dependent variable. The clusters were compared using t-tests and chi-squared tests. P vegetables (both sexes) and fruit (women only), whereas men in the Indifferent cluster (n = 715) consumed more juice. Indifferent cluster reported more F&V consumption preventing factors, such as storage and preparation difficulties and low satisfaction with F&V selection and price. Not liking or not having a habit of F&V consumption, laziness, forgetting and a lack of time were mentioned as main barriers to F&V consumption. The Indifferent cluster reports more practical and life-style related difficulties. The Positive cluster consumes more vegetables, perceives fewer F&V-related difficulties, and looks for more dietary information. The findings confirm that cluster analysis is an appropriate way of identifying consumer subgroups for targeted health and nutrition communication. © 2012 The Authors. Journal of Human Nutrition and Dietetics © 2012 The British Dietetic Association Ltd.

  1. Global myeloma research clusters, output, and citations: a bibliometric mapping and clustering analysis.

    Directory of Open Access Journals (Sweden)

    Jens Peter Andersen

    International collaborative research is a mechanism for improving the development of disease-specific therapies and for improving health at the population level. However, limited data are available to assess the trends in research output related to orphan diseases. We used bibliometric mapping and clustering methods to illustrate the level of fragmentation in myeloma research and the development of collaborative efforts. Publication data from Thomson Reuters Web of Science were retrieved for 2005-2009 and followed until 2013. We created a database of multiple myeloma publications, and we analysed impact and co-authorship density to identify scientific collaborations, developments, and international key players over time. The global annual publication volume for studies on multiple myeloma increased from 1,144 in 2005 to 1,628 in 2009, which represents a 43% increase. This increase is high compared to the 24% and 14% increases observed for lymphoma and leukaemia. The major proportion (>90%) of publications was from the US and EU over the study period. The output and impact in terms of citations identified several successful groups with a large number of intra-cluster collaborations in the US and EU. The US-based myeloma clusters clearly stand out as the most productive and highly cited, and the European Myeloma Network members exhibited a doubling of collaborative publications from 2005 to 2009, still increasing up to 2013. Multiple myeloma research output has increased substantially in the past decade. The fragmented European myeloma research activities based on national or regional groups are progressing, but they require a broad range of targeted research investments to improve multiple myeloma health care.

  2. Topic modeling for cluster analysis of large biological and medical datasets.

    Science.gov (United States)

    Zhao, Weizhong; Zou, Wen; Chen, James J

    2014-01-01

    The big data moniker is nowhere better deserved than to describe the ever-increasing prodigiousness and complexity of biological and medical datasets. New methods are needed to generate and test hypotheses, foster biological interpretation, and build validated predictors. Although multivariate techniques such as cluster analysis may allow researchers to identify groups, or clusters, of related variables, the accuracy and effectiveness of traditional clustering methods diminish for large and high-dimensional datasets. Topic modeling is an active research field in machine learning and has been mainly used as an analytical tool to structure large textual corpora for data mining. Its ability to reduce high dimensionality to a small number of latent variables makes it suitable as a means for clustering or overcoming clustering difficulties in large biological and medical datasets. In this study, three topic model-derived clustering methods, highest probable topic assignment, feature selection and feature extraction, are proposed and tested on the cluster analysis of three large datasets: Salmonella pulsed-field gel electrophoresis (PFGE) dataset, lung cancer dataset, and breast cancer dataset, which represent various types of large biological or medical datasets. All three methods are shown to improve the efficacy/effectiveness of clustering results on the three datasets in comparison to traditional methods. A preferable cluster analysis method emerged for each of the three datasets on the basis of replicating known biological truths. Topic modeling could be advantageously applied to the large datasets of biological or medical research. The three proposed topic model-derived clustering methods, highest probable topic assignment, feature selection and feature extraction, yield clustering improvements for the three different data types. Clusters more efficaciously represent truthful groupings and subgroupings in the data than traditional methods, suggesting

  3. Clinical Characteristics of Exacerbation-Prone Adult Asthmatics Identified by Cluster Analysis.

    Science.gov (United States)

    Kim, Mi Ae; Shin, Seung Woo; Park, Jong Sook; Uh, Soo Taek; Chang, Hun Soo; Bae, Da Jeong; Cho, You Sook; Park, Hae Sim; Yoon, Ho Joo; Choi, Byoung Whui; Kim, Yong Hoon; Park, Choon Sik

    2017-11-01

    Asthma is a heterogeneous disease characterized by various types of airway inflammation and obstruction. Therefore, it is classified into several subphenotypes, such as early-onset atopic, obese non-eosinophilic, benign, and eosinophilic asthma, using cluster analysis. A number of asthmatics frequently experience exacerbation over a long-term follow-up period, but the exacerbation-prone subphenotype has rarely been evaluated by cluster analysis. This prompted us to identify clusters reflecting asthma exacerbation. A uniform cluster analysis method was applied to 259 adult asthmatics who were regularly followed-up for over 1 year using 12 variables, selected on the basis of their contribution to asthma phenotypes. After clustering, clinical profiles and exacerbation rates during follow-up were compared among the clusters. Four subphenotypes were identified: cluster 1 was comprised of patients with early-onset atopic asthma with preserved lung function, cluster 2 late-onset non-atopic asthma with impaired lung function, cluster 3 early-onset atopic asthma with severely impaired lung function, and cluster 4 late-onset non-atopic asthma with well-preserved lung function. The patients in clusters 2 and 3 were identified as exacerbation-prone asthmatics, showing a higher risk of asthma exacerbation. Two different phenotypes of exacerbation-prone asthma were identified among Korean asthmatics using cluster analysis; both were characterized by impaired lung function, but the age at asthma onset and atopic status were different between the two. Copyright © 2017 The Korean Academy of Asthma, Allergy and Clinical Immunology · The Korean Academy of Pediatric Allergy and Respiratory Disease

  4. Method for exploratory cluster analysis and visualisation of single-trial ERP ensembles.

    Science.gov (United States)

    Williams, N J; Nasuto, S J; Saddy, J D

    2015-07-30

    The validity of ensemble averaging on event-related potential (ERP) data has been questioned, due to its assumption that the ERP is identical across trials. Thus, there is a need for preliminary testing for cluster structure in the data. We propose a complete pipeline for the cluster analysis of ERP data. To increase the signal-to-noise ratio (SNR) of the raw single-trials, we used a denoising method based on Empirical Mode Decomposition (EMD). Next, we used a bootstrap-based method to determine the number of clusters, through a measure called the Stability Index (SI). We then used a clustering algorithm based on a Genetic Algorithm (GA) to define initial cluster centroids for subsequent k-means clustering. Finally, we visualised the clustering results through a scheme based on Principal Component Analysis (PCA). After validating the pipeline on simulated data, we tested it on data from two experiments: a P300 speller paradigm on a single subject and a language processing study on 25 subjects. Results revealed evidence for the existence of 6 clusters in one experimental condition from the language processing study. Further, a two-way chi-square test revealed an influence of subject on cluster membership. Our analysis operates on denoised single-trials, the number of clusters is determined in a principled manner and the results are presented through an intuitive visualisation. Given the cluster structure in some experimental conditions, we suggest application of cluster analysis as a preliminary step before ensemble averaging. Copyright © 2015 Elsevier B.V. All rights reserved.

  5. Methodology of comparative statistical analysis of Russian industry based on cluster analysis

    Directory of Open Access Journals (Sweden)

    Sergey S. Shishulin

    2017-01-01

    Full Text Available The article examines the possibilities of applying multidimensional statistical analysis to the study of industrial production by comparing its growth rates and structure with those of other developed and developing countries. The purpose of the article is to determine the optimal set of statistical methods, and the results of their application to industrial production data, that would give the best basis for analysis. The data include such indicators as output, gross value added, the number of employed, and other indicators of the system of national accounts and operational business statistics. The objects of observation are the industries of the countries of the Customs Union, the United States, Japan and Europe in 2005-2015. The research tools were the simplest methods of data transformation, graphical and tabular visualization of data, and methods of statistical analysis. In particular, using a specialized software package (SPSS), the principal components method, discriminant analysis, hierarchical methods of cluster analysis, Ward’s method and k-means were applied. Applying the method of principal components to the initial data makes it possible to substantially and effectively reduce the initial space of industrial production data. For example, in analyzing the structure of industrial production, the reduction was from fifteen industries to three basic, well-interpreted factors: the relatively extractive industries (with a low degree of processing), high-tech industries, and consumer goods (medium-technology sectors). At the same time, a comparison of the results of applying cluster analysis to the initial data and to the data obtained with the principal components method established that clustering industrial production data on the basis of the new factors significantly improves the results of clustering. As a result of analyzing the parameters of

  6. Periorbital melasma: Hierarchical cluster analysis of clinical features in Asian patients.

    Science.gov (United States)

    Jung, Y S; Bae, J M; Kim, B J; Kang, J-S; Cho, S B

    2017-11-01

    Studies have shown melasma lesions to be distributed across the face in centrofacial, malar, and mandibular patterns. Meanwhile, however, melasma lesions of the periorbital area have yet to be thoroughly described. We analyzed normal and ultraviolet light-exposed photographs of patients with melasma. The periorbital melasma lesions were measured according to anatomical reference points and a hierarchical cluster analysis was performed. The periorbital melasma lesions showed clinical features of fine and homogenous melasma pigmentation, involving both the upper and lower eyelids that extended to other anatomical sites with a darker and coarser appearance. The hierarchical cluster analysis indicated that patients with periorbital melasma can be categorized into two clusters according to the surface anatomy of the face. Significant differences between cluster 1 and cluster 2 were found in lateral distance and inferolateral distance, but not in medial distance and superior distance. Comparing the two clusters, patients in cluster 2 were found to be significantly older and more commonly accompanied by melasma lesions of the temple and medial cheek. Our hierarchical cluster analysis of periorbital melasma lesions demonstrated that Asian patients with periorbital melasma can be categorized into two clusters according to the surface anatomy of the face. © 2017 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  7. Cluster Analysis of Acute Care Use Yields Insights for Tailored Pediatric Asthma Interventions.

    Science.gov (United States)

    Abir, Mahshid; Truchil, Aaron; Wiest, Dawn; Nelson, Daniel B; Goldstick, Jason E; Koegel, Paul; Lozon, Marie M; Choi, Hwajung; Brenner, Jeffrey

    2017-09-01

    We undertake this study to understand patterns of pediatric asthma-related acute care use to inform interventions aimed at reducing potentially avoidable hospitalizations. Hospital claims data from 3 Camden city facilities for 2010 to 2014 were used to perform cluster analysis classifying patients aged 0 to 17 years according to their asthma-related hospital use. Clusters were based on 2 variables: asthma-related ED visits and hospitalizations. Demographics and a number of sociobehavioral and use characteristics were compared across clusters. Children who met the criteria (3,170) were included in the analysis. An examination of a scree plot showing the decline in within-cluster heterogeneity as the number of clusters increased confirmed that clusters of pediatric asthma patients according to hospital use exist in the data. Five clusters of patients with distinct asthma-related acute care use patterns were observed. Cluster 1 (62% of patients) showed the lowest rates of acute care use. These patients were least likely to have a mental health-related diagnosis, were less likely to have visited multiple facilities, and had no hospitalizations for asthma. Cluster 2 (19% of patients) had a low number of asthma ED visits and onetime hospitalization. Cluster 3 (11% of patients) had a high number of ED visits and low hospitalization rates, and the highest rates of multiple facility use. Cluster 4 (7% of patients) had moderate ED use for both asthma and other illnesses, and high rates of asthma hospitalizations; nearly one quarter received care at all facilities, and 1 in 10 had a mental health diagnosis. Cluster 5 (1% of patients) had extreme rates of acute care use. Differences observed between groups across multiple sociobehavioral factors suggest these clusters may represent children who differ along multiple dimensions, in addition to patterns of service use, with implications for tailored interventions. Copyright © 2017 American College of Emergency Physicians
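
    The scree-plot check described above can be sketched by running k-means for increasing numbers of clusters and recording the within-cluster sum of squares (WSS) each time. The example below is only an illustration under stated assumptions: the (ED visits, hospitalizations) pairs are invented, the evenly spaced deterministic initialisation is a simplification, and the study's actual clustering procedure is not specified in the abstract.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(pts):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(c) / len(pts) for c in zip(*pts))


def kmeans_wss(points, k, iters=50):
    """Run a basic k-means and return the within-cluster sum of squares."""
    # Assumption: deterministic start from k points spread through the sorted data.
    ordered = sorted(points)
    centroids = [ordered[(i * (len(ordered) - 1)) // max(k - 1, 1)] for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            clusters[nearest].append(p)
        # Keep the old centroid if a cluster ends up empty.
        updated = [mean_point(c) if c else centroids[i] for i, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    return sum(min(sq_dist(p, c) for c in centroids) for p in points)


# Hypothetical (ED visits, hospitalizations) counts for twelve patients.
visits = [(1, 0), (2, 0), (1, 0), (2, 1), (8, 0), (9, 1),
          (10, 0), (9, 0), (4, 4), (5, 5), (4, 5), (5, 4)]

wss = {k: kmeans_wss(visits, k) for k in range(1, 5)}
for k, value in wss.items():
    print(k, round(value, 2))
```

    On this toy data the WSS drops sharply up to three clusters and only slightly afterwards; that "elbow" is the kind of pattern a scree plot is inspected for.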

  8. Assessment of Random Assignment in Training and Test Sets using Generalized Cluster Analysis Technique

    Directory of Open Access Journals (Sweden)

    Sorana D. BOLBOACĂ

    2011-06-01

    Full Text Available Aim: The properness of random assignment of compounds in training and validation sets was assessed using the generalized cluster technique. Material and Method: A quantitative Structure-Activity Relationship model using Molecular Descriptors Family on Vertices was evaluated in terms of assignment of carboquinone derivatives in training and test sets during the leave-many-out analysis. Assignment of compounds was investigated using five variables: observed anticancer activity and four structure descriptors. Generalized cluster analysis with the K-means algorithm was applied to investigate whether the assignment of compounds was proper. The Euclidean distance and maximization of the initial distance, with a v-fold cross-validation of 10, were applied. Results: All five variables included in the analysis proved to make a statistically significant contribution to the identification of clusters. Three clusters were identified, each containing carboquinone derivatives belonging to both the training and the test sets. The observed activity of carboquinone derivatives proved to be normally distributed in every cluster. The presence of training and test sets in all clusters identified using generalized cluster analysis with the K-means algorithm and the distribution of observed activity within clusters support a proper assignment of compounds to training and test sets. Conclusion: Generalized cluster analysis using the K-means algorithm proved to be a valid method in assessment of random assignment of carboquinone derivatives in training and test sets.

  9. Analysis of Factors and Development Potential of Economic Clusters by Economic Activities in Mari El Republic

    Directory of Open Access Journals (Sweden)

    Viktor Aleksandrovich Golovin

    2017-12-01

    Full Text Available This article analyzes the factors that drive the development of economic clusters in Mari El Republic (Russia). This analysis made it possible to reveal the potential for those clusters' further development. I consider the shift-share method one of the major methods for identifying the factors that determine the expansion of economic clusters. The author proposes a modification of the shift-share method using relative performance indicators to evaluate the intensity and quality of clustering processes in the region. The article presents the results of empirical research on the economy of Mari El Republic by the shift-share method (2005–2015) in the context of economic activities according to the Federal State Statistics Service. After the analysis of three basic indicators, the leading and lagging economic activities were revealed for the period of 10 years. I paid special attention to the analysis of the clustering potential of the Mari El Republic in the context of economic activities based on the Clustering Potential Index. This analysis shows promising economic activities and industries that may form clusters. The author discusses the compliance and possible conflicts of the two methods used in the study. Further research in this field can focus on system analysis and on identifying specific companies and production chains that form the basis of clustering
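
    For orientation, the classic (unmodified) shift-share decomposition splits a regional industry's change into a national share, an industry mix, and a regional shift component. The sketch below shows that textbook form only; it is not the author's modified method with relative indicators, and all figures and names are hypothetical.

```python
def shift_share(region_start, region_end, nat_ind_start, nat_ind_end,
                nat_total_start, nat_total_end):
    """Classic shift-share decomposition for one industry in one region."""
    g_total = nat_total_end / nat_total_start - 1      # national growth, all industries
    g_industry = nat_ind_end / nat_ind_start - 1       # national growth of this industry
    g_region = region_end / region_start - 1           # regional growth of this industry
    national_share = region_start * g_total
    industry_mix = region_start * (g_industry - g_total)
    regional_shift = region_start * (g_region - g_industry)
    return national_share, industry_mix, regional_shift


# Hypothetical employment figures: a regional industry grows from 100 to 120,
# the same industry nationally from 1000 to 1100, the whole economy from 10000 to 10500.
ns, im, rs = shift_share(100, 120, 1000, 1100, 10000, 10500)
print(round(ns, 2), round(im, 2), round(rs, 2))
```

    The three components add up to the actual regional change (here 20), and a positive regional shift is the part of growth not explained by national or industry-wide trends.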

  10. Cluster analysis in severe emphysema subjects using phenotype and genotype data: an exploratory investigation

    Directory of Open Access Journals (Sweden)

    Martinez Fernando J

    2010-03-01

    Full Text Available Abstract Background: Numerous studies have demonstrated associations between genetic markers and COPD, but results have been inconsistent. One reason may be heterogeneity in disease definition. Unsupervised learning approaches may assist in understanding disease heterogeneity. Methods: We selected 31 phenotypic variables and 12 SNPs from five candidate genes in 308 subjects in the National Emphysema Treatment Trial (NETT) Genetics Ancillary Study cohort. We used factor analysis to select a subset of phenotypic variables, and then used cluster analysis to identify subtypes of severe emphysema. We examined the phenotypic and genotypic characteristics of each cluster. Results: We identified six factors accounting for 75% of the shared variability among our initial phenotypic variables. We selected four phenotypic variables from these factors for cluster analysis: (1) post-bronchodilator FEV1 percent predicted, (2) percent bronchodilator responsiveness, and quantitative CT measurements of (3) apical emphysema and (4) airway wall thickness. K-means cluster analysis revealed four clusters, though separation between clusters was modest: (1) emphysema predominant; (2) bronchodilator responsive, with higher FEV1; (3) discordant, with a lower FEV1 despite less severe emphysema and lower airway wall thickness; and (4) airway predominant. Of the genotypes examined, membership in cluster 1 (emphysema-predominant) was associated with TGFB1 SNP rs1800470. Conclusions: Cluster analysis may identify meaningful disease subtypes and/or groups of related phenotypic variables even in a highly selected group of severe emphysema subjects, and may be useful for genetic association studies.

  11. Cluster Analysis of Customer Reviews Extracted from Web Pages

    Directory of Open Access Journals (Sweden)

    S. Shivashankar

    2010-01-01

    Full Text Available As e-commerce gains popularity day by day, the web has become an excellent source for market researchers to gather customer reviews and opinions. The number of customer reviews that a product receives is growing at a very fast rate (it could be in the hundreds or thousands). Customer reviews posted on websites vary greatly in quality. The potential customer has to read all the reviews, irrespective of their quality, to decide whether to purchase the product. In this paper, we attempt to assess a review based on its quality, to help the customer make a proper buying decision. The quality of a customer review is assessed as most significant, more significant, significant, or insignificant. A novel and effective web mining technique is proposed for assessing a customer review of a particular product based on feature clustering techniques, namely the k-means method and the fuzzy c-means method. This is performed in three steps: (1) identify review regions and extract reviews from them, (2) extract and cluster the features of reviews by a clustering technique and then assign weights to the features belonging to each of the clusters (groups), and (3) assess the review by considering the feature weights and group belongingness. The k-means and fuzzy c-means clustering techniques are implemented and tested on customer reviews extracted from web pages. The performance of these techniques is analyzed.

  12. Identification and comparative analysis of the protocadherin cluster in a reptile, the green anole lizard.

    Directory of Open Access Journals (Sweden)

    Xiao-Juan Jiang

    Full Text Available BACKGROUND: The vertebrate protocadherins are a subfamily of cell adhesion molecules that are predominantly expressed in the nervous system and are believed to play an important role in establishing the complex neural network during animal development. Genes encoding these molecules are organized into a cluster in the genome. Comparative analysis of the protocadherin subcluster organization and gene arrangements in different vertebrates has provided interesting insights into the history of vertebrate genome evolution. Among tetrapods, protocadherin clusters have been fully characterized only in mammals. In this study, we report the identification and comparative analysis of the protocadherin cluster in a reptile, the green anole lizard (Anolis carolinensis). METHODOLOGY/PRINCIPAL FINDINGS: We show that the anole protocadherin cluster spans over a megabase and encodes a total of 71 genes. The number of genes in the anole protocadherin cluster is significantly higher than that in the coelacanth (49 genes) and mammalian (54-59 genes) clusters. The anole protocadherin genes are organized into four subclusters: the delta, alpha, beta and gamma. This subcluster organization is identical to that of the coelacanth protocadherin cluster, but differs from the mammalian clusters which lack the delta subcluster. The gene number expansion in the anole protocadherin cluster is largely due to the extensive gene duplication in the gammab subgroup. Similar to coelacanth and elephant shark protocadherin genes, the anole protocadherin genes have experienced a low frequency of gene conversion. CONCLUSIONS/SIGNIFICANCE: Our results suggest that similar to the protocadherin clusters in other vertebrates, the evolution of the anole protocadherin cluster is driven mainly by lineage-specific gene duplications and degeneration. Our analysis also shows that loss of the protocadherin delta subcluster in the mammalian lineage occurred after the divergence of mammals and reptiles.

  13. The diamond model analysis of ICT cluster in Thailand

    Directory of Open Access Journals (Sweden)

    Danuvasin Charoen, Ph.D.

    2013-07-01

    Full Text Available Information and Communication Technology (ICT) has become an integral part of national competitiveness. Thailand was ranked 38th (out of 134 countries) in the global competitiveness report conducted by the World Economic Forum. It was also ranked well below the world average on all of the factors related to technology, despite the fact that information technology and telecommunications had been a major factor driving the competitiveness of the country. The main purpose of this study is to investigate the various issues related to the ICT cluster in Thailand. The diamond model was used to analyze the ICT cluster in Thailand. The results from this study can be used to guide policy to enhance the competitiveness of the ICT cluster.

  14. Analysis of protein profiles using fuzzy clustering methods

    DEFF Research Database (Denmark)

    Karemore, Gopal Raghunath; Ukendt, Sujatha; Rai, Lavanya

    clustering methods for their classification followed by various validation measures. The clustering algorithms used for the study were K-means, K-medoid, Fuzzy C-means, Gustafson-Kessel, and Gath-Geva. The results presented in this study conclude that the protein profiles of tissue...... samples recorded by using the HPLC-LIF system and the data analyzed by clustering algorithms quite successfully classifies them as belonging from normal and malignant conditions....

  15. Functional clustering algorithm for the analysis of dynamic network data

    Science.gov (United States)

    Feldt, S.; Waddell, J.; Hetrick, V. L.; Berke, J. D.; Żochowski, M.

    2009-05-01

    We formulate a technique for the detection of functional clusters in discrete event data. The advantage of this algorithm is that no prior knowledge of the number of functional groups is needed, as our procedure progressively combines data traces and derives the optimal clustering cutoff in a simple and intuitive manner through the use of surrogate data sets. In order to demonstrate the power of this algorithm to detect changes in network dynamics and connectivity, we apply it to both simulated neural spike train data and real neural data obtained from the mouse hippocampus during exploration and slow-wave sleep. Using the simulated data, we show that our algorithm performs better than existing methods. In the experimental data, we observe state-dependent clustering patterns consistent with known neurophysiological processes involved in memory consolidation.

  16. Marketing Mix Formulation for Higher Education: An Integrated Analysis Employing Analytic Hierarchy Process, Cluster Analysis and Correspondence Analysis

    Science.gov (United States)

    Ho, Hsuan-Fu; Hung, Chia-Chi

    2008-01-01

    Purpose: The purpose of this paper is to examine how a graduate institute at National Chiayi University (NCYU), by using a model that integrates analytic hierarchy process, cluster analysis and correspondence analysis, can develop effective marketing strategies. Design/methodology/approach: This is primarily a quantitative study aimed at…

  17. Spatial and temporal analysis of lake sedimentation under reforestation

    Directory of Open Access Journals (Sweden)

    C.M. Pilgrim

    2015-10-01

    Full Text Available Spatial and temporal land cover changes can reduce or accelerate lake sedimentation. This study was conducted to examine morphometry and bathymetry, and the long-term changes (over 75 years) in sedimentation in the Lake Issaqueena reservoir, South Carolina. The watershed and catchment areas were delineated using Light Detection and Ranging (LiDAR) based data. Trends in lake surface area and riparian buffer condition (vegetated or unvegetated) were determined from historical aerial photography. From 1938 to 2009, the lake experienced a decrease in surface area of approximately 11.33 ha while catchment area increased by 6.99 ha, and lake volume decreased by 320,800.00 m3. Lake surface area decreased in years corresponding to equal coverage or largely unvegetated riparian buffers. Surface area and average annual precipitation were not correlated; therefore other factors such as soil type, riparian buffer condition and changes in land use likely contributed to sedimentation. A shift from agricultural land to forestland in this watershed resulted in a decrease in sedimentation rates by 88.28%.

  18. Detecting spatial-temporal cluster of hand foot and mouth disease in Beijing, China, 2009-2014.

    Science.gov (United States)

    Qian, Haikun; Huo, Da; Wang, Xiaoli; Jia, Lei; Li, Xitai; Li, Jie; Gao, Zhiyong; Liu, Baiwei; Tian, Yi; Wu, Xiaona; Wang, Quanyi

    2016-05-17

    The incidence of hand, foot, and mouth disease (HFMD) is extremely high, and has constituted a huge disease burden throughout Beijing in recent years. This study aimed to determine the spatiotemporal distribution and epidemic characteristics of HFMD. Descriptive statistics was used to analyze the data and estimate the epidemic peaks in 2009-2014. Space-time scanning detected spatiotemporal clusters and identified high-risk locations. Global and local Moran's I statistics were used to measure the spatial autocorrelation. Geocoding was performed in ArcGIS, based on the present address codes of the patients and the centroids of the towns. Maps were created in ArcGIS to show the geographic spread of HFMD. In total, 220,451 probable cases of HFMD were reported in Beijing between January 2009 and December 2014: 12,749 (5.78%) were laboratory confirmed, and 35 (0.02%) were fatal. The median age of reported cases was 3.12 years (interquartile range 1.96-4.39). Coxsackievirus A16 (CV-A16), enterovirus 71 (EV-A71), and other enteroviruses accounted for 39.31%, 35.36%, and 25.33% of the 12,749 confirmed cases, respectively. Many more severe cases were caused by EV-A71 (χ² = 186.41, df = 1, P < 0.001) and other enteroviruses (χ² = 156.44, df = 1, P < 0.001) than by CV-A16. A large single distinct peak occurred between May and July each year. Spatiotemporal clusters of HFMD were identified in Beijing during 2009-2014. The most likely clusters were detected and tended to move from the southwest (Fengtai and Daxing) southeastwards to Daxing and Tongzhou in 2009-2014. The incidence of HFMD was not randomly distributed, but showed global and local spatial autocorrelations. There were obvious spatiotemporal clusters of HFMD in Beijing in 2009-2014. High-incidence areas mainly occurred at the junctions of urban and rural zones. More attention should be paid to the epidemiological and spatiotemporal characteristics of HFMD to establish new
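
    The global Moran's I statistic mentioned above has the standard form I = (n / S0) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2, where S0 is the sum of all spatial weights. The following sketch computes it for a toy case; the binary contiguity weights and the four-area "incidence" values are assumptions for illustration, since the abstract does not state the weighting scheme used.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and a square spatial weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # total of all spatial weights
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(d * d for d in dev)


# Hypothetical layout: four areas in a row, each adjacent only to its neighbours.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

smooth = morans_i([1, 2, 3, 4], chain)       # similar values sit next to each other
alternating = morans_i([1, 4, 1, 4], chain)  # neighbours are always dissimilar
print(round(smooth, 4), round(alternating, 4))
```

    A positive value indicates that neighbouring areas tend to have similar incidence (clustering), while a negative value indicates that neighbours tend to differ.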

  19. Hierarchical cluster analysis of ignitable liquids based on the total ion spectrum.

    Science.gov (United States)

    Waddell, Erin E; Frisch-Daiello, Jessica L; Williams, Mary R; Sigman, Michael E

    2014-09-01

    Gas chromatography-mass spectrometry (GC-MS) data of ignitable liquids in the Ignitable Liquids Reference Collection (ILRC) database were processed to obtain 445 total ion spectra (TIS), that is, average mass spectra across the chromatographic profile. Hierarchical cluster analysis, an unsupervised learning technique, was applied to find features useful for classification of ignitable liquids. A combination of the correlation distance and average linkage was utilized for grouping ignitable liquids with similar chemical composition. This study evaluated whether hierarchical cluster analysis of the TIS would cluster together ignitable liquids of the same ASTM class assignment, as designated in the ILRC database. The ignitable liquids clustered based on their chemical composition, and the ignitable liquids within each cluster were predominantly from one ASTM E1618-11 class. These results reinforce use of the TIS as a tool to aid in forensic fire debris analysis. © 2014 American Academy of Forensic Sciences.
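
    The combination of correlation distance and average linkage described above can be sketched in a few lines: the distance between two spectra is one minus their Pearson correlation, and the distance between two clusters is the mean of all pairwise distances between their members. The example is a naive illustration with invented four-channel "spectra", not the authors' implementation or data.

```python
from math import sqrt


def correlation_distance(a, b):
    """One minus the Pearson correlation of two equal-length vectors."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return 1.0 - cov / sqrt(va * vb)


def average_linkage(spectra, n_clusters):
    """Agglomerative clustering: repeatedly merge the two closest clusters."""
    n = len(spectra)
    dist = {(i, j): correlation_distance(spectra[i], spectra[j])
            for i in range(n) for j in range(i + 1, n)}

    def linkage(c1, c2):
        # Average linkage: mean pairwise distance between members of two clusters.
        total = sum(dist[(min(i, j), max(i, j))] for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    clusters = [[i] for i in range(n)]
    while len(clusters) > n_clusters:
        pairs = [(p, q) for p in range(len(clusters))
                 for q in range(p + 1, len(clusters))]
        a, b = min(pairs, key=lambda pq: linkage(clusters[pq[0]], clusters[pq[1]]))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


# Hypothetical spectra: samples 0 and 1 share one profile shape, 2 and 3 another.
spectra = [
    [10, 1, 1, 0],
    [20, 2, 3, 0],
    [0, 1, 9, 10],
    [1, 0, 18, 21],
]

groups = average_linkage(spectra, n_clusters=2)
print(groups)
```

    Because correlation distance ignores overall intensity, samples 0 and 1 are grouped together even though one is roughly twice as intense as the other.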

  20. Cluster analysis in kinetic modelling of the brain: A noninvasive alternative to arterial sampling

    DEFF Research Database (Denmark)

    Liptrot, Matthew George; Adams, K.H.; Martiny, L.

    2004-01-01

    extracted from the PET data set. Hierarchical K-means cluster analysis was performed on the PET time series to extract a cerebral vasculature ROI. The number of clusters was varied from K = 1 to 10 for the second of the two-stage method. Determination of the correct number of clusters was performed...... blood sampling, the Simplified Reference Tissue Model (SRTM) and Logan analysis with cerebellar TAC as an input. There was a good agreement (P K-means-clustered input function and those from the arterial blood samples. This work......) extracted directly from dynamic positron emission tomography (PET) scans by cluster analysis. Five healthy subjects were injected with the 5HT2A- receptor ligand [18F]-altanserin and blood samples were subsequently taken from the radial artery and cubital vein. Eight regions-of-interest (ROI) TACs were...

  1. Analysis of protein profiles using fuzzy clustering methods

    DEFF Research Database (Denmark)

    Karemore, Gopal Raghunath; Ukendt, Sujatha; Rai, Lavanya

    clustering methods for their classification followed by various validation measures. The clustering algorithms used for the study were K-means, K-medoid, Fuzzy C-means, Gustafson-Kessel, and Gath-Geva. The results presented in this study conclude that the protein profiles of tissue...

  2. Analysis of the Advantages of Creating Border Clusters

    Directory of Open Access Journals (Sweden)

    Liudmila Rosca-Sadurschi

    2015-08-01

    Full Text Available In a changing environment of rapid globalization, the competitiveness of a country or region depends increasingly on effectiveness in innovation. The main challenge for research and innovation is to facilitate the networking of companies and research laboratories. These networks can take the form of a highly integrated cross-border economic group, but may also consist of actions to facilitate business and inter-laboratory linkages, or of cross-border clusters. The creation of these clusters requires meeting several conditions but brings significant benefits to all stakeholders.

  3. Dynamic analysis of clustered building structures using substructures methods

    International Nuclear Information System (INIS)

    Leimbach, K.R.; Krutzik, N.J.

    1989-01-01

    The dynamic substructure approach to the building cluster on a common base mat starts with the generation of Ritz-vectors for each building on a rigid foundation. The base mat plus the foundation soil is subjected to kinematic constraint modes, for example constant, linear, quadratic or cubic constraints. These constraint modes are also imposed on the buildings. By enforcing kinematic compatibility of the complete structural system on the basis of the constraint modes a reduced Ritz model of the complete cluster is obtained. This reduced model can now be analyzed by modal time history or response spectrum methods.

  4. Applying Clustering to Statistical Analysis of Student Reasoning about Two-Dimensional Kinematics

    Science.gov (United States)

    Springuel, R. Padraic; Wittman, Michael C.; Thompson, John R.

    2007-01-01

    We use clustering, an analysis method not presently common to the physics education research community, to group and characterize student responses to written questions about two-dimensional kinematics. Previously, clustering has been used to analyze multiple-choice data; we analyze free-response data that includes both sketches of vectors and…

  5. Identifying At-Risk Students in General Chemistry via Cluster Analysis of Affective Characteristics

    Science.gov (United States)

    Chan, Julia Y. K.; Bauer, Christopher F.

    2014-01-01

    The purpose of this study is to identify academically at-risk students in first-semester general chemistry using affective characteristics via cluster analysis. Through the clustering of six preselected affective variables, three distinct affective groups were identified: low (at-risk), medium, and high. Students in the low affective group…

  6. The reflection of hierarchical cluster analysis of co-occurrence matrices in SPSS

    NARCIS (Netherlands)

    Zhou, Q.; Leng, F.; Leydesdorff, L.

    2015-01-01

    Purpose: To discuss the problems arising from hierarchical cluster analysis of co-occurrence matrices in SPSS, and the corresponding solutions. Design/methodology/approach: We design different methods of using the SPSS hierarchical clustering module for co-occurrence matrices in order to compare

  7. Social Learning Network Analysis Model to Identify Learning Patterns Using Ontology Clustering Techniques and Meaningful Learning

    Science.gov (United States)

    Firdausiah Mansur, Andi Besse; Yusof, Norazah

    2013-01-01

    Clustering on Social Learning Networks is still not widely explored, especially when the network focuses on an e-learning system. Conventional methods are not really suitable for e-learning data. SNA requires content analysis, which involves human intervention and needs to be carried out manually. Some of the previous clustering techniques need…

  8. Distinct Phenotypes of Smokers with Fixed Airflow Limitation Identified by Cluster Analysis of Severe Asthma.

    Science.gov (United States)

    Konno, Satoshi; Taniguchi, Natsuko; Makita, Hironi; Nakamaru, Yuji; Shimizu, Kaoruko; Shijubo, Noriharu; Fuke, Satoshi; Takeyabu, Kimihiro; Oguri, Mitsuru; Kimura, Hirokazu; Maeda, Yukiko; Suzuki, Masaru; Nagai, Katsura; Ito, Yoichi M; Wenzel, Sally E; Nishimura, Masaharu

    2018-01-01

    Smoking may have multifactorial effects on asthma phenotypes, particularly in severe asthma. Cluster analysis has been applied to explore novel phenotypes, which are not based on any a priori hypotheses. To explore novel severe asthma phenotypes by cluster analysis when including smoking patients with asthma. We recruited a total of 127 subjects with severe asthma, including 59 current or ex-smokers, from our university hospital and its 29 affiliated hospitals/pulmonary clinics. Clinical variables obtained during a 2-day hospital stay were used for cluster analysis. After clustering using clinical variables, the sputum levels of 14 molecules were measured to biologically characterize the clinical clusters. Five clinical clusters, including two characterized by low forced expiratory volume in 1 second/forced vital capacity, were identified. When characteristics of smoking subjects in these two clusters were compared, there were marked differences between the two groups: one had high levels of circulating eosinophils, high immunoglobulin E levels, and a high sinus score, and the other was characterized by low levels of the same parameters. Sputum analysis revealed intriguing differences of cytokine/chemokine pattern in these two groups. The other three clusters were similar to those previously reported: young onset/atopic, nonsmoker/less eosinophilic, and female/obese. Key clinical variables were confirmed to be stable and consistent 3 years later. This study reveals two distinct phenotypes with potentially different biological pathways contributing to fixed airflow limitation in cigarette smokers with severe asthma.

  9. The use of a cluster analysis in across herd genetic evaluation for ...

    African Journals Online (AJOL)

    To investigate the possibility of a genotype x environment interaction in Bonsmara cattle, a cluster analysis was performed on weaning weight records of 72 811 Bonsmara calves, the progeny of 1 434 sires and 24 186 dams in 35 herds. The following environmental factors were used to classify herds into clusters: solution ...

  10. The use of a cluster analysis in across herd genetic evaluation for ...

    African Journals Online (AJOL)

    To investigate the possibility of a genotype x environment interaction in Bonsmara cattle, a cluster analysis was performed on weaning weight records of 72 811 Bonsmara calves, the progeny of 1 434 sires and 24 186 dams in 35 herds. The following environmental factors were used to classify herds into clusters: ...

  11. Geographical, temporal and racial disparities in late-stage prostate cancer incidence across Florida: A multiscale joinpoint regression analysis

    Directory of Open Access Journals (Sweden)

    Goovaerts Pierre

    2011-12-01

    Background: Although prostate cancer-related incidence and mortality have declined recently, striking racial/ethnic differences persist in the United States. Visualizing and modelling temporal trends of prostate cancer late-stage incidence, and how they vary according to geographic locations and race, should help explain such disparities. Joinpoint regression is increasingly used to identify the timing and extent of changes in time series of health outcomes. Yet, most analyses of temporal trends are aspatial and conducted at the national level or for a single cancer registry. Methods: Time series (1981-2007) of annual proportions of prostate cancer late-stage cases were analyzed for non-Hispanic Whites and non-Hispanic Blacks in each county of Florida. Noise in the data was first filtered by binomial kriging and results were modelled using joinpoint regression. A similar analysis was also conducted at the state level and for groups of metropolitan and non-metropolitan counties. Significant racial differences were detected using tests of parallelism and coincidence of time trends. A new disparity statistic was introduced to measure spatial and temporal changes in the frequency of racial disparities. Results: State-level percentage of late-stage diagnosis decreased 50% since 1981, a decline that accelerated in the 90's when Prostate Specific Antigen (PSA) screening was introduced. Analysis at the metropolitan and non-metropolitan levels revealed that the frequency of late-stage diagnosis increased recently in urban areas, and this trend was significant for white males. The annual rate of decrease in late-stage diagnosis and the onset years for significant declines varied greatly among counties and racial groups. Most counties with non-significant average annual percent change (AAPC) were located in the Florida Panhandle for white males, whereas they clustered in South-eastern Florida for black males. The new disparity statistic indicated

  12. Geographical, temporal and racial disparities in late-stage prostate cancer incidence across Florida: a multiscale joinpoint regression analysis.

    Science.gov (United States)

    Goovaerts, Pierre; Xiao, Hong

    2011-12-05

    Although prostate cancer-related incidence and mortality have declined recently, striking racial/ethnic differences persist in the United States. Visualizing and modelling temporal trends of prostate cancer late-stage incidence, and how they vary according to geographic locations and race, should help explain such disparities. Joinpoint regression is increasingly used to identify the timing and extent of changes in time series of health outcomes. Yet, most analyses of temporal trends are aspatial and conducted at the national level or for a single cancer registry. Time series (1981-2007) of annual proportions of prostate cancer late-stage cases were analyzed for non-Hispanic Whites and non-Hispanic Blacks in each county of Florida. Noise in the data was first filtered by binomial kriging and results were modelled using joinpoint regression. A similar analysis was also conducted at the state level and for groups of metropolitan and non-metropolitan counties. Significant racial differences were detected using tests of parallelism and coincidence of time trends. A new disparity statistic was introduced to measure spatial and temporal changes in the frequency of racial disparities. State-level percentage of late-stage diagnosis decreased 50% since 1981, a decline that accelerated in the 90's when Prostate Specific Antigen (PSA) screening was introduced. Analysis at the metropolitan and non-metropolitan levels revealed that the frequency of late-stage diagnosis increased recently in urban areas, and this trend was significant for white males. The annual rate of decrease in late-stage diagnosis and the onset years for significant declines varied greatly among counties and racial groups. Most counties with non-significant average annual percent change (AAPC) were located in the Florida Panhandle for white males, whereas they clustered in South-eastern Florida for black males. The new disparity statistic indicated that the spatial extent of racial disparities reached a

  13. Symptom Cluster Research With Biomarkers and Genetics Using Latent Class Analysis.

    Science.gov (United States)

    Conley, Samantha

    2017-12-01

    The purpose of this article is to provide an overview of latent class analysis (LCA) and examples from symptom cluster research that includes biomarkers and genetics. A review of LCA with genetics and biomarkers was conducted using Medline, Embase, PubMed, and Google Scholar. LCA is a robust latent variable model used to cluster categorical data and allows for the determination of empirically determined symptom clusters. Researchers should consider using LCA to link empirically determined symptom clusters to biomarkers and genetics to better understand the underlying etiology of symptom clusters. The full potential of LCA in symptom cluster research has not yet been realized because it has been used in limited populations, and researchers have explored limited biologic pathways.

  14. FLAME, a novel fuzzy clustering method for the analysis of DNA microarray data.

    Science.gov (United States)

    Fu, Limin; Medico, Enzo

    2007-01-04

    Data clustering analysis has been extensively applied to extract information from gene expression profiles obtained with DNA microarrays. To this aim, existing clustering approaches, mainly developed in computer science, have been adapted to microarray data analysis. However, previous studies revealed that microarray datasets have very diverse structures, some of which may not be correctly captured by current clustering methods. We therefore approached the problem from a new starting point, and developed a clustering algorithm designed to capture dataset-specific structures at the beginning of the process. The clustering algorithm is named Fuzzy clustering by Local Approximation of MEmbership (FLAME). Distinctive elements of FLAME are: (i) definition of the neighborhood of each object (gene or sample) and identification of objects with "archetypal" features named Cluster Supporting Objects, around which to construct the clusters; (ii) assignment to each object of a fuzzy membership vector approximated from the memberships of its neighboring objects, by an iterative converging process in which membership spreads from the Cluster Supporting Objects through their neighbors. Comparative analysis with K-means, hierarchical, fuzzy C-means and fuzzy self-organizing maps (SOM) showed that data partitions generated by FLAME are not superimposable to those of other methods and, although different types of datasets are better partitioned by different algorithms, FLAME displays the best overall performance. FLAME is implemented, together with all the above-mentioned algorithms, in a C++ software with graphical interface for Linux and Windows, capable of handling very large datasets, named Gene Expression Data Analysis Studio (GEDAS), freely available under GNU General Public License. The FLAME algorithm has intrinsic advantages, such as the ability to capture non-linear relationships and non-globular clusters, the automated definition of the number of clusters, and the

  15. Forecasting Antarctic Sea Ice Concentrations Using Results of Temporal Mixture Analysis

    Science.gov (United States)

    Chi, Junhwa; Kim, Hyun-Cheol

    2016-06-01

    Sea ice concentration (SIC) data acquired by passive microwave sensors at daily temporal frequencies over extended areas provide seasonal characteristics of sea ice dynamics and play a key role as an indicator of global climate trends; however, it is typically challenging to study long-term time series. Of the various advanced remote sensing techniques that address this issue, temporal mixture analysis (TMA) methods are often used to investigate the temporal characteristics of environmental factors, including SICs in the case of the present study. This study aims to forecast daily SICs for one year using a combination of TMA and time series modeling in two stages. First, we identify temporally meaningful sea ice signatures, referred to as temporal endmembers, using machine learning algorithms, and then we decompose each pixel into a linear combination of temporal endmembers. Using the corresponding fractional abundances of endmembers, we apply an autoregressive model that generally fits all Antarctic SIC data for 1979 to 2013 to forecast SIC values for 2014. We compare our results using the proposed approach based on daily SIC data reconstructed from real fractional abundances derived from a pixel unmixing method and temporal endmember signatures. The proposed method successfully forecasts new fractional abundance values, and the resulting images are qualitatively and quantitatively similar to the reference data.
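
    The two stages described in this abstract (linear unmixing of a pixel time series into temporal endmembers, then autoregressive forecasting of the abundances) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' method: the function names are hypothetical, only two endmembers are used, the least-squares unmixing is unconstrained (no non-negativity or sum-to-one constraint), and the autoregressive model is restricted to order 1.

```python
def unmix_two_endmembers(pixel, end_a, end_b):
    """Unconstrained least-squares abundances (a, b) with pixel ~ a*end_a + b*end_b.

    Solves the 2x2 normal equations by Cramer's rule (hypothetical helper).
    """
    saa = sum(x * x for x in end_a)
    sbb = sum(x * x for x in end_b)
    sab = sum(x * y for x, y in zip(end_a, end_b))
    say = sum(x * y for x, y in zip(end_a, pixel))
    sby = sum(x * y for x, y in zip(end_b, pixel))
    det = saa * sbb - sab * sab
    if det == 0:
        raise ValueError("endmembers are collinear; abundances are not unique")
    a = (say * sbb - sby * sab) / det
    b = (sby * saa - say * sab) / det
    return a, b


def fit_ar1(series):
    """Ordinary least-squares fit of x[t] = c + phi * x[t-1] (AR(1), an assumption)."""
    prev, curr = series[:-1], series[1:]
    n = len(prev)
    mean_prev = sum(prev) / n
    mean_curr = sum(curr) / n
    sxx = sum((p - mean_prev) ** 2 for p in prev)
    if sxx == 0:
        raise ValueError("series is constant; phi is not identifiable")
    sxy = sum((p - mean_prev) * (q - mean_curr) for p, q in zip(prev, curr))
    phi = sxy / sxx
    c = mean_curr - phi * mean_prev
    return c, phi


def forecast_next(series):
    """One-step-ahead forecast from the fitted AR(1) model."""
    c, phi = fit_ar1(series)
    return c + phi * series[-1]


# Illustrative (made-up) temporal endmember signatures and a mixed pixel.
end_a = [1.0, 0.8, 0.2, 0.0]
end_b = [0.0, 0.3, 0.9, 1.0]
pixel = [0.3 * x + 0.7 * y for x, y in zip(end_a, end_b)]
abundances = unmix_two_endmembers(pixel, end_a, end_b)

# Illustrative abundance history for one endmember across successive years.
history = [4.0, 3.0, 2.5, 2.25, 2.125]
next_value = forecast_next(history)
```

    In this toy setup the unmixing recovers the mixing weights used to build the pixel, and the forecast extends the abundance history by one step; a real application would need constrained unmixing and a properly selected model order.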

  16. High resolution analysis of temporal variation of airborne radionuclides

    International Nuclear Information System (INIS)

    Komura, K.; Yamaguchi, Y.; Manikandan, M.; Murata, Y.; Iida, T.; Moriizumi, J.

    2004-01-01

    As one application of ultra-low-background gamma spectrometry, we tried to measure the temporal variation of airborne radionuclides at intervals of 1 to a few hours in extreme cases. Airborne radionuclides were collected on quartz-fiber filter paper at the Low Level Radioactivity Laboratory (LLRL), Kanazawa Univ., in Tatsunokuchi (since Nov. 2002); on Hegra Island, located 50 km from the Noto peninsula (since Apr. 2003), to investigate the influence of the Asian continent; and on the Shishiku plateau, at 640 m above sea level, to assess vertical differences (since Sep. 2003). Pb-210, Pb-212 and Be-7 were measured nondestructively by ultra-low-background Ge detectors in the Ogoya Underground Laboratory (270 meter water equivalent). The concentration of Rn-222 was monitored at 1-hour intervals, and wind direction and speed were recorded at 10-min or 2-min (Hegra Is.) intervals as supporting data for the analyses. In regular monitoring, sampling was done at 1-2 day (LLRL and Shishiku) or 1-week (Hegra) intervals to capture daily and seasonal variations and similarities or differences between sampling locations. During drastic meteorological changes, such as the passage of a front or typhoon, the occurrence of an inversion layer, or snowfall, short sampling at intervals of 1-2 hours was conducted to find the correlation with meteorological factors at a single point or at 2 points simultaneously. As a result, the concentrations of Pb-210, Po-210, Pb-212 and Be-7 were found to vary very quickly over short times (see Figure below), due mainly to horizontal or vertical mixing of air masses. (authors)

  17. Spatio-temporal analysis of forest modeling in Mexico

    Directory of Open Access Journals (Sweden)

    Saira Y. Martínez-Santiago

    2017-01-01

    There is consensus that anthropogenic actions are degrading ecosystems at an alarming rate. Modeling and new technologies, such as information and communication technologies (ICT), are increasingly used to make decisions on the management and conservation of natural resources. This work analyzed the temporal evolution and spatial distribution of scientific output on forest modeling in Mexico. From 1980 to 2015, 454 authors participated in the publication of 259 articles in 37 journals (84 % Mexican), of which 28 are indexed in the Journal Citation Reports (JCR). Studies on forest management have been the most relevant, although their relative importance is declining, while those on environmental services and potential distribution are gaining importance. The authors belong to 89 institutions, of which 65 % are Mexican. During the period analyzed, the number of authors (and collaborations) and of publications increased 12-fold and nine-fold, respectively. These increases coincide with the evolution of regulatory policies and the establishment and support of the National System of Researchers (Sistema Nacional de Investigadores). Collaborations in the current forest modeling network still have great potential for growth.

  18. Symbolic analysis of spatio-temporal systems: The measurement problem

    International Nuclear Information System (INIS)

    Brown, R.; Tang, Xianzhu; Tracy, E.R.

    1996-01-01

    We consider the problem of measuring physical quantities using time-series observations. The approach taken is to validate theoretical models which are derived heuristically or from first principles. The fitting of parameters in such models constitutes the measurement. This is a basic problem in measurement science and a wide array of tools are available. However, an important gap in the present toolkit exists when the system of interest, and hence the models used, exhibit chaotic or turbulent behavior. The development of reliable schemes for analyzing such signals is necessary before one can claim to have a quantitative understanding of the underlying physics. In experimental situations, the number of independently measured time-series is limited, but the number of dynamical degrees of freedom can be large. In addition, the signals of interest will typically be embedded in a noisy background. In the symbol statistics approach, the time-series is coarse-grained and converted into a long, symbol stream. The probability of occurrence of various symbol sequences of fixed length constitutes the symbol statistics. These statistics contain a wealth of information about the underlying dynamics and, as we shall discuss, can be used to validate models. Previously, we have applied this symbolic approach to low dimensional systems with great success. The symbol statistics are robust up to noise/signal ∼20%. At higher noise levels the symbol statistics are biased, but in a relatively simple manner. By including the noise characteristics into the model, we were able to use the symbol statistics to measure parameters even when signal/noise is ∼ O(1). More recently, we have extended the symbolic approach to spatio-temporal systems. We have considered both coupled-map lattices and the complex Ginzburg-Landau equation. This equation arises generically near the onset of instabilities
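
    The symbol statistics described in this abstract (coarse-graining a time series into a symbol stream, then estimating the probability of each fixed-length symbol sequence) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the partition is binary with a user-supplied threshold, and the words are counted with overlapping windows.

```python
from collections import Counter


def symbolize(series, threshold):
    """Coarse-grain a real-valued series into a binary symbol stream.

    Values above the threshold map to "1", all others to "0" (assumed partition).
    """
    return "".join("1" if value > threshold else "0" for value in series)


def symbol_statistics(symbols, word_length):
    """Relative frequency of every symbol sequence (word) of the given length.

    Words are taken from overlapping windows that slide one symbol at a time.
    """
    n_words = len(symbols) - word_length + 1
    if word_length < 1 or n_words < 1:
        raise ValueError("word_length must be between 1 and len(symbols)")
    counts = Counter(symbols[i:i + word_length] for i in range(n_words))
    return {word: count / n_words for word, count in counts.items()}


# Illustrative (made-up) signal that alternates around the threshold.
signal = [0.1, 0.9, 0.2, 0.8, 0.3, 0.7]
stream = symbolize(signal, threshold=0.5)
stats = symbol_statistics(stream, word_length=2)
```

    Comparing such a table of word probabilities computed from data with the table produced by a candidate model is one way the statistics could be used for validation; how the comparison is scored is left open here.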

  19. Analysis of SWOT spatial and temporal samplings over continents

    Science.gov (United States)

    Biancamaria, Sylvain; Lamy, Alain; Mognard, Nelly

    2014-05-01

    The future Surface Water and Ocean Topography (SWOT) satellite mission, collaboratively developed by NASA, CNES and CSA, is a joint oceanography/continental hydrology mission planned for launch in 2020. In June 2013, a new SWOT orbit was selected with a 77.6° inclination, a 21-day repeat cycle and an 891 km altitude. The main satellite payload (a Ka-band SAR interferometer) will provide 2D maps of water elevation, mask and slope over two swaths, each with a 50 km extent. These two swaths will be separated by a 20 km nadir gap. Most of the studies concerning SWOT published since 2007 have considered a former orbit with a 78° inclination, a 22-day repeat cycle, a 970 km altitude and a 60 km extent for each swath. None of them has studied the newly selected orbit, and the impact of the 20 km nadir gap on the spatial coverage has not been much explored. The purpose of the work presented here is to investigate the spatial and temporal coverage given this new orbit and the actual swath extent (2*50 km swaths with the 20 km nadir gap in between) and to compare it to the former SWOT configuration. It is shown that the new configuration will have almost no impact on the computation of monthly averages; however, it will affect the spatial coverage. Because of the nadir gap, the orbit repetitivity and the swath extent, 3.6% of the continental surfaces between 78°S and 78°N will never be observed by SWOT (compared with 2.2% for the former SWOT configuration). The equatorial regions will be the most affected, as the uncovered area could reach ~14% locally, whereas it never exceeded 9% with the previous SWOT configuration.

  20. Analysis of spatio-temporal structures of the thermospheric density

    Science.gov (United States)

    Schmidt, Michael; Bloßfeld, Mathis; Erdogan, Eren; Meraner, Andrea

    2017-04-01

    The Earth's upper atmosphere, comprising the thermosphere and the ionosphere, forms a dynamically coupled non-linear system in terms of chemical and physical processes. The system also interacts with the magnetosphere as well as the lower atmosphere. Several stand-alone or coupled models have been developed to reveal the behaviour of atmospheric target parameters and their interactions, such as the neutral and charged particle density of the thermosphere, from different perspectives, for instance pure physical or (semi-)empirical models as well as data-assimilative approaches combining available models with new sets of observations. The thermospheric neutral density, for instance, plays a crucial role in the equation of motion of Earth-orbiting objects at low altitudes, since the drag force is one of the largest non-gravitational perturbations and a function of the thermospheric integral density. Besides, density estimation is critical for re-entry operations, manoeuvre planning, collision avoidance, precise orbit determination (POD) and satellite lifetime planning. There exist several empirical thermospheric models that have been used in satellite orbit determination, e.g. the JB2008 or the DTM2013 model. They all include different gas species and provide thermospheric temperature and density as functions of the instantaneous position in altitude, latitude and longitude, as well as the local solar time, solar and geomagnetic storm indices and the harmonics of the year's fraction. In this contribution we study the global spatial and temporal behaviour of the thermospheric density provided by the JB2008 or DTM2013 models. Based on these insights we set up a concept for an empirical model of the thermospheric density. In a future step, appropriate model parameters will be estimated from high-precision satellite laser ranging observations. This work is related to the DFG project INSIGHT (Interactions of Low

  1. The identification of credit card encoders by hierarchical cluster analysis of the jitters of magnetic stripes.

    Science.gov (United States)

    Leung, S C; Fung, W K; Wong, K H

    1999-01-01

    The relative bit density variation graphs of 207 specimen credit cards processed by 12 encoding machines were examined first visually, and then classified by means of hierarchical cluster analysis. Twenty-nine credit cards being treated as 'questioned' samples were tested by way of cluster analysis against 'controls' derived from known encoders. It was found that hierarchical cluster analysis provided a high accuracy of identification with all 29 'questioned' samples classified correctly. On the other hand, although visual comparison of jitter graphs was less discriminating, it was nevertheless capable of giving a reasonably accurate result.

  2. Clusters of Insomnia Disorder: An Exploratory Cluster Analysis of Objective Sleep Parameters Reveals Differences in Neurocognitive Functioning, Quantitative EEG, and Heart Rate Variability.

    Science.gov (United States)

    Miller, Christopher B; Bartlett, Delwyn J; Mullins, Anna E; Dodds, Kirsty L; Gordon, Christopher J; Kyle, Simon D; Kim, Jong Won; D'Rozario, Angela L; Lee, Rico S C; Comas, Maria; Marshall, Nathaniel S; Yee, Brendon J; Espie, Colin A; Grunstein, Ronald R

    2016-11-01

    To empirically derive and evaluate potential clusters of Insomnia Disorder through cluster analysis from polysomnography (PSG). We hypothesized that clusters would differ on neurocognitive performance, sleep-onset measures of quantitative ( q )-EEG and heart rate variability (HRV). Research volunteers with Insomnia Disorder (DSM-5) completed a neurocognitive assessment and overnight PSG measures of total sleep time (TST), wake time after sleep onset (WASO), and sleep onset latency (SOL) were used to determine clusters. From 96 volunteers with Insomnia Disorder, cluster analysis derived at least two clusters from objective sleep parameters: Insomnia with normal objective sleep duration (I-NSD: n = 53) and Insomnia with short sleep duration (I-SSD: n = 43). At sleep onset, differences in HRV between I-NSD and I-SSD clusters suggest attenuated parasympathetic activity in I-SSD (P insomnia clusters derived from cluster analysis differ in sleep onset HRV. Preliminary data suggest evidence for three clusters in insomnia with differences for sustained attention and sleep-onset q -EEG. Insomnia 100 sleep study: Australia New Zealand Clinical Trials Registry (ANZCTR) identification number 12612000049875. URL: https://www.anzctr.org.au/Trial/Registration/TrialReview.aspx?id=347742. © 2016 Associated Professional Sleep Societies, LLC.

  3. Profiling physical activity motivation based on self-determination theory: a cluster analysis approach.

    Science.gov (United States)

    Friederichs, Stijn Ah; Bolman, Catherine; Oenema, Anke; Lechner, Lilian

    2015-01-01

    In order to promote physical activity uptake and maintenance in individuals who do not comply with physical activity guidelines, it is important to increase our understanding of physical activity motivation among this group. The present study aimed to examine motivational profiles in a large sample of adults who do not comply with physical activity guidelines. The sample for this study consisted of 2473 individuals (31.4% male; age 44.6 ± 12.9). In order to generate motivational profiles based on motivational regulation, a cluster analysis was conducted. One-way analyses of variance were then used to compare the clusters in terms of demographics, physical activity level, motivation to be active and subjective experience while being active. Three motivational clusters were derived based on motivational regulation scores: a low motivation cluster, a controlled motivation cluster and an autonomous motivation cluster. These clusters differed significantly from each other with respect to physical activity behavior, motivation to be active and subjective experience while being active. Overall, the autonomous motivation cluster displayed more favorable characteristics compared to the other two clusters. The results of this study provide additional support for the importance of autonomous motivation in the context of physical activity behavior. The three derived clusters may be relevant in the context of physical activity interventions as individuals within the different clusters might benefit most from different intervention approaches. In addition, this study shows that cluster analysis is a useful method for differentiating between motivational profiles in large groups of individuals who do not comply with physical activity guidelines.

  4. Arguments for a Cluster Analysis of Nasal Consonant Sequences of ...

    African Journals Online (AJOL)

    Bantu language scholars have, among other things, debated the issue of whether nasal and consonant sequences (NC sequences) in various Bantu languages should be considered as clusters or single segments (prenasalised stops). This paper examines these sequences as they occur in Sukwa nouns. Sukwa is a ...

  5. Comparing clustering and pre-processing in taxonomy analysis

    NARCIS (Netherlands)

    Bonder, M.J.; Abeln, S.; Zaura, E.; Brandt, B.W.

    2012-01-01

    Motivation: Massively parallel sequencing allows for rapid sequencing of large numbers of sequences in just a single run. Thus, 16S ribosomal RNA (rRNA) amplicon sequencing of complex microbial communities has become possible. The sequenced 16S rRNA fragments (reads) are clustered into operational

  6. A method based on temporal concept analysis for detecting and profiling human trafficking suspects

    NARCIS (Netherlands)

    Poelmans, J.; Elzinga, P.; Viaene, S.; Dedene, G.; Hamza, M.H.

    2010-01-01

    Human trafficking and forced prostitution are a serious problem for the Amsterdam-Amstelland police (the Netherlands). In this paper, we present a method based on Temporal Concept Analysis for detecting and profiling human trafficking suspects. Using traditional Formal Concept Analysis, we first

  7. Genome cluster database. A sequence family analysis platform for Arabidopsis and rice.

    Science.gov (United States)

    Horan, Kevin; Lauricha, Josh; Bailey-Serres, Julia; Raikhel, Natasha; Girke, Thomas

    2005-05-01

    The genome-wide protein sequences from Arabidopsis (Arabidopsis thaliana) and rice (Oryza sativa) spp. japonica were clustered into families using sequence similarity and domain-based clustering. The two fundamentally different methods resulted in separate cluster sets with complementary properties to compensate the limitations for accurate family analysis. Functional names for the identified families were assigned with an efficient computational approach that uses the description of the most common molecular function gene ontology node within each cluster. Subsequently, multiple alignments and phylogenetic trees were calculated for the assembled families. All clustering results and their underlying sequences were organized in the Web-accessible Genome Cluster Database (http://bioinfo.ucr.edu/projects/GCD) with rich interactive and user-friendly sequence family mining tools to facilitate the analysis of any given family of interest for the plant science community. An automated clustering pipeline ensures current information for future updates in the annotations of the two genomes and clustering improvements. The analysis allowed the first systematic identification of family and singlet proteins present in both organisms as well as those restricted to one of them. In addition, the established Web resources for mining these data provide a road map for future studies of the composition and structure of protein families between the two species.

  8. Transient identification by clustering based on Integrated Deterministic and Probabilistic Safety Analysis outcomes

    International Nuclear Information System (INIS)

    Di Maio, Francesco; Vagnoli, Matteo; Zio, Enrico

    2016-01-01

    Highlights: • We develop an Integrated Deterministic and Probabilistic Safety Analysis (IDPSA). • We present a transient identification approach for retrieving IDPSA scenarios information. • We post-process the IDPSA scenarios for clustering Prime Implicants and Near Misses. • The approach is useful for an on-line cluster assignment of an unknown developing scenario. • We apply the approach to the accidental scenarios of a dynamic Steam Generator of a NPP. - Abstract: In this work, we present a transient identification approach that utilizes clustering for retrieving scenarios information from an Integrated Deterministic and Probabilistic Safety Analysis (IDPSA). The approach requires: (i) creation of a database of scenarios by IDPSA; (ii) scenario post-processing for clustering Prime Implicants (PIs), i.e., minimum combinations of failure events that are capable of leading the system into a fault state, and Near Misses, i.e., combinations of failure events that lead the system to a quasi-fault state; (iii) on-line cluster assignment of an unknown developing scenario. In the step (ii), we adopt a visual interactive method and risk-based clustering to identify PIs and Near Misses, respectively; in the on-line step (iii), to assign a scenario to a cluster we consider the sequence of events in the scenario and evaluate the Hamming similarity to the sequences of the previously clustered scenarios. The feasibility of the analysis is shown with respect to the accidental scenarios of a dynamic Steam Generator (SG) of a NPP.
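
    The on-line step (iii) in this abstract, assigning a developing scenario to a cluster by the Hamming similarity of its event sequence to previously clustered scenarios, can be sketched as below. This is a minimal illustration, not the authors' procedure: the function names are hypothetical, the event sequences are assumed to be encoded as equal-length lists, similarity is taken as the fraction of matching positions, and the nearest-member assignment rule is an assumption.

```python
def hamming_similarity(seq_a, seq_b):
    """Fraction of positions at which two equal-length sequences agree."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b)
    return matches / len(seq_a)


def assign_cluster(scenario, clustered_scenarios):
    """Return (label, similarity) of the cluster holding the most similar member.

    clustered_scenarios maps a cluster label to a list of event sequences.
    Nearest-member assignment is an assumed rule for this sketch.
    """
    best_label, best_score = None, -1.0
    for label, members in clustered_scenarios.items():
        for member in members:
            score = hamming_similarity(scenario, member)
            if score > best_score:
                best_label, best_score = label, score
    return best_label, best_score


# Illustrative (made-up) encoded event sequences: 1 = component failed, 0 = not.
clusters = {
    "prime_implicant": [[1, 0, 1, 1]],
    "near_miss": [[0, 0, 1, 0]],
}
label, score = assign_cluster([1, 0, 0, 1], clusters)
```

    With these toy sequences the unknown scenario matches the prime-implicant member in three of four positions and the near-miss member in one, so it is assigned to the former.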

  9. Temporal epilepsy lesions may be detected by the voxel-based quantitative analysis of brain FDG-PET images using an original block-matching normalization software.

    Science.gov (United States)

    Verger, Antoine; Yagdigul, Yalcin; Van Der Gucht, Axel; Poussier, Sylvain; Guedj, Eric; Maillard, Louis; Malandain, Grégoire; Hossu, Gabriela; Fay, Renaud; Karcher, Gilles; Marie, Pierre-Yves

    2016-05-01

    Statistical parametric mapping (SPM) provides useful voxel-by-voxel analyses of brain images from (18)F-fluorodesoxyglucose positron emission tomography (FDG-PET) after an initial step of spatial normalization through an anatomical template model. In the setting of the preoperative workup of patients with temporal epilepsy, this study aimed at assessing a block-matching (BM) normalization method, where most transformations are computed through small blocks, a principle that minimizes artefacts and overcomes additional image-filtering. Brain FDG-PET images from 31 patients with well-characterised temporal lobe epilepsy and among whom 22 had common mesial temporal lobe epilepsy were retrospectively analysed using both BM and conventional SPM normalization methods and with PET images from age-adjusted controls. Different threshold p values corrected for cluster volume were considered (0.01, 0.005, and 0.001). The use of BM provided equivalent values to those of SPM with regard to the overall volumes of temporal and extra-temporal hypometabolism, as well as similar sensitivity for detecting the involved temporal lobe, reaching 87 and 94 % for SPM and BM, respectively, at a threshold p value of 0.01. However, the ability to more accurately localize brain lesions within the mesial portion of the temporal lobe was a little higher with BM than with SPM with respective sensitivities reaching 78 % for BM and 45 % for SPM (p < 0.05). BM normalization compares well with conventional SPM for the voxel-based quantitative analysis of the FDG-PET images from temporal epilepsy patients. Further studies in different population are needed to determine whether BM is truly an accurate alternative to SPM in this setting.

  10. Improved RNA analysis for immediate autopsy of temporal bone soft tissues.

    Science.gov (United States)

    Lin, J; Kawano, H; Paparella, M M; Ho, S B

    1999-01-01

    RNA analysis is essential for understanding biological activities of a cell or tissue. Unfortunately, retrieval of RNA from existing archives of human temporal bones has proven extremely difficult due to degradation of RNA molecules. The major factors that contribute to degradation of RNA in specimens from autopsied temporal bones are tissue autolysis due to time elapsed before autopsy, and technical problems in processing the bones after harvest. We therefore focused on improving the survival of RNA in human temporal bones by shortening the time to autopsy and through modification of the processing technique by removing targeted tissues directly from the temporal bones and by avoiding time-consuming decalcification and celloidin-embedding. Eight temporal bones collected at immediate autopsies were used in this study. Representative mRNAs, ranging from high (MUC5B, physically unstable) to low (beta-actin, physically stable) molecular weights, and from abundant (MUC5B) to non-abundant (MUC1) RNA, were studied by in situ hybridization, Northern blot technique, or both. Using this modified protocol in autopsies performed up to 6 h after death, the existence of mRNAs was demonstrated in all bones studied. This improved method demonstrates the feasibility of the use of autopsied temporal bone tissues for RNA analysis.

  11. Integrating cross-scale analysis in the spatial and temporal domains for classification of behavioral movement

    Directory of Open Access Journals (Sweden)

    Ali Soleymani

    2014-06-01

    Since various behavioral movement patterns are likely to be valid within different, unique ranges of spatial and temporal scales (e.g., instantaneous, diurnal, or seasonal) with the corresponding spatial extents, a cross-scale approach is needed for accurate classification of behaviors expressed in movement. Here, we introduce a methodology for the characterization and classification of behavioral movement data that relies on computing and analyzing movement features jointly in both the spatial and temporal domains. The proposed methodology consists of three stages. In the first stage, focusing on the spatial domain, the underlying movement space is partitioned into several zonings that correspond to different spatial scales, and features related to movement are computed for each partitioning level. In the second stage, concentrating on the temporal domain, several movement parameters are computed from trajectories across a series of temporal windows of increasing sizes, yielding another set of input features for the classification. For both the spatial and the temporal domains, the "reliable scale" is determined by an automated procedure. This is the scale at which the best classification accuracy is achieved, using only spatial or temporal input features, respectively. The third stage takes the measures from the spatial and temporal domains of movement, computed at the corresponding reliable scales, as input features for behavioral classification. With a feature selection procedure, the most relevant features contributing to known behavioral states are extracted and used to learn a classification model. The potential of the proposed approach is demonstrated on a dataset of adult zebrafish (Danio rerio) swimming movements in testing tanks, following exposure to different drug treatments. Our results show that behavioral classification accuracy greatly increases when firstly cross-scale analysis is used to determine the best analysis scale, and

  12. Analysis of the temporal electric fields in lossy dielectric media

    DEFF Research Database (Denmark)

    McAllister, Iain Wilson; Crichton, George C

    1991-01-01

    The time-dependent electric fields associated with lossy dielectric media are examined. The analysis illustrates that, with respect to the basic time constant, these lossy media can take a considerable time to attain a steady-state condition. Time-dependent field enhancement factors are considered......, and inherent surface-charge densities quantified. The calculation of electrostatic forces on a free, lossy dielectric particle is illustrated. An extension to the basic analysis demonstrates that, on reversal of polarity, the resultant tangential field at the interface could play a decisive role...

  13. Skunk and raccoon rabies in the eastern United States: temporal and spatial analysis.

    Science.gov (United States)

    Guerra, Marta A; Curns, Aaron T; Rupprecht, Charles E; Hanlon, Cathleen A; Krebs, John W; Childs, James E

    2003-09-01

    Since 1981, an epizootic of raccoon rabies has spread throughout the eastern United States. A concomitant increase in reported rabies cases in skunks has raised concerns that an independent maintenance cycle of rabies virus in skunks could become established, affecting current strategies of wildlife rabies control programs. Rabies surveillance data from 1981 through 2000 obtained from the health departments of 11 eastern states were used to analyze temporal and spatial characteristics of rabies epizootics in each species. Spatial analysis indicated that epizootics in raccoons and skunks moved in a similar direction from 1990 to 2000. Temporal regression analysis showed that the number of rabid raccoons predicted the number of rabid skunks through time, with a 1-month lag. In areas where the raccoon rabies virus variant is enzootic, spatio-temporal analysis does not provide evidence that this rabies virus variant is currently cycling independently among skunks.
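
    The lagged relationship described above (raccoon counts predicting skunk counts one month later) can be illustrated with a small sketch. The abstract does not state which regression model was used, so ordinary least squares is an assumption here, and the function name `lagged_ols` and the monthly counts are hypothetical illustrations rather than study data.

```python
from statistics import mean


def lagged_ols(x, y, lag=1):
    """Least-squares fit of y[t] on x[t - lag]; returns (slope, intercept)."""
    if len(x) != len(y) or not 0 <= lag < len(x) - 1:
        raise ValueError("series must have equal length and lag must leave at least two pairs")
    xs = x[: len(x) - lag]   # predictor, shifted back by `lag` months
    ys = y[lag:]             # response, aligned with the shifted predictor
    mx, my = mean(xs), mean(ys)
    sxx = sum((a - mx) ** 2 for a in xs)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


# Hypothetical monthly counts, not data from the study.
raccoons = [10, 14, 20, 26, 30, 24, 18, 12]
skunks = [2, 6, 8, 11, 14, 16, 13, 10]

slope, intercept = lagged_ols(raccoons, skunks, lag=1)
print(f"skunks[t] ~ {slope:.2f} * raccoons[t-1] + {intercept:.2f}")
```

    Shifting the predictor rather than the response keeps the interpretation direct: the slope is the expected change in this month's skunk count per additional rabid raccoon reported the month before.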

  14. A cluster of acute flaccid paralysis and cranial nerve dysfunction temporally associated with an outbreak of enterovirus D68 in children in Colorado, USA.

    Science.gov (United States)

    Messacar, Kevin; Schreiner, Teri L; Maloney, John A; Wallace, Adam; Ludke, Jan; Oberste, M Stephen; Nix, W Allan; Robinson, Christine C; Glodé, Mary P; Abzug, Mark J; Dominguez, Samuel R

    2015-04-25

    Clusters of acute flaccid paralysis or cranial nerve dysfunction in children are uncommon. We aimed to assess a cluster of children with acute flaccid paralysis and cranial nerve dysfunction geographically and temporally associated with an outbreak of enterovirus-D68 respiratory disease. We defined a case of neurological disease as any child admitted to Children's Hospital Colorado (Aurora, CO, USA) with acute flaccid paralysis with spinal-cord lesions involving mainly grey matter on imaging, or acute cranial nerve dysfunction with brainstem lesions on imaging, who had onset of neurological symptoms between Aug 1, 2014, and Oct 31, 2014. We used Poisson regression to assess whether the numbers of cases during the outbreak period were significantly greater than baseline case numbers from a historical control period (July 31, 2010, to July 31, 2014). 12 children met the case definition (median age 11·5 years [IQR 6·75-15]). All had a prodromal febrile illness preceding neurological symptoms by a median of 7 days (IQR 5·75-8). Neurological deficits included flaccid limb weakness (n=10; asymmetric n=7), bulbar weakness (n=6), and cranial nerve VI (n=3) and VII (n=2) dysfunction. Ten (83%) children had confluent, longitudinally extensive spinal-cord lesions of the central grey matter, with predominant anterior horn-cell involvement, and nine (75%) children had brainstem lesions. Ten (91%) of 11 children had cerebrospinal fluid pleocytosis. Nasopharyngeal specimens from eight (73%) of 11 children were positive for rhinovirus or enterovirus. Viruses from five (45%) of 11 children were typed as enterovirus D68. Enterovirus PCR of cerebrospinal fluid, blood, and rectal swabs, and tests for other causes, were negative. Improvement of cranial nerve dysfunction has been noted in three (30%) of ten children. All ten children with limb weakness have residual deficits. We report the first geographically and temporally defined cluster of acute flaccid paralysis and cranial

  15. Statistical analysis of long term spatial and temporal trends of ...

    Indian Academy of Sciences (India)

    The annual and seasonal trend analysis of different surface temperature parameters (average, maximum, minimum and diurnal temperature range) has been done for historical (1971–2005) and future periods (2011–2099) in the middle catchment of Sutlej river basin, India. The future time series of temperature data has ...

  16. Temporal Land Cover Analysis for Net Ecosystem Improvement

    Energy Technology Data Exchange (ETDEWEB)

    Ke, Yinghai; Coleman, Andre M.; Diefenderfer, Heida L.

    2013-04-09

    We delineated 8 watersheds contributing to previously defined river reaches within the 1,468-km2 historical floodplain of the tidally influenced lower Columbia River and estuary. We assessed land-cover change at the watershed, reach, and restoration site scales by reclassifying remote-sensing data from the National Oceanic and Atmospheric Administration Coastal Change Analysis Program’s land cover/land change product into forest, wetland, and urban categories. The analysis showed a 198.3 km2 loss of forest cover during the first 6 years of the Columbia Estuary Ecosystem Restoration Program, 2001–2006. Total measured urbanization in the contributing watersheds of the estuary during the full 1996–2006 change analysis period was 48.4 km2. Trends in forest gain/loss and urbanization differed between watersheds. Wetland gains and losses were within the margin of error of the satellite imagery analysis. No significant land cover change was measured at restoration sites, although it was visible in aerial imagery; therefore, the 30-m land-cover product may not be appropriate for assessment of early-stage wetland restoration. These findings suggest that floodplain restoration sites in reaches downstream of watersheds with decreasing forest cover will be subject to increased sediment loads, and those downstream of urbanization will experience effects of increased impervious surfaces on hydrologic processes.

  17. Temporal precipitation trend analysis at the Langat River Basin ...

    Indian Academy of Sciences (India)

    The trends were determined at 30 rainfall stations using the Mann–Kendall (MK) test, the Sen's slope estimator and the linear regression analysis. Lag-1 approach was utilized to test the serial correlation of the series. On the annual scale, it was found that most of the stations in the basin were characterized with insignificant ...
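
    The two nonparametric tools named above, the Mann–Kendall test and Sen's slope estimator, can be sketched in a few lines. This is a minimal illustration that assumes no tied values (so no tie correction in the variance) and no serial-correlation adjustment; the function names and the rainfall series are hypothetical and are not taken from the study.

```python
import math
from statistics import median


def mann_kendall(series):
    """Return (S, Z) for the Mann-Kendall trend test, assuming no tied values."""
    n = len(series)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)    # sign of the pairwise difference
    var_s = n * (n - 1) * (2 * n + 5) / 18  # variance without tie correction
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    return s, z


def sens_slope(series):
    """Median of all pairwise slopes (x_j - x_i) / (j - i) for i < j."""
    n = len(series)
    return median(
        (series[j] - series[i]) / (j - i)
        for i in range(n - 1)
        for j in range(i + 1, n)
    )


# Hypothetical annual rainfall totals (mm), one value per year.
rain = [100, 104, 108, 112, 116, 120]
s_stat, z_stat = mann_kendall(rain)
slope = sens_slope(rain)
print(s_stat, round(z_stat, 2), slope)
```

    A |Z| above about 1.96 would indicate a significant monotonic trend at the 5% level under the usual normal approximation, while Sen's slope gives the trend magnitude per time step.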

  18. Temporal geospatial analysis of secondary school students’ examination performance

    Science.gov (United States)

    Nik Abd Kadir, ND; Adnan, NA

    2016-06-01

    Malaysia's Ministry of Education has improved the organization of its data by building a geographical information system (GIS) school database. However, no further analysis has been done using geospatial analysis tools. Mapping has emerged as a communication tool and has become an effective way to publish digital and statistical data such as school performance results. The objective of this study is to analyse secondary school student performance in science and mathematics in the Sijil Pelajaran Malaysia examination results from 2010 to 2014 for Kelantan's state schools with the aid of GIS software and geospatial analysis. School performance, expressed as school grade point average (GPA) from Grade A to Grade G, was interpolated and mapped, and query analysis using geospatial tools could then be carried out. This study will benefit the education sector in analysing student performance not only in Kelantan but across Malaysia, and publishing the results as maps is a good way to support better planning and decision making to prepare young Malaysians for the challenges of the education system and performance.

  19. The Review of Visual Analysis Methods of Multi-modal Spatio-temporal Big Data

    Directory of Open Access Journals (Sweden)

    ZHU Qing

    2017-10-01

    The visual analysis of spatio-temporal big data is not only the state-of-the-art research direction of both big data analysis and data visualization, but also the core module of the pan-spatial information system. This paper reviews existing visual analysis methods at three levels: descriptive visual analysis, explanatory visual analysis and exploratory visual analysis, focusing on the multi-source, multi-granularity, multi-modal and complex-association characteristics of spatio-temporal big data. The technical difficulties and development tendencies of multi-modal feature selection, innovative human-computer interaction analysis and exploratory visual reasoning in the visual analysis of spatio-temporal big data are discussed. Research shows that the study of descriptive visual analysis for data visualization is relatively mature. Explanatory visual analysis has become the focus of big data analysis; it is mainly based on interactive data mining in a visual environment to diagnose the implicit causes of problems. The exploratory visual analysis method still needs a major breakthrough.

  20. Patterns of Brucellosis Infection Symptoms in Azerbaijan: A Latent Class Cluster Analysis

    Directory of Open Access Journals (Sweden)

    Rita Ismayilova

    2014-01-01

    Brucellosis infection is a multisystem disease, with a broad spectrum of symptoms. We investigated the existence of clusters of infected patients according to their clinical presentation. Using national surveillance data from the Electronic-Integrated Disease Surveillance System, we applied a latent class cluster (LCC) analysis on symptoms to determine clusters of brucellosis cases. A total of 454 cases reported between July 2011 and July 2013 were analyzed. LCC identified a two-cluster model and the Vuong-Lo-Mendell-Rubin likelihood ratio supported the cluster model. Brucellosis cases in the second cluster (19%) reported higher percentages of poly-lymphadenopathy, hepatomegaly, arthritis, myositis, and neuritis and changes in liver function tests compared to cases of the first cluster. Patients in the second cluster had a severe brucellosis disease course and were associated with longer delay in seeking medical attention. Moreover, most of them were from Beylagan, a region focused on sheep and goat livestock production in south-central Azerbaijan. Patients in cluster 2 accounted for one-quarter of brucellosis cases and had a more severe clinical presentation. Delay in seeking medical care may explain severe illness. Future work needs to determine the factors that influence brucellosis case seeking and identify brucellosis species, particularly among cases from Beylagan.

  1. Evaluation of secular trend and the existence of cases of clusters of bladder cancer in Goiania: descriptive study population-based; Avaliacao da tendencia temporal e da existencia de casos de clusters de cancer de bexiga em Goiania: estudo descritivo de base populacional

    Energy Technology Data Exchange (ETDEWEB)

    Antonio, Gisele Guimaraes Daflon

    2008-07-01

    More than 20 years after the radiological accident with cesium-137 in the city of Goiania, there is still a feeling in the local population that the number of cases of cancer in the city is growing due to the past radiation exposure and that the number of people contaminated or exposed was higher than the number reported. The present study aims to evaluate the temporal trend and the space-time distribution of bladder cancer cases in Goiania from 1988 to 2003, taking into account that bladder cancer presents the highest risk coefficients per unit of radiation dose among solid cancers. The study population was composed of all incident cases of bladder cancer registered in the Population-Based Cancer Registry of Goiania between 1988 and 2003. The temporal trend of bladder cancer incidence was analyzed by sex and age group (<60 and ≥60 years of age) through polynomial regression using age-standardized incidence rates of bladder cancer (world population). SaTScan was used to determine whether statistically significant geographic clusters of high incidence of bladder cancer cases could be located in the city. The results showed a significant increase of bladder cancer incidence rates in males of all ages (p=0.025) and in the age group of 60 years and older (p=0.022), and a stable trend in females. In the space-time analysis, a cluster was identified, however without statistical significance (p=0.278), and its location has no relationship with the main focuses of contamination of the radiological accident in 1987. We concluded that the increase of incidence rates in males can be explained by the improvement in diagnostic procedures over time, this increase not yet being perceived in females given the small number of cases. As chance cannot be ruled out as the explanation of the identified cluster, we do not suggest any further detailed investigation of this cluster, as the occurrence of cluster diseases in space can occur

  2. Functional Analysis of the Fusarielin Biosynthetic Gene Cluster

    Directory of Open Access Journals (Sweden)

    Aida Droce

    2016-12-01

    Fusarielins are polyketides with a decalin core produced by various species of Aspergillus and Fusarium. Although the responsible gene cluster has been identified, the biosynthetic pathway remains to be elucidated. In the present study, members of the gene cluster were deleted individually in a Fusarium graminearum strain overexpressing the local transcription factor. The results suggest that a trans-acting enoyl reductase (FSL5) assists the polyketide synthase FSL1 in biosynthesis of a polyketide product, which is released by hydrolysis by a trans-acting thioesterase (FSL2). Deletion of the epimerase (FSL3) resulted in accumulation of an unstable compound, which could be the released product. A novel compound, named prefusarielin, accumulated in the deletion mutant of the cytochrome P450 monooxygenase FSL4. Unlike the known fusarielins from Fusarium, this compound does not contain oxygenized decalin rings, suggesting that FSL4 is responsible for the oxygenation.

  3. Nannoplankton from the Bombay-Saurashtra continental shelf of India: An appraisal using cluster analysis

    Digital Repository Service at National Institute of Oceanography (India)

    Guptha, M.V.S.; Nigam, R.

    Nannoplankton data from 28 stations in the northwestern continental shelf of India were subjected to Q-mode cluster analysis. Two biotopes, A and B, were identified. Although Gephyrocapsa oceanica was by far the most abundant species in both...

  4. Statistical Techniques Applied to Aerial Radiometric Surveys (STAARS): cluster analysis. National Uranium Resource Evaluation

    International Nuclear Information System (INIS)

    Pirkle, F.L.; Stablein, N.K.; Howell, J.A.; Wecksung, G.W.; Duran, B.S.

    1982-11-01

    One objective of the aerial radiometric surveys flown as part of the US Department of Energy's National Uranium Resource Evaluation (NURE) program was to ascertain the regional distribution of near-surface radioelement abundances. Some method for identifying groups of observations with similar radioelement values was therefore required. It is shown in this report that cluster analysis can identify such groups even when no a priori knowledge of the geology of an area exists. A method of convergent k-means cluster analysis coupled with a hierarchical cluster analysis is used to classify 6991 observations (three radiometric variables at each observation location) from the Precambrian rocks of the Copper Mountain, Wyoming, area. Another method, one that combines a principal components analysis with a convergent k-means analysis, is applied to the same data. These two methods are compared with a convergent k-means analysis that utilizes available geologic knowledge. All three methods identify four clusters. Three of the clusters represent background values for the Precambrian rocks of the area, and one represents outliers (anomalously high ²¹⁴Bi). A segmentation of the data corresponding to geologic reality as discovered by other methods has been achieved based solely on analysis of aerial radiometric data. The techniques employed are composites of classical clustering methods designed to handle the special problems presented by large data sets. 20 figures, 7 tables
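
    As a rough illustration of the k-means step mentioned above, the following sketch runs plain Lloyd-style k-means on observations with three variables each. It is not the report's "convergent k-means" procedure, whose details the abstract does not give; the function name, the starting centroids and the data values are hypothetical.

```python
import math


def kmeans(points, initial_centroids, max_iter=100):
    """Plain Lloyd-style k-means; stops when cluster assignments no longer change."""
    centroids = list(initial_centroids)  # copy so the caller's list is untouched
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids


# Hypothetical observations: three radiometric variables per location.
obs = [
    (1.0, 2.0, 1.0),
    (1.2, 1.8, 1.1),
    (0.9, 2.1, 0.9),
    (8.0, 2.0, 1.0),
    (8.3, 2.2, 1.2),
]
labels, centroids = kmeans(obs, [obs[0], obs[3]])
print(labels)
```

    With data like these, the last two observations separate from the first three because of their high first variable, which is the kind of outlier group the report describes for anomalously high values.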

  5. NeAT: a toolbox for the analysis of biological networks, clusters, classes and pathways

    Science.gov (United States)

    Brohée, Sylvain; Faust, Karoline; Lima-Mendez, Gipsi; Sand, Olivier; Janky, Rekin's; Vanderstocken, Gilles; Deville, Yves; van Helden, Jacques

    2008-01-01

    The network analysis tools (NeAT) (http://rsat.ulb.ac.be/neat/) provide a user-friendly web access to a collection of modular tools for the analysis of networks (graphs) and clusters (e.g. microarray clusters, functional classes, etc.). A first set of tools supports basic operations on graphs (comparison between two graphs, neighborhood of a set of input nodes, path finding and graph randomization). Another set of programs makes the connection between networks and clusters (graph-based clustering, cliques discovery and mapping of clusters onto a network). The toolbox also includes programs for detecting significant intersections between clusters/classes (e.g. clusters of co-expression versus functional classes of genes). NeAT are designed to cope with large datasets and provide a flexible toolbox for analyzing biological networks stored in various databases (protein interactions, regulation and metabolism) or obtained from high-throughput experiments (two-hybrid, mass-spectrometry and microarrays). The web interface interconnects the programs in predefined analysis flows, enabling to address a series of questions about networks of interest. Each tool can also be used separately by entering custom data for a specific analysis. NeAT can also be used as web services (SOAP/WSDL interface), in order to design programmatic workflows and integrate them with other available resources. PMID:18524799

  6. Spatio-temporal analysis of wetland change in Port Harcourt ...

    African Journals Online (AJOL)

    Descriptive statistics were employed for data analysis. Finding shows that wetlands decreased from 150.17km2 to 42.70km2 (-87.5%) between 1984 and 2015. Thick vegetation and waterbodies decreased by 35.6% and 41.48% respectively while built up area increased from 81.63km2 to 205.89km2 between 1984 and ...

  7. Schedulability Analysis and Optimization for the Synthesis of Multi-Cluster Distributed Embedded Systems

    DEFF Research Database (Denmark)

    Pop, Paul; Eles, Petru; Peng, Zebo

    2003-01-01

    We present an approach to schedulability analysis for the synthesis of multi-cluster distributed embedded systems consisting of time-triggered and event-triggered clusters, interconnected via gateways. We have also proposed a buffer size and worst case queuing delay analysis for the gateways......, responsible for routing inter-cluster traffic. Optimization heuristics for the priority assignment and synthesis of bus access parameters aimed at producing a schedulable system with minimal buffer needs have been proposed. Extensive experiments and a real-life example show the efficiency of our approaches....

  8. Schedulability Analysis and Optimization for the Synthesis of Multi-Cluster Distributed Embedded Systems

    DEFF Research Database (Denmark)

    Pop, Paul; Eles, Petru; Peng, Zebo

    2003-01-01

    An approach to schedulability analysis for the synthesis of multi-cluster distributed embedded systems consisting of time-triggered and event-triggered clusters, interconnected via gateways, is presented. A buffer size and worst case queuing delay analysis for the gateways, responsible for routing...... inter-cluster traffic, is also proposed. Optimisation heuristics for the priority assignment and synthesis of bus access parameters aimed at producing a schedulable system with minimal buffer needs have been proposed. Extensive experiments and a real-life example show the efficiency of the approaches....

  9. Clustering applications in financial and economic analysis of the crop production in the Russian regions

    Directory of Open Access Journals (Sweden)

    Gromov Vladislav Vladimirovich

    2013-08-01

    We used complex mathematical modeling, multivariate statistical analysis and fuzzy sets to analyze the financial and economic state of crop production in Russian regions. We developed a system of indicators describing the state of the agricultural sector in a region, based on the results of correlation, factor and cluster analysis and on statistics of the Federal State Statistics Service. We performed cluster analyses to divide the regions of Russia into five groups on the selected factors. Qualitative and quantitative characteristics of each cluster were obtained.

  10. Subtypes of autism by cluster analysis based on structural MRI data.

    Science.gov (United States)

    Hrdlicka, Michal; Dudova, Iva; Beranova, Irena; Lisy, Jiri; Belsan, Tomas; Neuwirth, Jiri; Komarek, Vladimir; Faladova, Ludvika; Havlovicova, Marketa; Sedlacek, Zdenek; Blatny, Marek; Urbanek, Tomas

    2005-05-01

    The aim of our study was to subcategorize Autistic Spectrum Disorders (ASD) using a multidisciplinary approach. Sixty-four autistic patients (mean age 9.4 ± 5.6 years) were entered into a cluster analysis. The clustering analysis was based on MRI data. The clusters obtained did not differ significantly in the overall severity of autistic symptomatology as measured by the total score on the Childhood Autism Rating Scale (CARS). The clusters could be characterized as showing significant differences. Cluster 1 showed the largest sizes of the genu and splenium of the corpus callosum (CC), the lowest pregnancy order and the lowest frequency of facial dysmorphic features. Cluster 2 showed the largest sizes of the amygdala and hippocampus (HPC), the least abnormal visual response on the CARS, the lowest frequency of epilepsy and the least frequent abnormal psychomotor development during the first year of life. Cluster 3 showed the largest sizes of the caput of the nucleus caudatus (NC) and the smallest sizes of the HPC, and facial dysmorphic features were always present. Cluster 4 showed the smallest sizes of the genu and splenium of the CC, as well as of the amygdala and caput of the NC, the most abnormal visual response on the CARS, the highest frequency of epilepsy and the highest pregnancy order; abnormal psychomotor development during the first year of life and facial dysmorphic features were always present. This multidisciplinary approach seems to be a promising method for subtyping autism.

  11. Fault Tree Analysis with Temporal Gates and Model Checking Technique for Qualitative System Safety Analysis

    International Nuclear Information System (INIS)

    Koh, Kwang Yong; Seong, Poong Hyun

    2010-01-01

    Fault tree analysis (FTA) is one of the most widely used safety analysis techniques in the nuclear industry, but it suffers from several drawbacks: it uses only static gates and hence cannot precisely capture the dynamic behaviors of a complex system; it lacks rigorous semantics; and the reasoning process, which checks whether basic events really cause top events, is done manually and is hence very labor-intensive and time-consuming for complex systems. Although several attempts have been made to overcome these problems, they still cannot model absolute or actual time because they adopt a relative time concept and can capture only sequential behaviors of the system. In this work, to resolve these problems, FTA and model checking are integrated to provide formal, automated and qualitative assistance to informal and/or quantitative safety analysis. Our approach proposes to build a formal model of the system together with fault trees. We introduce several temporal gates based on timed computation tree logic (TCTL) to capture absolute time behaviors of the system and to give concrete semantics to fault tree gates to reduce errors during the analysis, and use a model checking technique to automate the reasoning process of FTA.

  12. Identifying Subgroups of Tinnitus Using Novel Resting State fMRI Biomarkers and Cluster Analysis

    Science.gov (United States)

    2017-10-13

    applied to the resting-state data to identify tinnitus subgroups within the patient population and pair them with specific behavioral ...and behavioral data  Specific Aim 2: Determine tinnitus subgroups using automated cluster analysis of resting state data and associate the subgroups...data analysis and clustering method previously developed to apply to current tinnitus data set o Percentage of completion at end of Year 2 (24 months

  13. Graph Based Models for Unsupervised High Dimensional Data Clustering and Network Analysis

    Science.gov (United States)

    2015-01-01

    discussion of its application to the network of network scientists. Each partitioning step in this spectral scheme either bipartitions or tripartitions a...University of California Los Angeles Graph Based Models for Unsupervised High Dimensional Data Clustering and Network Analysis A dissertation...00-00-2015 to 00-00-2015 4. TITLE AND SUBTITLE Graph Based Models for Unsupervised High Dimensional Data Clustering and Network Analysis 5a

  14. Statistical parametric mapping analysis of the relationship between regional cerebral blood flow and symptom clusters of the depressive mood in patients with pre-dialytic chronic kidney disease

    International Nuclear Information System (INIS)

    Kim, Seong-Jang; Song, Sang Heon; Kim, Ji Hoon; Kwak, Ihm Soo

    2008-01-01

    The aim of this study is to investigate the relationship between regional cerebral blood flow (rCBF) and symptom clusters of depressive mood in pre-dialytic chronic kidney disease (CKD). Twenty-seven patients with stage 4-5 CKD were subjected to statistical parametric mapping analysis of brain single-photon emission computed tomography. Correlation analyses between separate symptom clusters of depressive mood and rCBF were done. The first factor (depressive mood) was negatively correlated with rCBF in the right insula, posterior cingulate gyrus, and left superior temporal gyrus, and positively correlated with rCBF in the left fusiform gyrus. The second factor (insomnia) was negatively correlated with rCBF in the right middle frontal gyrus, bilateral cingulate gyri, right insula, right putamen, and right inferior parietal lobule, and positively correlated with rCBF in left fusiform gyrus and bilateral cerebellar tonsils. The third factor (anxiety and psychomotor aspects) was negatively correlated with rCBF in the left inferior frontal gyrus, right superior frontal gyrus, right middle temporal gyrus, right superior temporal gyrus, and left superior frontal gyrus, and positively correlated with rCBF in the right lingual gyrus and right parahippocampal gyrus. In this study, the separate symptom clusters were correlated with specific rCBF patterns similar to those in major depressive disorder patients without CKD. However, some areas with discordant rCBF patterns were also noted when compared with major depressive disorder patients. Further larger scale investigations are needed. (author)

  15. FLOCK cluster analysis of mast cell event clustering by high-sensitivity flow cytometry predicts systemic mastocytosis.

    Science.gov (United States)

    Dorfman, David M; LaPlante, Charlotte D; Pozdnyakova, Olga; Li, Betty

    2015-11-01

    In our high-sensitivity flow cytometric approach for systemic mastocytosis (SM), we identified mast cell event clustering as a new diagnostic criterion for the disease. To objectively characterize mast cell gated event distributions, we performed cluster analysis using FLOCK, a computational approach to identify cell subsets in multidimensional flow cytometry data in an unbiased, automated fashion. FLOCK identified discrete mast cell populations in most cases of SM (56/75 [75%]) but only a minority of non-SM cases (17/124 [14%]). FLOCK-identified mast cell populations accounted for 2.46% of total cells on average in SM cases and 0.09% of total cells on average in non-SM cases (P < .0001) and were predictive of SM, with a sensitivity of 75%, a specificity of 86%, a positive predictive value of 76%, and a negative predictive value of 85%. FLOCK analysis provides useful diagnostic information for evaluating patients with suspected SM, and may be useful for the analysis of other hematopoietic neoplasms. Copyright© by the American Society for Clinical Pathology.
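
    The diagnostic figures quoted above follow from the stated counts (56 of 75 SM cases and 17 of 124 non-SM cases with FLOCK-identified populations). The sketch below recomputes them from a 2x2 table; the function name is a hypothetical illustration, and the only inputs are the counts given in the abstract.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Standard 2x2 diagnostic measures, returned as fractions between 0 and 1."""
    return {
        "sensitivity": tp / (tp + fn),  # positives detected among diseased
        "specificity": tn / (tn + fp),  # negatives detected among non-diseased
        "ppv": tp / (tp + fp),          # diseased among test positives
        "npv": tn / (tn + fn),          # non-diseased among test negatives
    }


# Counts taken from the abstract: 56/75 SM cases and 17/124 non-SM cases positive.
metrics = diagnostic_metrics(tp=56, fn=75 - 56, fp=17, tn=124 - 17)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

    Sensitivity, specificity and negative predictive value computed this way round to the reported 75%, 86% and 85%. The positive predictive value comes out at about 76.7%, close to the 76% given in the abstract.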

  16. PRINCIPAL COMPONENT ANALYSIS AND CLUSTER ANALYSIS IN MULTIVARIATE ASSESSMENT OF WATER QUALITY

    Directory of Open Access Journals (Sweden)

    Elzbieta Radzka

    2017-03-01

    This paper deals with the use of multivariate methods in drinking water analysis. During a five-year project, from 2008 to 2012, selected chemical parameters in 11 water supply networks of the Siedlce County were studied. Throughout that period drinking water was of satisfactory quality, with only iron and manganese ions exceeding the limits (21 times and 12 times, respectively). In accordance with the results of cluster analysis, all water networks were put into three groups of different water quality. A high concentration of chlorides, sulphates, and manganese and a low concentration of copper and sodium were found in the water of Group 1 supply networks. The water in Group 2 had a high concentration of copper and sodium, and a low concentration of iron and sulphates. The water from Group 3 had a low concentration of chlorides and manganese, but a high concentration of fluorides. Using principal component analysis and cluster analysis, multivariate correlation between the studied parameters was determined, helping to put water supply networks into groups according to similar water quality.

  17. Identification of temporal variations in mental workload using locally-linear-embedding-based EEG feature reduction and support-vector-machine-based clustering and classification techniques.

    Science.gov (United States)

    Yin, Zhong; Zhang, Jianhua

    2014-07-01

    Identifying the abnormal changes of mental workload (MWL) over time is quite crucial for preventing the accidents due to cognitive overload and inattention of human operators in safety-critical human-machine systems. It is known that various neuroimaging technologies can be used to identify the MWL variations. In order to classify MWL into a few discrete levels using representative MWL indicators and small-sized training samples, a novel EEG-based approach by combining locally linear embedding (LLE), support vector clustering (SVC) and support vector data description (SVDD) techniques is proposed and evaluated by using the experimentally measured data. The MWL indicators from different cortical regions are first elicited by using the LLE technique. Then, the SVC approach is used to find the clusters of these MWL indicators and thereby to detect MWL variations. It is shown that the clusters can be interpreted as the binary class MWL. Furthermore, a trained binary SVDD classifier is shown to be capable of detecting slight variations of those indicators. By combining the two schemes, a SVC-SVDD framework is proposed, where the clear-cut (smaller) cluster is detected by SVC first and then a subsequent SVDD model is utilized to divide the overlapped (larger) cluster into two classes. Finally, three-class MWL levels (low, normal and high) can be identified automatically. The experimental data analysis results are compared with those of several existing methods. It has been demonstrated that the proposed framework can lead to acceptable computational accuracy and has the advantages of both unsupervised and supervised training strategies. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.

  18. Analysis of temporal fluctuations in Bach’s sinfonias

    Science.gov (United States)

    Telesca, Luciano; Lovallo, Michele

    2012-06-01

    The correlation structures in 15 Bach's sinfonias were analyzed. Each sinfonia is characterized by the superposition of three voices. Each voice is a sequence of pitches. Each voice was transformed into a time series, in which the sampling time was given by the smallest pitch duration in that voice. The scaling properties of the three voices of each sinfonia were quantified by means of the estimate of the scaling exponent, performed using the power spectral density (PSD) and the detrended fluctuation analysis (DFA). The results show that the voice time series are persistent. The DFA was applied not only to any single voice time series, but also to couples (2-DFA) of voices and to the triple (3-DFA) of voices. It was found that the first voice of each sinfonia modulates the scaling behavior of the whole sinfonia.

  19. Sweet Spot Size in Virtual Sound Reproduction: A Temporal Analysis

    DEFF Research Database (Denmark)

    Lacouture Parodi, Yesenia; Rubak, Per

    2009-01-01

    The influence of head misalignments on the performance of binaural reproduction systems through loudspeakers is often evaluated in the frequency domain. The changes in magnitude give us an idea of how much of the crosstalk is leaked into the direct signal and therefore a sweet spot performance can......-correlation we estimate the interaural time delay and define a sweet spot. The analysis is based on measurements carried out on 21 different loudspeaker configurations, including two- and four-channels arrangements. Results show that closely spaced loudspeakers are more robust to lateral displacements than wider...... span angles. Additionally, the sweet spot as a function of head rotations increases systematically when the loudspeakers are placed at elevated positions....

  20. Statistical cluster analysis of the British Thoracic Society Severe refractory Asthma Registry: clinical outcomes and phenotype stability.

    Directory of Open Access Journals (Sweden)

    Chris Newby

    Severe refractory asthma is a heterogeneous disease. We sought to determine statistical clusters from the British Thoracic Society Severe refractory Asthma Registry and to examine cluster-specific outcomes and stability. Factor analysis and statistical cluster modelling was undertaken to determine the number of clusters and their membership (N = 349). Cluster-specific outcomes were assessed after a median follow-up of 3 years. A classifier was programmed to determine cluster stability and was validated in an independent cohort of new patients recruited to the registry (n = 245). Five clusters were identified. Cluster 1 (34%) were atopic with early onset disease, cluster 2 (21%) were obese with late onset disease, cluster 3 (15%) had the least severe disease, cluster 4 (15%) were the eosinophilic with late onset disease and cluster 5 (15%) had significant fixed airflow obstruction. At follow-up, the proportion of subjects treated with oral corticosteroids increased in all groups with an increase in body mass index. Exacerbation frequency decreased significantly in clusters 1, 2 and 4 and was associated with a significant fall in the peripheral blood eosinophil count in clusters 2 and 4. Stability of cluster membership at follow-up was 52% for the whole group with stability being best in cluster 2 (71%) and worst in cluster 4 (25%). In an independent validation cohort, the classifier identified the same 5 clusters with similar patient distribution and characteristics. Statistical cluster analysis can identify distinct phenotypes with specific outcomes. Cluster membership can be determined using a classifier, but when treatment is optimised, cluster stability is poor.

  1. Suicide in the oldest old: an observational study and cluster analysis.

    Science.gov (United States)

    Sinyor, Mark; Tan, Lynnette Pei Lin; Schaffer, Ayal; Gallagher, Damien; Shulman, Kenneth

    2016-01-01

    The older population are at a high risk for suicide. This study sought to learn more about the characteristics of suicide in the oldest-old and to use a cluster analysis to determine if oldest-old suicide victims assort into clinically meaningful subgroups. Data were collected from a coroner's chart review of suicide victims in Toronto from 1998 to 2011. We compared two age groups (65-79 year olds, n = 335, and 80+ year olds, n = 191) and then conducted a hierarchical agglomerative cluster analysis using Ward's method to identify distinct clusters in the 80+ group. The younger and older age groups differed according to marital status, living circumstances and pattern of stressors. The cluster analysis identified three distinct clusters in the 80+ group. Cluster 1 was the largest (n = 124) and included people who were either married or widowed who had significantly more depression and somewhat more medical health stressors. In contrast, cluster 2 (n = 50) comprised people who were almost all single and living alone with significantly less identified depression and slightly fewer medical health stressors. All members of cluster 3 (n = 17) lived in a retirement residence or nursing home, and this group had the highest rates of depression, dementia, other mental illness and past suicide attempts. This is the first study to use the cluster analysis technique to identify meaningful subgroups among suicide victims in the oldest-old. The results reveal different patterns of suicide in the older population that may be relevant for clinical care. Copyright © 2015 John Wiley & Sons, Ltd.
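
    Hierarchical agglomerative clustering with Ward's method, as named in this abstract, repeatedly merges the two clusters whose union adds the least to the within-cluster sum of squares. The sketch below illustrates that merge rule only. The function name and the toy coordinates are hypothetical, and it does not reproduce the study's variables or software.

```python
def ward_clusters(points, k):
    """Greedy agglomerative clustering with Ward's merge criterion."""
    clusters = [[p] for p in points]  # start with one cluster per point

    def centroid(cluster):
        dim = len(cluster[0])
        return [sum(p[d] for p in cluster) / len(cluster) for d in range(dim)]

    def merge_cost(a, b):
        # Increase in within-cluster sum of squares if a and b are merged.
        ca, cb = centroid(a), centroid(b)
        dist2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
        return len(a) * len(b) / (len(a) + len(b)) * dist2

    while len(clusters) > k:
        pairs = [(i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        i, j = min(pairs, key=lambda ij: merge_cost(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]  # j > i, so index i stays valid
        del clusters[j]
    return clusters

# Hypothetical two-dimensional toy data with three obvious groups.
toy = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (10.0, 0.0), (10.1, 0.3)]
groups = ward_clusters(toy, k=3)
```

    This naive version recomputes every pairwise cost at each step, which is acceptable for a few hundred records but not for large data sets.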

  2. Clustered Xenopus keratin genes: A genomic, transcriptomic, and proteomic analysis.

    Science.gov (United States)

    Suzuki, Ken-Ichi T; Suzuki, Miyuki; Shigeta, Mitsuki; Fortriede, Joshua D; Takahashi, Shuji; Mawaribuchi, Shuuji; Yamamoto, Takashi; Taira, Masanori; Fukui, Akimasa

    2017-06-15

    Keratin genes belong to the intermediate filament superfamily and their expression is altered following morphological and physiological changes in vertebrate epithelial cells. Keratin genes are divided into two groups, type I and II, and are clustered on vertebrate genomes, including those of Xenopus species. Various keratin genes have been identified and characterized by their unique expression patterns throughout ontogeny in Xenopus laevis; however, compilation of previously reported and newly identified keratin genes in two Xenopus species is required for our further understanding of keratin gene evolution, not only in amphibians but also in all terrestrial vertebrates. In this study, 120 putative type I and II keratin genes in total were identified based on the genome data from two Xenopus species. We revealed that most of these genes are highly clustered on two homeologous chromosomes, XLA9_10 and XLA2 in X. laevis, and XTR10 and XTR2 in X. tropicalis, which are orthologous to those of human, showing conserved synteny among tetrapods. RNA-Seq data from various embryonic stages and adult tissues highlighted the unique expression profiles of orthologous and homeologous keratin genes in developmental stage- and tissue-specific manners. Moreover, we identified dozens of epidermal keratin proteins from the whole embryo, larval skin, tail, and adult skin using shotgun proteomics. In light of our results, we discuss the radiation, diversification, and unique expression of the clustered keratin genes, which are closely related to epidermal development and terrestrial adaptation during amphibian evolution, including Xenopus speciation. Copyright © 2016 Elsevier Inc. All rights reserved.

  3. Clustering-based analysis for residential district heating data

    DEFF Research Database (Denmark)

    Gianniou, Panagiota; Liu, Xiufeng; Heller, Alfred

    2018-01-01

    The wide use of smart meters enables collection of a large amount of fine-granular time series, which can be used to improve the understanding of consumption behavior and used for consumption optimization. This paper presents a clustering-based knowledge discovery in databases method to analyze residential heating consumption data and evaluate information included in national building databases. The proposed method uses the K-means algorithm to segment consumption groups based on consumption intensity and representative patterns and ranks the groups according to daily consumption. This paper also...
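
    The K-means segmentation and ranking described in this record can be illustrated on scalar daily consumption values. This is a simplified sketch, not the paper's pipeline: the readings, the function name and the deterministic initialisation are assumptions made for the example, and the paper clusters on consumption intensity and representative patterns rather than on a single number per household.

```python
def kmeans_1d(values, k, iterations=50):
    """Plain Lloyd's algorithm on scalar values with a deterministic start."""
    ordered = sorted(values)
    # Hypothetical initialisation: spread the starting centres over the sorted data.
    centres = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centres[c]))
            groups[nearest].append(v)
        # Keep the old centre if a group happens to be empty.
        updated = [sum(g) / len(g) if g else centres[i] for i, g in enumerate(groups)]
        if updated == centres:
            break
        centres = updated
    return centres, groups

# Made-up daily heating consumption per household, in kWh.
daily_kwh = [12.0, 14.5, 13.2, 30.1, 28.7, 31.4, 55.0, 57.3, 52.8]
centres, groups = kmeans_1d(daily_kwh, k=3)

# Rank the consumption groups from highest to lowest mean daily consumption.
ranked = sorted(zip(centres, groups), reverse=True)
```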

  4. Communication Base Station Log Analysis Based on Hierarchical Clustering

    Directory of Open Access Journals (Sweden)

    Zhang Shao-Hua

    2017-01-01

    Communication base stations generate massive amounts of data every day, and these base station logs are valuable for mining business circles. This paper uses data mining technology and a hierarchical clustering algorithm to delimit business circles from the data recorded by these base stations. By extracting features from the data of different business circles and comparing the characteristics of each business circle category, operators can choose a suitable area for commercial marketing.

  5. A cluster analysis of Basic Personality Inventory (BPI) adolescent profiles.

    Science.gov (United States)

    Bonynge, E R

    1994-03-01

    Basic Personality Inventory profiles of 95 male and 118 female adolescent admissions to a crisis intervention unit were subjected to a cluster analytic procedure. For both males and females, four subgroups were identified: Mental Health Maladjustment, Interpersonal Maladjustment, High-risk Rebellion, and Adjustment. Subgroups differed significantly on alternative markers of psychopathology (SCL-90-R and Diagnoses). Subgroups identified were consistent with groupings identified previously. The subgroups also corresponded with broad-band syndromes that are conventional within the literature on adolescent psychopathology. Subgroup characteristics and implications for adolescent assessment are discussed.

  6. Student academic performance analysis using fuzzy C-means clustering

    Science.gov (United States)

    Rosadi, R.; Akamal; Sudrajat, R.; Kharismawan, B.; Hambali, Y. A.

    2017-01-01

    Grade Point Average (GPA) is commonly used as an indicator of academic performance. Academic performance evaluation is a basic way to track the progression of student performance. When evaluating students' academic performance, there are occasions where the student data are grouped, especially when the amount of data is large, so that the pattern of data relationships within and among groups can be revealed. Grouping data can be done with a clustering method, one of which is the Fuzzy C-Means algorithm. This algorithm is then applied to a set of student data from the Faculty of Mathematics and Natural Sciences, Padjadjaran University.
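
    Unlike K-means, Fuzzy C-Means gives each record a degree of membership in every cluster. The sketch below shows the standard membership update for a single scalar value. The GPA centres, the fuzzifier m = 2 and the function name are illustrative assumptions, not values taken from the paper.

```python
def fcm_memberships(value, centres, m=2.0):
    """Fuzzy C-Means membership of one scalar value in each cluster centre."""
    distances = [abs(value - c) for c in centres]
    if 0.0 in distances:
        # The value coincides with a centre: give it full membership there.
        hits = distances.count(0.0)
        return [1.0 / hits if d == 0.0 else 0.0 for d in distances]
    power = 2.0 / (m - 1.0)
    # u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1))
    return [1.0 / sum((d_i / d_k) ** power for d_k in distances) for d_i in distances]

gpa_centres = [2.0, 3.0, 3.8]  # hypothetical cluster centres
u = fcm_memberships(3.2, gpa_centres)
```

    The memberships always sum to one, and a student close to one centre receives most of the weight there while keeping a small share in the others.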

  7. Application of cluster analysis to geochemical compositional data for identifying ore-related geochemical anomalies

    Science.gov (United States)

    Zhou, Shuguang; Zhou, Kefa; Wang, Jinlin; Yang, Genfang; Wang, Shanshan

    2017-12-01

    Cluster analysis is a well-known technique that is used to analyze various types of data. In this study, cluster analysis is applied to geochemical data that describe 1444 stream sediment samples collected in northwestern Xinjiang with a sample spacing of approximately 2 km. Three algorithms (the hierarchical, k-means, and fuzzy c-means algorithms) and six data transformation methods (the z-score standardization, ZST; the logarithmic transformation, LT; the additive log-ratio transformation, ALT; the centered log-ratio transformation, CLT; the isometric log-ratio transformation, ILT; and no transformation, NT) are compared in terms of their effects on the cluster analysis of the geochemical compositional data. The study shows that, on the one hand, the ZST does not affect the results of column- or variable-based (R-type) cluster analysis, whereas the other methods, including the LT, the ALT, and the CLT, have substantial effects on the results. On the other hand, the results of the row- or observation-based (Q-type) cluster analysis obtained from the geochemical data after applying NT and the ZST are relatively poor. However, we derive some improved results from the geochemical data after applying the CLT, the ILT, the LT, and the ALT. Moreover, the k-means and fuzzy c-means clustering algorithms are more reliable than the hierarchical algorithm when they are used to cluster the geochemical data. We apply cluster analysis to the geochemical data to explore for Au deposits within the study area, and we obtain a good correlation between the results retrieved by combining the CLT or the ILT with the k-means or fuzzy c-means algorithms and the potential zones of Au mineralization. Therefore, we suggest that the combination of the CLT or the ILT with the k-means or fuzzy c-means algorithms is an effective tool to identify potential zones of mineralization from geochemical data.
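
    The centered log-ratio transformation (CLT) that this study recommends subtracts the log of the geometric mean from the log of each component. The sketch below assumes strictly positive concentrations, and the sample values are made up for illustration.

```python
import math

def clr(composition):
    """Centered log-ratio transform of one strictly positive composition."""
    logs = [math.log(x) for x in composition]
    mean_log = sum(logs) / len(logs)  # log of the geometric mean
    return [value - mean_log for value in logs]

sample = [0.60, 0.30, 0.10]  # hypothetical three-element composition
transformed = clr(sample)
```

    The transformed values sum to zero and do not depend on the measurement unit, which is one reason log-ratio transforms are used before clustering compositional data.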

  8. Design and analysis of clinical trials with clustering effects due to treatment.

    Science.gov (United States)

    Roberts, Chris; Roberts, Stephen A

    2005-01-01

    Where patients receive therapy as a group, there are good theoretical reasons to believe that variation in the outcome will be smaller for patients treated in the same group than for patients treated in different groups. Similarly, where different therapists treat different groups of patients, outcome for patients treated by the same therapist may differ less than outcome for patients treated by different therapists. Clinical trials evaluating such therapies need to consider this potential lack of independence. As with cluster-randomized trials, this has implications for the precision of treatment effects estimates and statistical power. There are nevertheless differences between clustering due to the organization of treatment and that due to randomization. In cluster-randomized trials the distribution of cluster sizes in each treatment arm should be similar as a consequence of randomization unless there is differential loss to follow-up. With clustering due to therapy group or therapist, cluster size may differ systematically between treatment arms, due to size of therapy groups or differing health professional caseload. Intra-cluster correlation may also differ between treatment arms. The implications of differential cluster size and intracluster correlation for design and analysis will be illustrated by data from two trials, the first comparing nurse practitioner care with general practitioner care, and the second comparing a group therapy with individual treatment as usual. The special case where a group therapy or therapist is compared with an unclustered treatment is examined in detail using a simulation study. The implications of differential clustering effects for sample size and power are addressed. It is argued that the design and analysis of this type of trial should take account of possible heterogeneity in cluster size and intracluster correlation.
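
    The effect of clustering in only one arm can be illustrated with the standard design effect for equal-sized clusters, 1 + (m - 1) × ICC. This is a textbook simplification rather than the authors' own analysis, which also allows for varying cluster sizes. The trial numbers below are hypothetical.

```python
def design_effect(cluster_size, icc):
    """Variance inflation for equal-sized clusters: 1 + (m - 1) * ICC."""
    return 1.0 + (cluster_size - 1) * icc

def effective_sample_size(n_patients, cluster_size, icc):
    """Number of independent patients that carry the same information."""
    return n_patients / design_effect(cluster_size, icc)

# Hypothetical trial: a group therapy arm (groups of 8, ICC 0.05) versus
# an individually treated arm with no clustering.
therapy_arm = effective_sample_size(120, cluster_size=8, icc=0.05)
control_arm = effective_sample_size(120, cluster_size=1, icc=0.0)
```

    With these assumed values the clustered arm contributes roughly 89 effective patients against 120 in the unclustered arm, so the two arms differ in precision even though they recruit the same number of patients.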

  9. Temporal analysis of social networks using three-way DEDICOM.

    Energy Technology Data Exchange (ETDEWEB)

    Bader, Brett William; Harshman, Richard A. (University of Ontario, London, Ontario, Canada); Kolda, Tamara Gibson (Sandia National Laboratories, Livermore, CA)

    2006-06-01

    DEDICOM is an algebraic model for analyzing intrinsically asymmetric relationships, such as the balance of trade among nations or the flow of information among organizations or individuals. It provides information on latent components in the data that can be regarded as "properties" or "aspects" of the objects, and it finds a few patterns that can be combined to describe many relationships among these components. When we apply this technique to adjacency matrices arising from directed graphs, we obtain a smaller graph that gives an idealized description of its patterns. Three-way DEDICOM is a higher-order extension of the model that has certain uniqueness properties. It allows for a third mode of the data, such as time, and permits the analysis of semantic graphs. We present an improved algorithm for computing three-way DEDICOM on sparse data and demonstrate it by applying it to the adjacency tensor of a semantic graph with time-labeled edges. Our application uses the Enron email corpus, from which we construct a semantic graph corresponding to email exchanges among Enron personnel over a series of 44 months. Meaningful patterns are recovered in which the representation of asymmetries adds insight into the social networks at Enron.

  10. Spatial temporal analysis of urban heat hazard in Tangerang City

    Science.gov (United States)

    Wibowo, Adi; Kuswantoro; Ardiansyah; Rustanto, Andry; Putut Ash Shidiq, Iqbal

    2016-11-01

    Urban heat is a natural phenomenon which might be caused by human activities. The human activities were represented by various types of land use, such as urban and non-urban areas. The aim of this study is to identify the urban heat behavior in Tangerang City, as it might threaten the urban environment. This study used three types of remote sensing data, namely Landsat TM, Landsat ETM+ and Landsat OLI-TIRS, to capture the urban heat behavior and to analyze the urban heat signature of Tangerang City in 2001, 2012, 2013, 2014, 2015 and 2016. The result showed that the urban heat signature changes dynamically each month based on the sun radiation. The urban heat island covered only a small part of Tangerang City in 2001, but it increased significantly and reached 50% of the area in 2012. Based on the urban heat signature, the threshold for a threatening condition is 30 °C, as recognized from land surface temperature (LST). The effective temperature (ET) index describes that condition as warm and uncomfortable, with increased stress due to sweating and blood flow, and possibly causing cardiovascular disorder.

  11. Detection of secondary structure elements in proteins by hydrophobic cluster analysis.

    Science.gov (United States)

    Woodcock, S; Mornon, J P; Henrissat, B

    1992-10-01

    Hydrophobic cluster analysis (HCA) is a protein sequence comparison method based on alpha-helical representations of the sequences where the size, shape and orientation of the clusters of hydrophobic residues are primarily compared. The effectiveness of HCA has been suggested to originate from its potential ability to focus on the residues forming the hydrophobic core of globular proteins. We have addressed the robustness of the bidimensional representation used for HCA in its ability to detect the regular secondary structure elements of proteins. Various parameters have been studied such as those governing cluster size and limits, the hydrophobic residues constituting the clusters as well as the potential shift of the cluster positions with respect to the position of the regular secondary structure elements. The following results have been found to support the alpha-helical bidimensional representation used in HCA: (i) there is a positive correlation (clearly above background noise) between the hydrophobic clusters and the regular secondary structure elements in proteins; (ii) the hydrophobic clusters are centred on the regular secondary structure elements; (iii) the pitch of the helical representation which gives the best correspondence is that of an alpha-helix. The correspondence between hydrophobic clusters and regular secondary structure elements suggests a way to implement variable gap penalties during the automatic alignment of protein sequences.

  12. Point Cluster Analysis Using a 3D Voronoi Diagram with Applications in Point Cloud Segmentation

    Directory of Open Access Journals (Sweden)

    Shen Ying

    2015-08-01

    Three-dimensional (3D) point analysis and visualization is one of the most effective methods of point cluster detection and segmentation in geospatial datasets. However, serious scattering and clotting characteristics interfere with the visual detection of 3D point clusters. To overcome this problem, this study proposes the use of 3D Voronoi diagrams to analyze and visualize 3D points instead of the original data item. The proposed algorithm computes the cluster of 3D points by applying a set of 3D Voronoi cells to describe and quantify 3D points. The decompositions of point cloud of 3D models are guided by the 3D Voronoi cell parameters. The parameter values are mapped from the Voronoi cells to 3D points to show the spatial pattern and relationships; thus, a 3D point cluster pattern can be highlighted and easily recognized. To capture different cluster patterns, continuous progressive clusters and segmentations are tested. The 3D spatial relationship is shown to facilitate cluster detection. Furthermore, the generated segmentations of real 3D data cases are exploited to demonstrate the feasibility of our approach in detecting different spatial clusters for continuous point cloud segmentation.

  13. Cluster Cooperation in Wireless-Powered Sensor Networks: Modeling and Performance Analysis

    Directory of Open Access Journals (Sweden)

    Chao Zhang

    2017-09-01

    A wireless-powered sensor network (WPSN) consisting of one hybrid access point (HAP), a near cluster and the corresponding far cluster is investigated in this paper. These sensors are wireless-powered and they transmit information by consuming the harvested energy from signal ejected by the HAP. Sensors are able to harvest energy as well as store the harvested energy. We propose that if sensors in near cluster do not have their own information to transmit, acting as relays, they can help the sensors in a far cluster to forward information to the HAP in an amplify-and-forward (AF) manner. We use a finite Markov chain to model the dynamic variation process of the relay battery, and give a general analyzing model for WPSN with cluster cooperation. Through the model, we deduce the closed-form expression for the outage probability as the metric of this network. Finally, simulation results validate the start point of designing this paper and correctness of theoretical analysis and show how parameters have an effect on system performance. Moreover, it is also known that the outage probability of sensors in far cluster can be drastically reduced without sacrificing the performance of sensors in near cluster if the transmit power of HAP is fairly high. Furthermore, in the aspect of outage performance of far cluster, the proposed scheme significantly outperforms the direct transmission scheme without cooperation.

  14. Cluster Cooperation in Wireless-Powered Sensor Networks: Modeling and Performance Analysis.

    Science.gov (United States)

    Zhang, Chao; Zhang, Pengcheng; Zhang, Weizhan

    2017-09-27

    A wireless-powered sensor network (WPSN) consisting of one hybrid access point (HAP), a near cluster and the corresponding far cluster is investigated in this paper. These sensors are wireless-powered and they transmit information by consuming the harvested energy from signal ejected by the HAP. Sensors are able to harvest energy as well as store the harvested energy. We propose that if sensors in near cluster do not have their own information to transmit, acting as relays, they can help the sensors in a far cluster to forward information to the HAP in an amplify-and-forward (AF) manner. We use a finite Markov chain to model the dynamic variation process of the relay battery, and give a general analyzing model for WPSN with cluster cooperation. Through the model, we deduce the closed-form expression for the outage probability as the metric of this network. Finally, simulation results validate the start point of designing this paper and correctness of theoretical analysis and show how parameters have an effect on system performance. Moreover, it is also known that the outage probability of sensors in far cluster can be drastically reduced without sacrificing the performance of sensors in near cluster if the transmit power of HAP is fairly high. Furthermore, in the aspect of outage performance of far cluster, the proposed scheme significantly outperforms the direct transmission scheme without cooperation.

  15. Using the Cluster Analysis and the Principal Component Analysis in Evaluating the Quality of a Destination

    Directory of Open Access Journals (Sweden)

    Ida Vajčnerová

    2016-01-01

    The objective of the paper is to explore possibilities of evaluating the quality of a tourist destination by means of principal component analysis (PCA) and cluster analysis. In the paper both types of analysis are compared on the basis of the results they provide. The aim is to identify the advantages and limits of both methods and to provide methodological suggestions for their further use in tourism research. The analysis is based on primary data from a survey of customers' satisfaction with the key quality factors of a destination. The output of the two statistical methods is the creation of groups or clusters of quality factors that are similar in terms of respondents' evaluations, in order to facilitate the evaluation of the quality of tourist destinations. The results show that both tested methods can be used. The paper was prepared within a wider research project that aims to develop a methodology for the quality evaluation of tourist destinations, especially in the context of customer satisfaction and loyalty.

  16. Crowd Analysis by Using Optical Flow and Density Based Clustering

    DEFF Research Database (Denmark)

    Santoro, Francesco; Pedro, Sergio; Tan, Zheng-Hua

    2010-01-01

    In this paper, we present a system to detect and track crowds in a video sequence captured by a camera. In a first step, we compute optical flows by means of pyramidal Lucas-Kanade feature tracking. Afterwards, a density based clustering is used to group similar vectors. In the last step, a crowd tracker is applied in every frame, allowing us to detect and track the crowds. Our system gives the output as a graphic overlay, i.e. it adds arrows and colors to the original frame sequence, in order to identify crowds and their movements. For the evaluation, we check when our system detect certains...

  17. Fuzzy subtractive clustering based prediction model for brand association analysis

    Directory of Open Access Journals (Sweden)

    Widodo Imam Djati

    2018-01-01

    The brand is one of the crucial elements that determine the success of a product. When choosing a product, consumers always consider product attributes (such as features, shape, and color); however, they also consider the brand. A brand guides someone to associate a product with specific attributes and qualities. This study was designed to identify the product attributes and predict brand performance from those attributes. A survey was run to obtain the attributes affecting the brand. Fuzzy subtractive clustering (FSC) was used to classify and predict product brand association based on aspects of the product under investigation. The result indicates that five attributes, namely shape, ease, image, quality and price, can be used to classify and predict the brand. The training step gives the best FSC model with radii (ra) = 0.1. It develops 70 clusters/rules with a training MSE of 9.7093e-016. Using 14 testing data, the model predicts the brand very well (close to the target), with an MSE of 0.6005 and an accuracy rate of 71%.

  18. Cluster and principal component analysis based on SSR markers of Amomum tsao-ko in Jinping County of Yunnan Province

    Science.gov (United States)

    Ma, Mengli; Lei, En; Meng, Hengling; Wang, Tiantao; Xie, Linyan; Shen, Dong; Xianwang, Zhou; Lu, Bingyue

    2017-08-01

    Amomum tsao-ko is a commercial plant that is used for various purposes in the medicinal and food industries. For the present investigation, 44 germplasm samples were collected from Jinping County of Yunnan Province. Cluster analysis and 2-dimensional principal component analysis (PCA) were used to represent the genetic relations among Amomum tsao-ko samples using simple sequence repeat (SSR) markers. Clustering analysis clearly distinguished the sample groups. Two major clusters were formed: the first (Cluster I) consisted of 34 individuals and the second (Cluster II) consisted of 10 individuals; Cluster I, as the main group, contained multiple sub-clusters. PCA also showed 2 groups: PCA Group 1 included 29 individuals and PCA Group 2 included 12 individuals, consistent with the results of cluster analysis. The purpose of the present investigation was to provide information on the genetic relationships of Amomum tsao-ko germplasm resources in the main producing areas, and also to provide a theoretical basis for the protection and utilization of Amomum tsao-ko resources.

  19. Cluster cosmological analysis with X ray instrumental observables: introduction and testing of AsPIX method

    International Nuclear Information System (INIS)

    Valotti, Andrea

    2016-01-01

    Cosmology is one of the fundamental pillars of astrophysics; as such, it contains many unsolved puzzles. To investigate some of those puzzles, we analyze X-ray surveys of galaxy clusters. These surveys are possible thanks to the bremsstrahlung emission of the intra-cluster medium. The simultaneous fit of cluster counts as a function of mass and distance provides an independent measure of cosmological parameters such as Ω_m, σ_8, and the dark energy equation of state w0. A novel approach to cosmological analysis using galaxy cluster data, called top-down, was developed in N. Clerc et al. (2012). This top-down approach is based purely on instrumental observables that are considered in a two-dimensional X-ray color-magnitude diagram. The method self-consistently includes selection effects and scaling relationships. It also provides a means of bypassing the computation of individual cluster masses. My work presents an extension of the top-down method by introducing the apparent size of the cluster, creating a three-dimensional X-ray cluster diagram. The size of a cluster is sensitive to both the cluster mass and its angular diameter, so it must also be included in the assessment of selection effects. The performance of this new method is investigated using a Fisher analysis. In parallel, I have studied the effects of the intrinsic scatter in the cluster size scaling relation on the sample selection as well as on the obtained cosmological parameters. To validate the method, I estimate uncertainties of cosmological parameters with the MCMC method and the Amoeba minimization routine, using two simulated XMM surveys that have an increasing level of complexity. The first simulated survey is a set of toy catalogues of 100 and 10000 deg^2, whereas the second is a 1000 deg^2 catalogue that was generated using an Aardvark semi-analytical N-body simulation. This comparison corroborates the conclusions of the Fisher analysis. In conclusion, I find that a cluster diagram that accounts for

  20. Ultrasonic motion analysis system - measurement of temporal and spatial gait parameters

    NARCIS (Netherlands)

    Huitema, RB; Hof, AL; Postema, K

    The duration of stance and swing phase and step and stride length are important parameters in human gait. In this technical note a low-cost ultrasonic motion analysis system is described that is capable of measuring these temporal and spatial parameters while subjects walk on the floor. By using the

  1. Similarity, Clustering, and Scaling Analyses for the Foreign Exchange Market ---Comprehensive Analysis on States of Market Participants with High-Frequency Financial Data---

    Science.gov (United States)

    Sato, A.; Sakai, H.; Nishimura, M.; Holyst, J. A.

    This article proposes mathematical methods to quantify states of market participants in the foreign exchange market (FX market) and to conduct a comprehensive analysis of the behavior of market participants by means of high-frequency financial data. Based on econophysics tools and perspectives, we study similarity measures for both rate movements and quotation activities among various currency pairs. We also perform clustering analysis on market states for observation days, and find a scaling relationship between mean values of quotation activities and their standard deviations. Using these mathematical methods we can visualize states of the FX market comprehensively. Finally, we conclude that states of market participants vary over time due to both external and internal factors.

  2. Application of Cluster Analysis in Assessment of Dietary Habits of Secondary School Students

    Directory of Open Access Journals (Sweden)

    Zalewska Magdalena

    2014-12-01

    Maintenance of proper health and prevention of diseases of civilization are now significant public health problems. Nutrition is an important factor in the development of youth, as well as the current and future state of health. The aim of the study was to show the benefits of the application of cluster analysis to assess the dietary habits of high school students. The survey was carried out on 1,631 eighteen-year-old students in seven randomly selected secondary schools in Bialystok using a self-prepared anonymous questionnaire. An evaluation of the time of day meals were eaten and the number of meals consumed was made for the surveyed students. The cluster analysis allowed distinguishing characteristic structures of dietary habits in the observed population. Four clusters were identified, which were characterized by relative internal homogeneity and substantial variation in terms of the number of meals during the day and the time of their consumption. The most important characteristics of cluster 1 were cumulated food ration in 2 or 3 meals and long intervals between meals. Cluster 2 was characterized by eating the recommended number of 4 or 5 meals a day. In the 3rd cluster, students ate 3 meals a day with large intervals between them, and in the 4th they had four meals a day while maintaining proper intervals between them. In all clusters dietary mistakes occurred, but most of them were related to clusters 1 and 3. Cluster analysis allowed for the identification of major flaws in nutrition, which may include irregular eating and skipping meals, and indicated possible connections between eating patterns and disturbances of body weight in the examined population.

  3. MMPI-2: Cluster Analysis of Personality Profiles in Perinatal Depression—Preliminary Evidence

    Directory of Open Access Journals (Sweden)

    Valentina Meuti

    2014-01-01

    Background. To assess personality characteristics of women who develop perinatal depression. Methods. The study started with a screening of a sample of 453 women in their third trimester of pregnancy, who were administered a survey data form, the Edinburgh Postnatal Depression Scale (EPDS) and the Minnesota Multiphasic Personality Inventory 2 (MMPI-2). A clinical group of subjects with perinatal depression (PND, 55 subjects) was selected; clinical and validity scales of MMPI-2 were used as predictors in a hierarchical cluster analysis. Results. The analysis identified three clusters of personality profile: two “clinical” clusters (1 and 3) and an “apparently common” one (cluster 2). The first cluster (39.5%) collects structures of personality with prevalent obsessive or dependent functioning tending to develop a “psychasthenic” depression; the third cluster (13.95%) includes women with prevalent borderline functioning tending to develop “dysphoric” depression; the second cluster (46.5%) shows a normal profile with a “defensive” attitude, probably due to the presence of defense mechanisms or to the fear of stigma. Conclusion. Characteristics of personality have a key role in clinical manifestations of perinatal depression; it is important to detect them to identify mothers at risk and to plan targeted therapeutic interventions.

  4. MMPI-2: cluster analysis of personality profiles in perinatal depression—preliminary evidence.

    Science.gov (United States)

    Meuti, Valentina; Marini, Isabella; Grillo, Alessandra; Lauriola, Marco; Leone, Carlo; Giacchetti, Nicoletta; Aceti, Franca

    2014-01-01

    To assess personality characteristics of women who develop perinatal depression. The study started with a screening of a sample of 453 women in their third trimester of pregnancy, who were administered a survey data form, the Edinburgh Postnatal Depression Scale (EPDS) and the Minnesota Multiphasic Personality Inventory 2 (MMPI-2). A clinical group of subjects with perinatal depression (PND, 55 subjects) was selected; clinical and validity scales of MMPI-2 were used as predictors in a hierarchical cluster analysis. The analysis identified three clusters of personality profile: two "clinical" clusters (1 and 3) and an "apparently common" one (cluster 2). The first cluster (39.5%) collects structures of personality with prevalent obsessive or dependent functioning tending to develop a "psychasthenic" depression; the third cluster (13.95%) includes women with prevalent borderline functioning tending to develop "dysphoric" depression; the second cluster (46.5%) shows a normal profile with a "defensive" attitude, probably due to the presence of defense mechanisms or to the fear of stigma. Characteristics of personality have a key role in clinical manifestations of perinatal depression; it is important to detect them to identify mothers at risk and to plan targeted therapeutic interventions.

  5. Temporal and Spatial Analysis of the New World Screwworm (Cochliomyia hominivorax) in Darien and Embera, Panama (2001-2011).

    Science.gov (United States)

    Maxwell, M J; Subia, J; Abrego, J; Garabed, R; Xiao, N; Toribio, R E

    2017-06-01

    Larvae (maggots) of Cochliomyia hominivorax, the New World Screwworm fly, are voracious consumers of living flesh that have a negative economic impact by decreasing productivity, predisposing to other pathogens, and, in severe cases, causing death of domestic livestock. Screwworm caused extensive financial losses to the livestock industry in North America prior to its eradication. Sterile insect technique (SIT) was used to eradicate screwworm throughout North and Central America and continues to be the main tool to control it in eastern Panama. The goal of this study was to evaluate the temporal and spatial trends of screwworm myiasis cases reported in the Province of Darien and Comarca Embera (border with Colombia), Panama, from 2001 to 2011. We hypothesized that screwworm cases would vary seasonally and be spatially clustered near Colombia as a result of effective eradication strategies in Panama and the presence of an autochthonous population of flies in western Colombia. Temporal and spatial data were retrieved from COPEG-USDA records (Panama) and analysed by ANOVA, Ripley's K function, discrete Poisson spatial statistic scan and Getis-Ord Gi*. No significant temporal trend was found, but cases were spatially distributed in four clusters. One cluster of cases occurred from 2001 to 2003 and was considered a focal temporal and spatial cluster. One cluster occurred in 2001 and 2007, indicating rarer outbreaks in an area with fewer cattle. The two remaining clusters contained cases from 2004 to 2011 and 2001 to 2011, suggesting regular breaks in the control barrier due to occasional failures of the SIT programme, difficulties implementing border quarantine strategies, livestock smuggling or the movement of infested wildlife. © 2015 Blackwell Verlag GmbH.
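
    As a rough illustration of one of the statistics named in the abstract above, Ripley's K function can be estimated naively as sketched below. This is a minimal sketch without edge correction; the function name `ripley_k`, its arguments, and the toy coordinates are assumptions made for illustration and do not come from the study.

```python
import numpy as np


def ripley_k(points, radii, area):
    """Naive Ripley's K estimate (no edge correction).

    points : sequence of (x, y) coordinates
    radii  : distances r at which K(r) is evaluated
    area   : area of the study region (hypothetical input)
    """
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    # Pairwise Euclidean distances between all points.
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Exclude each point's zero distance to itself.
    np.fill_diagonal(dist, np.inf)
    # Count ordered pairs (i, j), i != j, that lie within each radius.
    counts = np.array([(dist <= r).sum() for r in radii])
    return area * counts / (n * (n - 1))


# Toy example: four points on the corners of a unit square.
square = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
k_values = ripley_k(square, radii=[0.5, 1.0, 1.5], area=1.0)
print(k_values)
```

    Under complete spatial randomness K(r) is approximately pi * r^2, so estimates well above that value at a given radius hint at clustering. A real analysis would add an edge correction and a simulation envelope.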

  6. Semiparametric Bayesian analysis of accelerated failure time models with cluster structures.

    Science.gov (United States)

    Li, Zhaonan; Xu, Xinyi; Shen, Junshan

    2017-11-10

    In this paper, we develop a Bayesian semiparametric accelerated failure time model for survival data with cluster structures. Our model allows distributional heterogeneity across clusters and accommodates their relationships through a density ratio approach. Moreover, a nonparametric mixture of Dirichlet processes prior is placed on the baseline distribution to yield full distributional flexibility. We illustrate through simulations that our model can greatly improve estimation accuracy by effectively pooling information from multiple clusters, while taking into account the heterogeneity in their random error distributions. We also demonstrate the implementation of our method using analysis of Mayo Clinic Trial in Primary Biliary Cirrhosis. Copyright © 2017 John Wiley & Sons, Ltd.

  7. A formal concept analysis approach to consensus clustering of multi-experiment expression data

    Science.gov (United States)

    2014-01-01

    Background: Presently, with the increasing number and complexity of available gene expression datasets, the combination of data from multiple microarray studies addressing a similar biological question is gaining importance. The analysis and integration of multiple datasets are expected to yield more reliable and robust results since they are based on a larger number of samples and the effects of the individual study-specific biases are diminished. This is supported by recent studies suggesting that important biological signals are often preserved or enhanced by multiple experiments. An approach to combining data from different experiments is the aggregation of their clusterings into a consensus or representative clustering solution which increases the confidence in the common features of all the datasets and reveals the important differences among them. Results: We propose a novel generic consensus clustering technique that applies a Formal Concept Analysis (FCA) approach for the consolidation and analysis of clustering solutions derived from several microarray datasets. These datasets are initially divided into groups of related experiments with respect to a predefined criterion. Subsequently, a consensus clustering algorithm is applied to each group, resulting in a clustering solution per group. These solutions are pooled together and further analysed by employing FCA, which allows extracting valuable insights from the data and generating a gene partition over all the experiments. In order to validate the FCA-enhanced approach, two consensus clustering algorithms are adapted to incorporate the FCA analysis. Their performance is evaluated on gene expression data from a multi-experiment study examining the global cell-cycle control of fission yeast. The FCA results derived from both methods demonstrate that, although both algorithms optimize different clustering characteristics, FCA is able to overcome and diminish these differences and preserve some relevant biological

  8. Fuzzy ensemble clustering based on random projections for DNA microarray data analysis.

    Science.gov (United States)

    Avogadri, Roberto; Valentini, Giorgio

    2009-01-01

    Two major problems related to the unsupervised analysis of gene expression data are the accuracy and reliability of the discovered clusters, and the biological fact that the boundaries between classes of patients or classes of functionally related genes are sometimes not clearly defined. The main goal of this work is to explore new strategies and develop new clustering methods to improve the accuracy and robustness of clustering results, taking into account the uncertainty underlying the assignment of examples to clusters in the context of gene expression data analysis. We propose a fuzzy ensemble clustering approach both to improve the accuracy of clustering results and to take into account the inherent fuzziness of biological and bio-medical gene expression data. We applied random projections that obey the Johnson-Lindenstrauss lemma to obtain several instances of lower dimensional gene expression data from the original high-dimensional ones, approximately preserving the information and the metric structure of the original data. Then we adopt a double fuzzy approach to obtain a consensus ensemble clustering, by first applying a fuzzy k-means algorithm to the different instances of the projected low-dimensional data and then by using a fuzzy t-norm to combine the multiple clusterings. Several variants of the fuzzy ensemble clustering algorithms are proposed, according to different techniques to combine the base clusterings and to obtain the final consensus clustering. We applied our proposed fuzzy ensemble methods to the gene expression analysis of leukemia, lymphoma, adenocarcinoma and melanoma patients, and we compared the results with other state of the art ensemble methods. Results show that in some cases, taking into account the natural fuzziness of the data, we can improve the discovery of classes of patients defined at bio-molecular level. The reduction of the dimension of the data, achieved through random

  9. Cluster: A New Application for Spatial Analysis of Pixelated Data for Epiphytotics.

    Science.gov (United States)

    Nelson, Scot C; Corcoja, Iulian; Pethybridge, Sarah J

    2017-12-01

    Spatial analysis of epiphytotics is essential to develop and test hypotheses about pathogen ecology, disease dynamics, and to optimize plant disease management strategies. Data collection for spatial analysis requires substantial investment in time to depict patterns in various frames and hierarchies. We developed a new approach for spatial analysis of pixelated data in digital imagery and incorporated the method in a stand-alone desktop application called Cluster. The user isolates target entities (clusters) by designating up to 24 pixel colors as nontargets and moves a threshold slider to visualize the targets. The app calculates the percent area occupied by targeted pixels, identifies the centroids of targeted clusters, and computes the relative compass angle of orientation for each cluster. Users can deselect anomalous clusters manually and/or automatically by specifying a size threshold value to exclude smaller targets from the analysis. Up to 1,000 stochastic simulations randomly place the centroids of each cluster in ranked order of size (largest to smallest) within each matrix while preserving their calculated angles of orientation for the long axes. A two-tailed probability t test compares the mean inter-cluster distances for the observed versus the values derived from randomly simulated maps. This is the basis for statistical testing of the null hypothesis that the clusters are randomly distributed within the frame of interest. These frames can assume any shape, from natural (e.g., leaf) to arbitrary (e.g., a rectangular or polygonal field). Cluster summarizes normalized attributes of clusters, including pixel number, axis length, axis width, compass orientation, and the length/width ratio, available to the user as a downloadable spreadsheet. Each simulated map may be saved as an image and inspected. Provided examples demonstrate the utility of Cluster to analyze patterns at various spatial scales in plant pathology and ecology and highlight the

  10. Temporal transcriptomic analysis of Desulfovibrio vulgaris Hildenborough transition into stationary phase growth during electron donor depletion

    Energy Technology Data Exchange (ETDEWEB)

    Clark, M.E.; He, Q.; He, Z.; Huang, K.H.; Alm, E.J.; Wan, X.-F.; Hazen, T.C.; Arkin, A.P.; Wall, J.D.; Zhou, J.-Z.; Fields, M.W.

    2006-08-01

    Desulfovibrio vulgaris was cultivated in a defined medium, and biomass was sampled for approximately 70 h to characterize the shifts in gene expression as cells transitioned from the exponential to the stationary phase during electron donor depletion. In addition to temporal transcriptomics, total protein, carbohydrate, lactate, acetate, and sulfate levels were measured. The microarray data were examined for statistically significant expression changes, hierarchical cluster analysis, and promoter element prediction and were validated by quantitative PCR. As the cells transitioned from the exponential phase to the stationary phase, a majority of the down-expressed genes were involved in translation and transcription, and this trend continued at the remaining times. There were general increases in relative expression for intracellular trafficking and secretion, ion transport, and coenzyme metabolism as the cells entered the stationary phase. As expected, the DNA replication machinery was down-expressed, and the expression of genes involved in DNA repair increased during the stationary phase. Genes involved in amino acid acquisition, carbohydrate metabolism, energy production, and cell envelope biogenesis did not exhibit uniform transcriptional responses. Interestingly, most phage-related genes were up-expressed at the onset of the stationary phase. This result suggested that nutrient depletion may affect community dynamics and DNA transfer mechanisms of sulfate-reducing bacteria via the phage cycle. The putative feoAB system (in addition to other presumptive iron metabolism genes) was significantly up-expressed, and this suggested the possible importance of Fe2+ acquisition under metal-reducing conditions. The expression of a large subset of carbohydrate-related genes was altered, and the total cellular carbohydrate levels declined during the growth phase transition. Interestingly, the D. vulgaris genome does not contain a putative rpoS gene, a common attribute

  11. Analysis of network clustering behavior of the Chinese stock market

    Science.gov (United States)

    Chen, Huan; Mai, Yong; Li, Sai-Ping

    2014-11-01

    Random Matrix Theory (RMT) and the correlation matrix decomposition method are employed to analyze the spatial structure of stock interactions and collective behavior in the Shanghai and Shenzhen stock markets in China. The results show that there exist prominent sector structures, with subsectors including the Real Estate (RE), Commercial Banks (CB), Pharmaceuticals (PH), Distillers&Vintners (DV) and Steel (ST) industries. Furthermore, the RE and CB subsectors are mostly anti-correlated. We further study the temporal behavior of the dataset and find that while the sector structures are relatively stable from 2007 through 2013, the correlation between the real estate and commercial bank stocks shows large variations. By employing the ensemble empirical mode decomposition (EEMD) method, we show that this anti-correlation behavior is closely related to the monetary and austerity policies of the Chinese government during the period of study.
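
    A common first step in RMT-based analyses of this kind is to compare the eigenvalues of the empirical correlation matrix with the Marchenko-Pastur bounds expected for pure noise. The sketch below shows that comparison on synthetic data; it is not the authors' code, and the function names, the synthetic "market" factor, and the sample sizes are all assumptions made for illustration.

```python
import numpy as np


def marchenko_pastur_bounds(n_assets, n_obs):
    """Eigenvalue bounds for the correlation matrix of uncorrelated noise."""
    q = n_assets / n_obs
    return (1 - np.sqrt(q)) ** 2, (1 + np.sqrt(q)) ** 2


def deviating_eigenmodes(returns):
    """Return eigenvalues (and eigenvectors) above the upper noise bound.

    returns : array of shape (n_obs, n_assets), one column per stock.
    """
    n_obs, n_assets = returns.shape
    corr = np.corrcoef(returns, rowvar=False)
    # eigh is appropriate because a correlation matrix is symmetric.
    eigvals, eigvecs = np.linalg.eigh(corr)
    _, lam_max = marchenko_pastur_bounds(n_assets, n_obs)
    mask = eigvals > lam_max
    return eigvals[mask], eigvecs[:, mask]


# Synthetic demo: independent noise plus one shared (hypothetical) market factor.
rng = np.random.default_rng(0)
n_obs, n_assets = 1000, 50
noise = rng.standard_normal((n_obs, n_assets))
market = rng.standard_normal((n_obs, 1))
returns = noise + 0.5 * market

vals, vecs = deviating_eigenmodes(returns)
print(len(vals), vals.max())
```

    Eigenvectors belonging to eigenvalues above the upper bound carry structure that noise alone would not produce; in studies like the one above, their components are typically inspected for sector groupings.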

  12. Neutronic analysis of the KSTAR tokamak using Beowulf cluster

    International Nuclear Information System (INIS)

    Park, Jeong Hwan; Cho, Nam Zin; Kim, Jinchoon

    2000-01-01

    High-beta, beam-heated deuterium plasmas in KSTAR (Korea Superconducting Tokamak Advanced Research) will produce a peak neutron yield of 3.5x10^16 per second. Two equally probable D-D fusion reactions occur in deuterium plasma: one produces 2.45 MeV neutrons, and the other produces tritons, which are confined in the plasma and undergo D-T reactions producing 14.1 MeV neutrons, which amount to about 3 percent of the 2.45 MeV neutrons. The biological dose, nuclear heating of the cryogenically cooled magnets, and neutron activation of the surrounding materials have been investigated, and the results are used for designing the KSTAR tokamak and the facility. In this work, the Beowulf cluster Galaxy is used for intensive Monte Carlo simulations and is shown to be a cost-effective parallel machine. (author)

  13. Prognostically distinct clinical patterns of systemic lupus erythematosus identified by cluster analysis.

    Science.gov (United States)

    To, C H; Mok, C C; Tang, S S K; Ying, S K Y; Wong, R W S; Lau, C S

    2009-12-01

    The objective of this study was to evaluate the patterns of clinical manifestations and their mortality in a large cohort of Chinese patients with systemic lupus erythematosus. The cumulative clinical manifestations of a large group of Chinese systemic lupus erythematosus patients who fulfilled at least four American College of Rheumatology criteria for systemic lupus erythematosus were studied. Patients were divided into distinct groups by using K-means cluster analysis. Clinical features, prevalence of proliferative lupus nephritis (World Health Organization class III, IV), autoantibody profile, and treatment data were compared and the standardized mortality ratios were calculated for each cluster of patients. There were 1082 patients included in the study (mean age at systemic lupus erythematosus diagnosis 30.5 years; mean systemic lupus erythematosus duration 10.3 years). Three distinct groups of patients were identified. Cluster 1 (n = 347) was characterized predominantly by mucocutaneous manifestations (malar rash, discoid rash, photosensitivity, oral ulcer) and arthritis but had the lowest prevalence of serositis, hematologic manifestations (hemolytic anemia, leukopenia, and thrombocytopenia), and proliferative lupus nephritis. Patients in cluster 2 (n = 409) had mainly renal and hematological manifestations but had the lowest prevalence of mucocutaneous manifestations. Pulmonary and gastrointestinal manifestations were significantly more frequent in cluster 2 than in the other clusters. Cluster 3 patients (n = 326) had the most heterogeneous features. Besides a high prevalence of mucocutaneous manifestations, serositis and hematologic manifestations, renal involvement and proliferative lupus nephritis were also most prevalent among the three clusters. Patients in cluster 2 had a much higher standardized mortality ratio [standardized mortality ratio 7.23 (6.7-7.7), p lupus erythematosus could be clustered into prognostically distinct patterns of

  14. Analysis of O(2) adsorption on binary-alloy clusters of gold: energetics and correlations.

    Science.gov (United States)

    Joshi, Ajay M; Delgass, W Nicholas; Thomson, Kendall T

    2006-11-23

    We report a B3LYP density-functional theory (DFT) analysis of O(2) adsorption on 27 Au(n)M(m) (m, n = 0-3 and m + n = 2 or 3; M = Cu, Ag, Pd, Pt, and Na) clusters. The LANL2DZ pseudopotential and corresponding double-zeta basis set was used for heavy atoms, while a 6-311+G(3df) basis set was used for Na and O. We employed basis-set superposition error (BSSE) corrections in the electronic adsorption energies at 0 K (ΔE(ads)) and also calculated adsorption thermodynamics at standard conditions (298.15 K and 1 atm), i.e., internal energy of adsorption (ΔU(ads)) and Gibbs free energy of adsorption (ΔG(ads)). Natural Bond Orbital (NBO) analysis showed that all the clusters donated electron density to adsorbed O(2) and we successfully predicted intuitive linear correlations between the NBO charge on adsorbed O(2), O-O bond length, and O-O stretching frequency. Although there was no clear trend in the O(2) binding energy (BE = -ΔE(ads)) on pure and alloy dimers, we found the following interesting trend for trimers: BE (MAu(2)) clusters. The clusters having strongly electropositive Na atoms (e.g., Na(3) and Na(2)Au) donated almost one full electron to adsorbed O(2), and the BE is maximum on these clusters. Although O(2) dissociation is likely in such cases, we have restricted this study to trends in the adsorption of molecular O(2) only. We also found an approximate linear correlation between the charge transfer and BE versus energy difference between the bare-cluster HOMO and O(2) LUMOs, which we speculate to be a fundamental descriptor of the reactivity of small clusters toward O(2). Part of the scatter in these correlations is attributed to the differences in the O(2) binding orientations on different clusters (geometric effect). Relatively higher bare-cluster HOMO energy eases the charge transfer to adsorbed O(2) and enhances the reactivity toward O(2). The Frontier Orbital Picture (FOP) is not always useful in predicting the most favorable O(2) binding

  15. Salient concerns in using analgesia for cancer pain among outpatients: A cluster analysis study.

    Science.gov (United States)

    Meghani, Salimah H; Knafl, George J

    2017-02-10

    To identify unique clusters of patients based on their concerns in using analgesia for cancer pain and predictors of the cluster membership. This was a 3-month prospective observational study (n = 207). Patients were included if they were adults (≥ 18 years), diagnosed with solid tumors or multiple myelomas, and had at least one prescription of around-the-clock pain medication for cancer or cancer-treatment-related pain. Patients were recruited from two outpatient medical oncology clinics within a large health system in Philadelphia. A choice-based conjoint (CBC) analysis experiment was used to elicit analgesic treatment preferences (utilities). Patients employed trade-offs based on five analgesic attributes (percent relief from analgesics, type of analgesic, type of side-effects, severity of side-effects, out-of-pocket cost). Patients were clustered based on CBC utilities using novel adaptive statistical methods. Multiple logistic regression was used to identify predictors of cluster membership. The analyses found 4 unique clusters. Most patients made trade-offs based on the expectation of pain relief (cluster 1, 41%). For two subsets, the main underlying concern was the type of analgesic prescribed, i.e., opioid vs non-opioid (cluster 2, 11%), and the type of analgesic side effects (cluster 4, 21%), respectively. About one in four made trade-offs based on multiple concerns simultaneously, including pain relief, type of side effects, and severity of side effects (cluster 3, 28%). In multivariable analysis to identify predictors of cluster membership, clinical and socioeconomic factors (education, health literacy, income, social support) rather than analgesic attitudes and beliefs were found important; only one belief, i.e., that pain medications can mask changes in health or keep you from knowing what is going on in your body, was found significant in predicting two of the four clusters [cluster 1 (-); cluster 4 (+)]. Most patients appear to be driven by a single salient concern

  16. A Deep Learning Prediction Model Based on Extreme-Point Symmetric Mode Decomposition and Cluster Analysis

    OpenAIRE

    Li, Guohui; Zhang, Songling; Yang, Hong

    2017-01-01

    Aiming at the irregularity of nonlinear signal and its predicting difficulty, a deep learning prediction model based on extreme-point symmetric mode decomposition (ESMD) and clustering analysis is proposed. Firstly, the original data is decomposed by ESMD to obtain the finite number of intrinsic mode functions (IMFs) and residuals. Secondly, the fuzzy c-means is used to cluster the decomposed components, and then the deep belief network (DBN) is used to predict it. Finally, the reconstructed ...

  17. Information search behaviour among new car buyers: A two-step cluster analysis

    Directory of Open Access Journals (Sweden)

    S.M. Satish

    2010-03-01

    A two-step cluster analysis of new car buyers in India was performed to identify taxonomies of search behaviour using personality and situational variables, apart from sources of information. Four distinct groups were found: broad moderate searchers, intense heavy searchers, low broad searchers, and low searchers. Dealers can identify the members of each segment by measuring the variables used for clustering, and can then design appropriate communication strategies.

  18. Applying clustering to statistical analysis of student reasoning about two-dimensional kinematics

    Directory of Open Access Journals (Sweden)

    R. Padraic Springuel

    2007-12-01

    We use clustering, an analysis method not presently common to the physics education research community, to group and characterize student responses to written questions about two-dimensional kinematics. Previously, clustering has been used to analyze multiple-choice data; we analyze free-response data that includes both sketches of vectors and written elements. The primary goal of this paper is to describe the methodology itself; we include a brief overview of relevant results.

  19. Galaxy Cluster Pressure Profiles as Determined by Sunyaev Zel’dovich Effect Observations with MUSTANG and Bolocam. II. Joint Analysis of 14 Clusters

    Science.gov (United States)

    Romero, Charles E.; Mason, Brian S.; Sayers, Jack; Mroczkowski, Tony; Sarazin, Craig; Donahue, Megan; Baldi, Alessandro; Clarke, Tracy E.; Young, Alexander H.; Sievers, Jonathan; Dicker, Simon R.; Reese, Erik D.; Czakon, Nicole; Devlin, Mark; Korngut, Phillip M.; Golwala, Sunil

    2017-04-01

    We present pressure profiles of galaxy clusters determined from high-resolution Sunyaev-Zel’dovich (SZ) effect observations of 14 clusters, which span the redshift range of 0.25MUSTANG and Bolocam data. In this analysis, we adopt the generalized NFW parameterization of pressure profiles to produce our models. Our constraints on ensemble-average pressure profile parameters, in this study γ, C_500, and P_0, are consistent with those in previous studies, but for individual clusters we find discrepancies with the X-ray derived pressure profiles from the ACCEPT2 database. We investigate potential sources of these discrepancies, especially cluster geometry, electron temperature of the intracluster medium, and substructure. We find that the ensemble mean profile for all clusters in our sample is described by the parameters [γ, C_500, P_0] = [0.3_{-0.1}^{+0.1}, 1.3_{-0.1}^{+0.1}, 8.6_{-2.4}^{+2.4}], cool core clusters are described by [γ, C_500, P_0] = [0.6_{-0.1}^{+0.1}, 0.9_{-0.1}^{+0.1}, 3.6_{-1.5}^{+1.5}], and disturbed clusters are described by [γ, C_500, P_0] = [0.0_{-0.0}^{+0.1}, 1.5_{-0.2}^{+0.1}, 13.8_{-1.6}^{+1.6}]. Of the 14 clusters, 4 have clear substructure in our SZ observations, while an additional 2 clusters exhibit potential substructure.
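
    For orientation, the generalized NFW pressure profile named in the abstract above is commonly written in the form below. The abstract itself gives only γ, C_500, and P_0; the remaining symbols (the slopes α and β, the scaled radius x, and the normalisation P_500) are assumptions taken from the usual form of this parameterization and may differ in detail from the paper's conventions.

```latex
% Generalized NFW (gNFW) pressure profile, in units of P_500.
% x = r / R_500 is the scaled radius; \alpha and \beta are the
% intermediate and outer slopes (symbols assumed, not in the abstract).
\[
  \frac{P(r)}{P_{500}}
  = \frac{P_0}{\left(C_{500}\,x\right)^{\gamma}
      \left[\,1 + \left(C_{500}\,x\right)^{\alpha}\right]^{(\beta-\gamma)/\alpha}},
  \qquad x = \frac{r}{R_{500}} .
\]
```

    In this form γ sets the inner logarithmic slope, so the smaller γ quoted for disturbed clusters corresponds to a flatter central pressure profile than the value quoted for cool core clusters.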

  20. Measurement of temporal asymmetries of glucose consumption using linear profiles: reproducibility and comparison with visual analysis

    International Nuclear Information System (INIS)