Sample records for cluster enrichment analysis

  1. IGSA: Individual Gene Sets Analysis, including Enrichment and Clustering. (United States)

    Wu, Lingxiang; Chen, Xiujie; Zhang, Denan; Zhang, Wubing; Liu, Lei; Ma, Hongzhe; Yang, Jingbo; Xie, Hongbo; Liu, Bo; Jin, Qing


    Analysis of gene sets has been widely applied in various high-throughput biological studies. One weakness of traditional methods is that they neglect the heterogeneity of gene expression across samples, which may lead to the omission of some specific and important gene sets. It is also difficult for them to reflect the severity of disease and to provide expression profiles of gene sets for individuals. We developed an application called IGSA that offers powerful analytical capacity for gene set enrichment and sample clustering. IGSA calculates gene set expression scores for each sample and uses an accumulating clustering strategy so that samples gather into the set according to the progress of disease from mild to severe. We evaluate the performance of IGSA on gastric, pancreatic and ovarian cancer data sets. We also compared the results of IGSA in KEGG pathway enrichment with DAVID, GSEA, SPIA and ssGSEA, and analyzed the results of IGSA clustering under different similarity measurement methods. Notably, IGSA proved to be more sensitive and specific in finding significant pathways, and can indicate changes in pathways related to the severity of disease. In addition, IGSA provides a significant gene set profile for each sample.

  2. Pathway enrichment and co-expression cluster analysis - FANTOM5 | LSDB Archive [Life Science Database Archive metadata]

    Lifescience Database Archive (English)


  3. Factor-cluster analysis and enrichment study of Mangrove sediments - An example from Mengkabong, Sabah

    International Nuclear Information System (INIS)

    Praveena, S.M.; Ahmed, A.; Radojevic, M.; Mohd Harun Abdullah; Aris, A.Z.


    This paper examines the tidal effects in the sediment of Mengkabong mangrove forest, Sabah. Generally, all the studied parameters showed higher values at high tide than at low tide. Factor-cluster analyses were adopted to identify the controlling factors at high and low tides. Factor analysis extracted six controlling factors at high tide and seven controlling factors at low tide. Cluster analysis extracted two distinct clusters at high and low tides. The study showed that factor-cluster analysis is a useful tool for singling out the controlling factors at high and low tides. This will provide a basis for describing the tidal effects in the mangrove sediment. The salinity and electrical conductivity clusters, as well as the component loadings at high and low tide, explained the tidal process, in which a high contribution of seawater to the mangrove sediments controls the sediment chemistry. The geoaccumulation index (Igeo) values suggest that the mangrove sediments have background concentrations of Al, Cu, Fe and Zn and are unpolluted with Pb. (author)

  4. Clustering analysis

    International Nuclear Information System (INIS)



    Cluster analysis is the name of a group of multivariate techniques whose principal purpose is to distinguish similar entities based on the characteristics they possess. Several algorithms can be used for this analysis. This topic therefore discusses similarity measures and hierarchical clustering, which includes the single linkage, complete linkage and average linkage methods. The non-hierarchical clustering method popularly known as the K-means method is also discussed. Finally, the paper describes the advantages and disadvantages of each method.
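
    The three linkage rules named in this abstract differ only in how the distance between two clusters is derived from the pairwise point distances. The following is a minimal sketch, not taken from the paper: it assumes Euclidean distance as the similarity measure, and the function names `linkage_distance` and `agglomerate` as well as the toy points are hypothetical.

```python
from itertools import combinations
from math import dist


def linkage_distance(a, b, method):
    """Distance between two clusters (lists of points) under a linkage rule."""
    pairwise = [dist(p, q) for p in a for q in b]
    if method == "single":      # distance of the closest pair
        return min(pairwise)
    if method == "complete":    # distance of the farthest pair
        return max(pairwise)
    if method == "average":     # mean distance over all pairs
        return sum(pairwise) / len(pairwise)
    raise ValueError(f"unknown linkage method: {method}")


def agglomerate(points, n_clusters, method="average"):
    """Start with one cluster per point and merge the two closest
    clusters repeatedly until n_clusters remain."""
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: linkage_distance(clusters[ij[0]], clusters[ij[1]], method),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # safe because combinations guarantees i < j
    return clusters


# Hypothetical toy data: two tight pairs and one isolated point.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (10.0, 0.0)]
result = agglomerate(points, n_clusters=3, method="single")
```

    This naive version recomputes every cluster-to-cluster distance at each merge, which is adequate for illustration but slow for large data sets.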

  5. Cluster analysis

    CERN Document Server

    Everitt, Brian S; Leese, Morven; Stahl, Daniel


    Cluster analysis comprises a range of methods for classifying multivariate data into subgroups. By organizing multivariate data into such subgroups, clustering can help reveal the characteristics of any structure or patterns present. These techniques have proven useful in a wide range of areas such as medicine, psychology, market research and bioinformatics. This fifth edition of the highly successful Cluster Analysis includes coverage of the latest developments in the field and a new chapter dealing with finite mixture models for structured data. Real life examples are used throughout to demons

  6. Cluster analysis


    Mucha, Hans-Joachim; Sofyan, Hizir


    As an explorative technique, cluster analysis provides a description or a reduction in the dimension of the data. It classifies a set of observations into two or more mutually exclusive unknown groups based on combinations of many variables. Its aim is to construct groups in such a way that the profiles of objects in the same groups are relatively homogeneous whereas the profiles of objects in different groups are relatively heterogeneous. Clustering is distinct from classification techniques, ...

  7. Implementing Enrichment Clusters in Elementary Schools: Lessons Learned (United States)

    Fiddyment, Gail E.


    Enrichment clusters offer a way for schools to encourage a high level of learning as students and adults work together to develop a product, service, or performance by applying advanced knowledge and authentic processes to real-world problems. This study utilized a qualitative research design to examine the perceptions and experiences of two…

  8. Cluster analysis for applications

    CERN Document Server

    Anderberg, Michael R


    Cluster Analysis for Applications deals with methods and various applications of cluster analysis. Topics covered range from variables and scales to measures of association among variables and among data units. Conceptual problems in cluster analysis are discussed, along with hierarchical and non-hierarchical clustering methods. The necessary elements of data analysis, statistics, cluster analysis, and computer implementation are integrated vertically to cover the complete path from raw data to a finished analysis. Comprised of 10 chapters, this book begins with an introduction to the subject o

  9. Marketing research cluster analysis

    Directory of Open Access Journals (Sweden)

    Marić Nebojša


    One area of applications of cluster analysis in marketing is identification of groups of cities and towns with similar demographic profiles. This paper considers main aspects of cluster analysis by an example of clustering 12 cities with the use of Minitab software.

  10. Marketing research cluster analysis


    Marić Nebojša


    One area of applications of cluster analysis in marketing is identification of groups of cities and towns with similar demographic profiles. This paper considers main aspects of cluster analysis by an example of clustering 12 cities with the use of Minitab software.


    International Nuclear Information System (INIS)

    Leaman, Ryan


    Star clusters are known to have smaller intrinsic metallicity spreads than dwarf galaxies due to their shorter star formation timescales. Here we use individual spectroscopic [Fe/H] measurements of stars in 19 Local Group dwarf galaxies, 13 Galactic open clusters, and 49 globular clusters to show that star cluster and dwarf galaxy linear metallicity distributions are binomial in form, with all objects showing strong correlations between their mean linear metallicity Z-bar and intrinsic spread in metallicity σ(Z)². A plot of σ(Z)² versus Z-bar shows that the correlated relationships are offset for the dwarf galaxies from the star clusters. The common binomial nature of these linear metallicity distributions can be explained with a simple inhomogeneous chemical evolution model, where the star cluster and dwarf galaxy behavior in the σ(Z)² - Z-bar diagram is reproduced in terms of the number of enrichment events, covering fraction, and intrinsic size of the enriched regions. The inhomogeneity of the self-enrichment sets the slope for the observed dwarf galaxy σ(Z)² - Z-bar correlation. The offset of the star cluster sequence from that of the dwarf galaxies is due to pre-enrichment, and the slope of the star cluster sequence represents the remnant signature of the self-enriched history of their host galaxies. The offset can be used to separate star clusters from dwarf galaxies without a priori knowledge of their luminosity or dynamical mass. The application of the inhomogeneous model to the σ(Z)² - Z-bar relationship provides a numerical formalism to connect the self-enrichment and pre-enrichment between star clusters and dwarf galaxies using physically motivated chemical enrichment parameters. Therefore we suggest that the σ(Z)² - Z-bar relationship can provide insight into what drives the efficiency of star formation and chemical evolution in galaxies, and is an important prediction for galaxy simulation models to reproduce.

  12. Uranium enrichment (a strategy analysis overview)

    International Nuclear Information System (INIS)

    Blahnik, C.


    An analysis of available information on enrichment technology, separative work supply and demand, and SWU cost is presented. Estimates of present and future enrichment costs are provided for use in strategy analyses of alternate nuclear fuel cycles and systems. (auth)

  13. High enrichment to low enrichment core's conversion. Accidents analysis

    International Nuclear Information System (INIS)

    Abbate, P.; Rubio, R.; Doval, A.; Lovotti, O.


    This work analyzes the different accidents that may occur in the reactor facility after conversion of the high-enriched uranium core to 20% enrichment. The reactor (5 MW thermal), built in the 50's and 60's, is of the 'swimming pool' type, with light water and curved-plate MTR-type fuel elements enriched to 93.15 %. The analysis covers: a) accidents by reactivity insertion; b) accidents by coolant loss; c) analysis of flow loss; and d) fission products release. (Author)

  14. Comprehensive cluster analysis with Transitivity Clustering. (United States)

    Wittkop, Tobias; Emig, Dorothea; Truss, Anke; Albrecht, Mario; Böcker, Sebastian; Baumbach, Jan


    Transitivity Clustering is a method for the partitioning of biological data into groups of similar objects, such as genes, for instance. It provides integrated access to various functions addressing each step of a typical cluster analysis. To facilitate this, Transitivity Clustering is accessible online and offers three user-friendly interfaces: a powerful stand-alone version, a web interface, and a collection of Cytoscape plug-ins. In this paper, we describe three major workflows: (i) protein (super)family detection with Cytoscape, (ii) protein homology detection with incomplete gold standards and (iii) clustering of gene expression data. This protocol guides the user through the most important features of Transitivity Clustering and takes ∼1 h to complete.

  15. A Detailed Study of Chemical Enrichment History of Galaxy Clusters out to Virial Radius (United States)

    Loewenstein, Michael

    The origin of the metal enrichment of the intracluster medium (ICM) represents a fundamental problem in extragalactic astrophysics, with implications for our understanding of how stars and galaxies form, the nature of Type Ia supernova (SNIa) progenitors, and the thermal history of the ICM. These heavy elements are ultimately synthesized by supernova (SN) explosions; however, the details of the sites of metal production and mechanisms that transport metals to the ICM remain unclear. To make progress, accurate abundance profiles for multiple elements extending from the cluster core out to the virial radius (r180) are required for a significant cluster sample. We propose an X-ray spectroscopic study of a carefully-chosen sample of archival Suzaku and XMM-Newton observations of 23 clusters: XMM-Newton data probe the cluster temperature and abundances out to (0.5-1)r500, while Suzaku data probe the cluster outskirts. A method devised by our team to utilize all elements with emission lines in the X-ray bandpass to measure the relative contributions of supernova explosions by direct modeling of their X-ray spectra will be applied in order to constrain the demographics of the enriching supernova population. In addition we will conduct a stacking analysis of our already existing Suzaku and XMM-Newton cluster spectra to search for weak emission lines that are important SN diagnostics, and to look for trends with cluster mass and redshift. The funding we propose here will also support the data analysis of our recent Suzaku observations of the archetypal cluster A3112 (200 ks each on the core and outskirts). Our data analysis, interpreted using theoretical models we have developed, will enable us to constrain the star formation history, SN demographics, and nature of SNIa progenitors associated with galaxy cluster stellar populations - and, hence, directly addresses NASA's Strategic Objective 2.4.2 in Astrophysics that aims to improve the understanding of how the Universe works.

  16. [Cluster analysis in biomedical researches]. (United States)

    Akopov, A S; Moskovtsev, A A; Dolenko, S A; Savina, G D


    Cluster analysis is one of the most popular methods for the analysis of multi-parameter data. Cluster analysis reveals the internal structure of the data by grouping the separate observations according to their degree of similarity. The review provides a definition of the basic concepts of cluster analysis and discusses the most popular clustering algorithms: k-means, hierarchical algorithms, and Kohonen network algorithms. Examples of the use of these algorithms in biomedical research are given.
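
    Of the algorithms listed in this review, k-means is the simplest to write down: it alternates between assigning each observation to its nearest centroid and moving each centroid to the mean of its members. The sketch below is illustrative only and is not drawn from the review. It assumes Euclidean distance and, to stay deterministic, seeds the centroids with the first k points, whereas practical implementations usually use random or k-means++ initialization. The function name `kmeans` and the toy data are hypothetical.

```python
from math import dist


def kmeans(points, k, n_iter=100):
    """Lloyd's algorithm; the first k points seed the centroids (a simplification)."""
    centroids = [tuple(p) for p in points[:k]]
    labels = []
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep the centroid of an empty cluster
        if new_centroids == centroids:  # converged: no centroid moved
            break
        centroids = new_centroids
    return labels, centroids


# Hypothetical toy data: two well-separated groups of observations.
data = [(1.0, 1.0), (8.0, 8.0), (1.5, 2.0), (9.0, 8.5), (1.0, 0.5), (8.5, 9.0)]
labels, centroids = kmeans(data, k=2)
```

    The result depends on the initial centroids, so in practice the algorithm is typically run several times from different starting points.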

  17. Uranium enrichment management review: summary of analysis

    Energy Technology Data Exchange (ETDEWEB)


    In May 1980, the Assistant Secretary for Resource Applications within the Department of Energy requested that a group of experienced business executives be assembled to review the operation, financing, and management of the uranium enrichment enterprise as a basis for advising the Secretary of Energy. After extensive investigation, analysis, and discussion, the review group presented its findings and recommendations in a report on December 2, 1980. The following pages contain background material on which that final report was based. This report is arranged in chapters that parallel those of the uranium enrichment management review final report - chapters that contain summaries of the review group's discussion and analyses in six areas: management of operations and construction; long-range planning; marketing of enrichment services; financial management; research and development; and general management. Further information, in-depth analysis, and discussion of suggested alternative management practices are provided in five appendices.

  18. Uranium enrichment management review: summary of analysis

    International Nuclear Information System (INIS)


    In May 1980, the Assistant Secretary for Resource Applications within the Department of Energy requested that a group of experienced business executives be assembled to review the operation, financing, and management of the uranium enrichment enterprise as a basis for advising the Secretary of Energy. After extensive investigation, analysis, and discussion, the review group presented its findings and recommendations in a report on December 2, 1980. The following pages contain background material on which that final report was based. This report is arranged in chapters that parallel those of the uranium enrichment management review final report - chapters that contain summaries of the review group's discussion and analyses in six areas: management of operations and construction; long-range planning; marketing of enrichment services; financial management; research and development; and general management. Further information, in-depth analysis, and discussion of suggested alternative management practices are provided in five appendices


    International Nuclear Information System (INIS)

    Schiavon, Ricardo P.; Caldwell, Nelson; Conroy, Charlie; Graves, Genevieve J.; Strader, Jay; MacArthur, Lauren A.; Courteau, Stéphane; Harding, Paul


    In the past decade, the notion that globular clusters (GCs) are composed of coeval stars with homogeneous initial chemical compositions has been challenged by growing evidence that they host an intricate stellar population mix, likely indicative of a complex history of star formation and chemical enrichment. Several models have been proposed to explain the existence of multiple stellar populations in GCs, but no single model provides a fully satisfactory match to existing data. Correlations between chemistry and global parameters such as cluster mass or luminosity are fundamental clues to the physics of GC formation. In this Letter, we present an analysis of the mean abundances of Fe, Mg, C, N, and Ca for 72 old GCs from the Andromeda galaxy. We show for the first time that there is a correlation between the masses of GCs and the mean stellar abundances of nitrogen, spanning almost two decades in mass. This result sheds new light on the formation of GCs, providing important constraints on their internal chemical evolution and mass loss history

  20. Integrative cluster analysis in bioinformatics

    CERN Document Server

    Abu-Jamous, Basel; Nandi, Asoke K


    Clustering techniques are increasingly being put to use in the analysis of high-throughput biological datasets. Novel computational techniques to analyse high throughput data in the form of sequences, gene and protein expressions, pathways, and images are becoming vital for understanding diseases and future drug discovery. This book details the complete pathway of cluster analysis, from the basics of molecular biology to the generation of biological knowledge. The book also presents the latest clustering methods and clustering validation, thereby offering the reader a comprehensive review o

  1. Cluster analysis of track structure

    International Nuclear Information System (INIS)

    Michalik, V.


    One of the possibilities of classifying track structures is application of conventional partition techniques of analysis of multidimensional data to the track structure. Using these cluster algorithms this paper attempts to find characteristics of radiation reflecting the spatial distribution of ionizations in the primary particle track. An absolute frequency distribution of clusters of ionizations giving the mean number of clusters produced by radiation per unit of deposited energy can serve as this characteristic. General computation techniques used as well as methods of calculations of distributions of clusters for different radiations are discussed. 8 refs.; 5 figs

  2. Analysis of a PHWR slightly enriched fuel

    International Nuclear Information System (INIS)

    Notari, C.; Marajofsky, A.


    It is widely known that the use of slightly enriched uranium in PHWR reactors presents economic advantages derived from the fact that less uranium is required for producing the same amount of energy. Several studies related to the use of this alternative in Atucha I NPP have been performed. The fuel assembly geometry considered up to now has been almost identical to the natural uranium one. In this work a modification consisting of the use of annular pellets in the outer ring of the cluster is analyzed. This design produces several performance benefits. The redistribution of the power in the fuel improves the maximum to average bundle power ratio. The improvement achieved depends on the void volume in the pellets, which at the same time represents a certain burnup decrease. These parameters (power ratios and burnup loss) are quantified for the Atucha I and Embalse NPPs. This design improves the fuel behaviour with respect to the burnup extension derived from the slight enrichment. It is also of interest if an overall power increase is considered. (author). 16 refs, 8 figs, 1 tab

  3. Clusters of Antibiotic Resistance Genes Enriched Together Stay Together in Swine Agriculture. (United States)

    Johnson, Timothy A; Stedtfeld, Robert D; Wang, Qiong; Cole, James R; Hashsham, Syed A; Looft, Torey; Zhu, Yong-Guan; Tiedje, James M


    Antibiotic resistance is a worldwide health risk, but the influence of animal agriculture on the genetic context and enrichment of individual antibiotic resistance alleles remains unclear. Using quantitative PCR followed by amplicon sequencing, we quantified and sequenced 44 genes related to antibiotic resistance, mobile genetic elements, and bacterial phylogeny in microbiomes from U.S. laboratory swine and from swine farms from three Chinese regions. We identified highly abundant resistance clusters: groups of resistance and mobile genetic element alleles that cooccur. For example, the abundance of genes conferring resistance to six classes of antibiotics together with class 1 integrase and the abundance of IS6100-type transposons in three Chinese regions are directly correlated. These resistance cluster genes likely colocalize in microbial genomes in the farms. Resistance cluster alleles were dramatically enriched (up to 1 to 10% as abundant as 16S rRNA) and indicate that multidrug-resistant bacteria are likely the norm rather than an exception in these communities. This enrichment largely occurred independently of phylogenetic composition; thus, resistance clusters are likely present in many bacterial taxa. Furthermore, resistance clusters contain resistance genes that confer resistance to antibiotics independently of their particular use on the farms. Selection for these clusters is likely due to the use of only a subset of the broad range of chemicals to which the clusters confer resistance. The scale of animal agriculture and its wastes, the enrichment and horizontal gene transfer potential of the clusters, and the vicinity of large human populations suggest that managing this resistance reservoir is important for minimizing human risk. Agricultural antibiotic use results in clusters of cooccurring resistance genes that together confer resistance to multiple antibiotics. The use of a single antibiotic could select for an entire suite of resistance genes if

  4. Criticality analysis in uranium enrichment plant

    International Nuclear Information System (INIS)

    Okamoto, Tsuyoshi; Kiyose, Ryohei


    In a large-scale uranium enrichment plant, the uranium inventory in the cascade rooms is not very large, but the facilities dealing with the largest quantity of uranium in the process are the UF6 gas supply system and the blending system for controlling the product concentration. When UF6 spills out of these systems, the enriched uranium accumulates, and there is a danger of a criticality accident. If a NaF trap is placed at the front stage of the waste gas treatment system, large amounts of UF6 and HF are adsorbed together in the NaF trap. Thus, safety against criticality must be checked. Various assumptions were made to perform the computation surveying the criticality of the system composed of UF6 and HF adsorbed on NaF traps with the WIMS code (transport analysis). The minimum critical radius was about 53 cm for 3.5% enriched fuel for light water reactors. The optimum volume ratio of fissile material in the double salts UF6·2NaF and NaF·HF is about 40 vol. %. A criticality survey computation was also made for an annular NaF trap having a central cooling tube, and it was found that the cooling tube did not decrease the multiplication factor up to a cooling tube radius of about 5 cm. (Wakatsuki, Y.)

  5. NAA analysis of enriched Zn-68 by

    International Nuclear Information System (INIS)

    Rafii, H.; Mirzaei, M.; Mirzajani, N.; Sardari, D.; Shahabi, I.; Majedi, F.


    The extensive application of enriched isotopes in various fields of science and industry necessitates measuring their abundance by a precise and rapid method. Besides inductively coupled plasma mass spectrometry, thermal neutron activation analysis (NAA) is an alternative method, capable of determining trace amounts of elements as well as elemental abundance. In this article the enrichment of Zn-68 in two different samples has been studied by means of NAA. One sample was separated by an electromagnetic system in our center and the other was purchased from a French company, Cortecnet. The neutron irradiation took place in the MNSR reactor at a flux of 10^11 n/cm^2·s for 30 min, and the radioactivity produced from Zn-69m was measured one day after irradiation with an HPGe detector. The results show good agreement with the reported ones, and the low deviation of about ±3.05 indicates that NAA is a precise, rapid, and supplemental method for analyzing enriched Zn-68.

  6. Ranking metrics in gene set enrichment analysis: do they matter? (United States)

    Zyla, Joanna; Marczyk, Michal; Weiner, January; Polanska, Joanna


    There exist many methods for describing the complex relation between changes of gene expression in molecular pathways or gene ontologies under different experimental conditions. Among them, Gene Set Enrichment Analysis seems to be one of the most commonly used (over 10,000 citations). An important parameter, which could affect the final result, is the choice of a metric for the ranking of genes. Applying a default ranking metric may lead to poor results. In this work 28 benchmark data sets were used to evaluate the sensitivity and false positive rate of gene set analysis for 16 different ranking metrics including new proposals. Furthermore, the robustness of the chosen methods to sample size was tested. Using k-means clustering algorithm a group of four metrics with the highest performance in terms of overall sensitivity, overall false positive rate and computational load was established i.e. absolute value of Moderated Welch Test statistic, Minimum Significant Difference, absolute value of Signal-To-Noise ratio and Baumgartner-Weiss-Schindler test statistic. In case of false positive rate estimation, all selected ranking metrics were robust with respect to sample size. In case of sensitivity, the absolute value of Moderated Welch Test statistic and absolute value of Signal-To-Noise ratio gave stable results, while Baumgartner-Weiss-Schindler and Minimum Significant Difference showed better results for larger sample size. Finally, the Gene Set Enrichment Analysis method with all tested ranking metrics was parallelised and implemented in MATLAB, and is available at . Choosing a ranking metric in Gene Set Enrichment Analysis has critical impact on results of pathway enrichment analysis. The absolute value of Moderated Welch Test has the best overall sensitivity and Minimum Significant Difference has the best overall specificity of gene set analysis. When the number of non-normally distributed genes is high, using Baumgartner
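
    To make the notion of a ranking metric concrete, the sketch below scores each gene with the absolute signal-to-noise ratio and sorts genes by that score. The abstract does not give the exact formulas used in the study, so this assumes the form commonly associated with GSEA, |mean_A - mean_B| / (sd_A + sd_B), without the minimum-variance adjustments some implementations apply. The function names and the toy expression values are hypothetical.

```python
from statistics import mean, stdev


def abs_signal_to_noise(group_a, group_b):
    """|mean_a - mean_b| / (sd_a + sd_b) for one gene's expression values."""
    return abs(mean(group_a) - mean(group_b)) / (stdev(group_a) + stdev(group_b))


def rank_genes(expression, metric=abs_signal_to_noise):
    """Return gene names sorted from the highest to the lowest metric value."""
    scores = {gene: metric(a, b) for gene, (a, b) in expression.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy data: gene -> (condition A values, condition B values).
expression = {
    "geneA": ([5.0, 6.0, 7.0], [5.5, 6.5, 6.0]),
    "geneB": ([1.0, 2.0, 3.0], [7.0, 8.0, 9.0]),
    "geneC": ([4.0, 4.0, 5.0], [6.0, 7.0, 6.0]),
}
ranking = rank_genes(expression)
```

    Passing a different function as `metric` changes the ordering of the gene list, which is the input that a rank-based enrichment method then works from.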

  7. GOMA: functional enrichment analysis tool based on GO modules

    Institute of Scientific and Technical Information of China (English)

    Qiang Huang; Ling-Yun Wu; Yong Wang; Xiang-Sun Zhang


    Analyzing the function of gene sets is a critical step in interpreting the results of high-throughput experiments in systems biology. A variety of enrichment analysis tools have been developed in recent years, but most output a long list of significantly enriched terms that are often redundant, making it difficult to extract the most meaningful functions. In this paper, we present GOMA, a novel enrichment analysis method based on the new concept of enriched functional Gene Ontology (GO) modules. With this method, we systematically revealed functional GO modules, i.e., groups of functionally similar GO terms, via an optimization model and then ranked them by enrichment scores. Our new method simplifies enrichment analysis results by reducing redundancy, thereby preventing inconsistent enrichment results among functionally similar terms and providing more biologically meaningful results.
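
    For context on the term-by-term results that the abstract describes as long and redundant, the sketch below shows a generic over-representation test for a single term, based on the hypergeometric distribution. It is not GOMA's optimization model, which the abstract does not specify. The function name, parameter names and numbers are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_universe, n_term, n_selected, n_overlap):
    """P(X >= n_overlap) when n_selected genes are drawn without replacement
    from n_universe genes, of which n_term are annotated to the term."""
    upper = min(n_term, n_selected)
    total = comb(n_universe, n_selected)
    tail = sum(
        comb(n_term, k) * comb(n_universe - n_term, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical example: 20 genes in total, 5 annotated to the term,
# 5 genes selected by the experiment, 3 of which carry the annotation.
p_value = hypergeom_enrichment_p(n_universe=20, n_term=5, n_selected=5, n_overlap=3)
```

    Running such a test independently for every term yields correlated p-values for related terms, which is one reason enrichment output tends to contain redundant entries.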

  8. Network Analysis Tools: from biological networks to clusters and pathways. (United States)

    Brohée, Sylvain; Faust, Karoline; Lima-Mendez, Gipsi; Vanderstocken, Gilles; van Helden, Jacques


    Network Analysis Tools (NeAT) is a suite of computer tools that integrate various algorithms for the analysis of biological networks: comparison between graphs, between clusters, or between graphs and clusters; network randomization; analysis of degree distribution; network-based clustering and path finding. The tools are interconnected to enable a stepwise analysis of the network through a complete analytical workflow. In this protocol, we present a typical case of utilization, where the tasks above are combined to decipher a protein-protein interaction network retrieved from the STRING database. The results returned by NeAT are typically subnetworks, networks enriched with additional information (i.e., clusters or paths) or tables displaying statistics. Typical networks comprising several thousands of nodes and arcs can be analyzed within a few minutes. The complete protocol can be read and executed in approximately 1 h.

  9. Cluster analysis for portfolio optimization


    Vincenzo Tola; Fabrizio Lillo; Mauro Gallegati; Rosario N. Mantegna


    We consider the problem of the statistical uncertainty of the correlation matrix in the optimization of a financial portfolio. We show that the use of clustering algorithms can improve the reliability of the portfolio in terms of the ratio between predicted and realized risk. Bootstrap analysis indicates that this improvement is obtained in a wide range of the parameters N (number of assets) and T (investment horizon). The predicted and realized risk level and the relative portfolio compositi...

  10. Principal Angle Enrichment Analysis (PAEA): Dimensionally Reduced Multivariate Gene Set Enrichment Analysis Tool. (United States)

    Clark, Neil R; Szymkiewicz, Maciej; Wang, Zichen; Monteiro, Caroline D; Jones, Matthew R; Ma'ayan, Avi


    Gene set analysis of differential expression, which identifies collectively differentially expressed gene sets, has become an important tool for biology. The power of this approach lies in its reduction of the dimensionality of the statistical problem and its incorporation of biological interpretation by construction. Many approaches to gene set analysis have been proposed, but benchmarking their performance in the setting of real biological data is difficult due to the lack of a gold standard. In a previously published work we proposed a geometrical approach to differential expression which performed highly in benchmarking tests and compared well to the most popular methods of differential gene expression. As reported, this approach has a natural extension to gene set analysis which we call Principal Angle Enrichment Analysis (PAEA). PAEA employs dimensionality reduction and a multivariate approach for gene set enrichment analysis. However, the performance of this method has not been assessed nor its implementation as a web-based tool. Here we describe new benchmarking protocols for gene set analysis methods and find that PAEA performs highly. The PAEA method is implemented as a user-friendly web-based tool, which contains 70 gene set libraries and is freely available to the community.

  11. Effects of progressive resistance training combined with a protein-enriched lean red meat diet on health-related quality of life in elderly women: secondary analysis of a 4-month cluster randomised controlled trial. (United States)

    Torres, Susan J; Robinson, Sian; Orellana, Liliana; O'Connell, Stella L; Grimes, Carley A; Mundell, Niamh L; Dunstan, David W; Nowson, Caryl A; Daly, Robin M


    Resistance training (RT) and increased dietary protein are recommended to attenuate age-related muscle loss in the elderly. This study examined the effect of a lean red meat protein-enriched diet combined with progressive resistance training (RT+Meat) on health-related quality of life (HR-QoL) in elderly women. In this 4-month cluster randomised controlled trial, 100 women aged 60-90 years (mean 73 years) from self-care retirement villages participated in RT twice a week and were allocated either 160 g/d (cooked) lean red meat consumed across 2 meals/d, 6 d/week or ≥1 serving/d (25-30 g) carbohydrates (control group, CRT). HR-QoL (SF-36 Health Survey questionnaire), lower limb maximum muscle strength and lean tissue mass (LTM) (dual-energy X-ray absorptiometry) were assessed at baseline and 4 months. In all, ninety-one women (91 %) completed the study (RT+Meat (n 48); CRT (n 43)). Mean protein intake was greater in RT+Meat than CRT throughout the study (1·3 (sd 0·3) v. 1·1 (sd 0·3) g/kg per d, P<0·05). Exercise compliance (74 %) was not different between groups. After 4 months there was a significant net benefit in the RT+Meat compared with CRT group for overall HR-QoL and the physical component summary (PCS) score (P<0·01), but there were no changes in either group in the mental component summary (MCS) score. Changes in lower limb muscle strength, but not LTM, were positively associated with changes in overall HR-QoL (muscle strength, β: 2·2 (95 % CI 0·1, 4·3), P<0·05). In conclusion, a combination of RT and increased dietary protein led to greater net benefits in overall HR-QoL in elderly women compared with RT alone, which was because of greater improvements in PCS rather than MCS.


    Energy Technology Data Exchange (ETDEWEB)

    Petropoulou, V.; Vilchez, J.; Iglesias-Paramo, J. [Instituto de Astrofisica de Andalucia-C.S.I.C., Glorieta de la Astronomia, 18008 Granada (Spain)]


    In this paper, we study the chemical history of low-mass star-forming (SF) galaxies in the local universe clusters Coma, A1367, A779, and A634. The aim of this work is to search for the imprint of the environment on the chemical evolution of these galaxies. Galaxy chemical evolution is linked to the star formation history, as well as to the gas interchange with the environment, and low-mass galaxies are well known to be vulnerable systems to environmental processes affecting both these parameters. For our study we have used spectra from the SDSS-III DR8. We have examined the spectroscopic properties of SF galaxies of stellar masses 10^{8}-10^{10} M_{⊙}, located from the core to the cluster's outskirts. The gas-phase O/H and N/O chemical abundances have been derived using the latest empirical calibrations. We have examined the mass-metallicity relation of cluster galaxies, finding well-defined sequences. The slope of these sequences, for galaxies in low-mass clusters and galaxies at large cluster-centric distances, follows the predictions of recent hydrodynamic models. A flattening of this slope has been observed for galaxies located in the core of the two more massive clusters of the sample, principally in Coma, suggesting that the imprint of the cluster environment on the chemical evolution of SF galaxies should be sensitive to both the galaxy mass and the host cluster mass. The H I gas content of Coma and A1367 galaxies indicates that low-mass SF galaxies, located at the core of these clusters, have been severely affected by ram-pressure stripping (RPS). The observed mass-dependent enhancement of the metal content of low-mass galaxies in dense environments seems plausible, according to hydrodynamic simulations. This enhanced metal enrichment could be produced by the combination of effects such as wind reaccretion, due to pressure confinement by the intracluster medium (ICM), and the truncation of gas infall, as a result of the RPS. Thus, the


    International Nuclear Information System (INIS)

    Petropoulou, V.; Vílchez, J.; Iglesias-Páramo, J.


    In this paper, we study the chemical history of low-mass star-forming (SF) galaxies in the local universe clusters Coma, A1367, A779, and A634. The aim of this work is to search for the imprint of the environment on the chemical evolution of these galaxies. Galaxy chemical evolution is linked to the star formation history, as well as to the gas interchange with the environment, and low-mass galaxies are well known to be vulnerable systems to environmental processes affecting both these parameters. For our study we have used spectra from the SDSS-III DR8. We have examined the spectroscopic properties of SF galaxies of stellar masses 10^{8}-10^{10} M_{⊙}, located from the core to the cluster's outskirts. The gas-phase O/H and N/O chemical abundances have been derived using the latest empirical calibrations. We have examined the mass-metallicity relation of cluster galaxies, finding well-defined sequences. The slope of these sequences, for galaxies in low-mass clusters and galaxies at large cluster-centric distances, follows the predictions of recent hydrodynamic models. A flattening of this slope has been observed for galaxies located in the core of the two more massive clusters of the sample, principally in Coma, suggesting that the imprint of the cluster environment on the chemical evolution of SF galaxies should be sensitive to both the galaxy mass and the host cluster mass. The H I gas content of Coma and A1367 galaxies indicates that low-mass SF galaxies, located at the core of these clusters, have been severely affected by ram-pressure stripping (RPS). The observed mass-dependent enhancement of the metal content of low-mass galaxies in dense environments seems plausible, according to hydrodynamic simulations. This enhanced metal enrichment could be produced by the combination of effects such as wind reaccretion, due to pressure confinement by the intracluster medium (ICM), and the truncation of gas infall, as a result of the RPS. Thus, the properties of the ICM

  14. Concurrent formation of supermassive stars and globular clusters: implications for early self-enrichment (United States)

    Gieles, Mark; Charbonnel, Corinne; Krause, Martin G. H.; Hénault-Brunet, Vincent; Agertz, Oscar; Lamers, Henny J. G. L. M.; Bastian, Nathan; Gualandris, Alessia; Zocchi, Alice; Petts, James A.


    We present a model for the concurrent formation of globular clusters (GCs) and supermassive stars (SMSs, ≳ 10^{3} M⊙) to address the origin of the HeCNONaMgAl abundance anomalies in GCs. GCs form in converging gas flows and accumulate low-angular momentum gas, which accretes onto protostars. This leads to an adiabatic contraction of the cluster and an increase of the stellar collision rate. A SMS can form via runaway collisions if the cluster reaches sufficiently high density before two-body relaxation halts the contraction. This condition is met if the number of stars ≳ 10^{6} and the gas accretion rate ≳ 10^{5} M⊙/Myr, reminiscent of GC formation in high gas-density environments, such as - but not restricted to - the early Universe. The strong SMS wind mixes with the inflowing pristine gas, such that the protostars accrete diluted hot-hydrogen burning yields of the SMS. Because of continuous rejuvenation, the amount of processed material liberated by the SMS can be an order of magnitude higher than its maximum mass. This 'conveyor-belt' production of hot-hydrogen burning products provides a solution to the mass budget problem that plagues other scenarios. Additionally, the liberated material is mildly enriched in helium and relatively rich in other hot-hydrogen burning products, in agreement with abundances of GCs today. Finally, we find a super-linear scaling between the amount of processed material and cluster mass, providing an explanation for the observed increase of the fraction of processed material with GC mass. We discuss open questions of this new GC enrichment scenario and propose observational tests.

  15. IPAD: the Integrated Pathway Analysis Database for Systematic Enrichment Analysis. (United States)

    Zhang, Fan; Drabier, Renee


    Next-Generation Sequencing (NGS) technologies and Genome-Wide Association Studies (GWAS) generate millions of reads and hundreds of datasets, and there is an urgent need for a better way to accurately interpret and distill such large amounts of data. Extensive pathway and network analysis allow for the discovery of highly significant pathways from a set of disease vs. healthy samples in the NGS and GWAS. Knowledge of activation of these processes will lead to elucidation of the complex biological pathways affected by drug treatment, to patient stratification studies of new and existing drug treatments, and to understanding the underlying anti-cancer drug effects. There are approximately 141 biological human pathway resources as of Jan 2012 according to the Pathguide database. However, most currently available resources do not contain disease, drug or organ specificity information such as disease-pathway, drug-pathway, and organ-pathway associations. Systematically integrating pathway, disease, drug and organ specificity together becomes increasingly crucial for understanding the interrelationships between signaling, metabolic and regulatory pathway, drug action, disease susceptibility, and organ specificity from high-throughput omics data (genomics, transcriptomics, proteomics and metabolomics). We designed the Integrated Pathway Analysis Database for Systematic Enrichment Analysis (IPAD), defining inter-association between pathway, disease, drug and organ specificity, based on six criteria: 1) comprehensive pathway coverage; 2) gene/protein to pathway/disease/drug/organ association; 3) inter-association between pathway, disease, drug, and organ; 4) multiple and quantitative measurement of enrichment and inter-association; 5) assessment of enrichment and inter-association analysis with the context of the existing biological knowledge and a "gold standard" constructed from reputable and reliable sources; and 6) cross-linking of

  16. Data analysis for neutron monitoring in an enrichment facility

    International Nuclear Information System (INIS)

    Markin, J.T.; Stewart, J.E.; Goldman, A.S.


    Area monitoring of neutron radiation to detect high-enriched uranium production is a potential strategy for inspector verification of operations in the cascade area of a centrifuge enrichment facility. This paper discusses the application of statistical filtering and hypothesis testing procedures to experimental data taken in an enrichment facility. The results demonstrate that these data analysis methods can enhance detection of facility misoperation by neutron monitoring

  17. The s-process enrichment of the globular clusters M4 and M22

    Energy Technology Data Exchange (ETDEWEB)

    Shingles, Luke J.; Karakas, Amanda I.; Fishlock, Cherie K.; Yong, David; Da Costa, Gary S.; Marino, Anna F. [Research School of Astronomy and Astrophysics, Australian National University, Canberra, ACT 2611 (Australia)]; Hirschi, Raphael [Institute for the Physics and Mathematics of the Universe (WPI), University of Tokyo, 5-1-5 Kashiwanoha, 277-8583 Kashiwa (Japan)]


    We investigate the enrichment in elements produced by the slow neutron-capture process (s-process) in the globular clusters M4 (NGC 6121) and M22 (NGC 6656). Stars in M4 have homogeneous abundances of Fe and neutron-capture elements, but the entire cluster is enhanced in s-process elements (Sr, Y, Ba, Pb) relative to other clusters with a similar metallicity. In M22, two stellar groups exhibit different abundances of Fe and s-process elements. By subtracting the mean abundances of s-poor from s-rich stars, we derive s-process residuals or empirical s-process distributions for M4 and M22. We find that the s-process distribution in M22 is more weighted toward the heavy s-peak (Ba, La, Ce) and Pb than M4, which has been enriched mostly with light s-peak elements (Sr, Y, Zr). We construct simple chemical evolution models using yields from massive star models that include rotation, which dramatically increases s-process production at low metallicity. We show that our massive star models with rotation rates of up to 50% of the critical (break-up) velocity and changes to the preferred ^{17}O(α, γ)^{21}Ne rate produce insufficient heavy s-elements and Pb to match the empirical distributions. For models that incorporate asymptotic giant branch yields, we find that intermediate-mass yields (with a ^{22}Ne neutron source) alone do not reproduce the light-to-heavy s-element ratios for M4 and M22, and that a small contribution from models with a ^{13}C pocket is required. With our assumption that ^{13}C pockets form for initial masses below a transition range between 3.0 and 3.5 M_{⊙}, we match the light-to-heavy s-element ratio in the s-process residual of M22 and predict a minimum enrichment timescale of between 240 and 360 Myr. Our predicted value is consistent with the 300 Myr upper limit age difference between the two groups derived from isochrone fitting.

  18. Comments on Smith Barney's uranium enrichment analysis

    International Nuclear Information System (INIS)

    Rezendes, V.S.


    In a May 1990 report, Smith Barney, Harris Upham and Co. concluded that DOE's uranium enrichment program should be restructured as a government corporation; all past costs have been recovered, and DOE's customers have been overcharged about $1.2 billion; the government should retain responsibility for environment and decommissioning costs associated with enriched uranium production before the corporation's formation; and at some future time the corporation could be sold to the private sector. This report agrees with Smith Barney's recommendation to restructure the enrichment program as a government corporation, but disagrees that DOE's customers have paid for all past costs. According to the author, Smith Barney did not identify the total environmental or decommissioning costs between the government and the corporation. Since these costs are largely undefined, but could amount to billions, Congress should immediately require the program to begin setting aside funds for these costs. DOE estimates that government purchases are responsible for 50 percent of the decommissioning costs; therefore, the government should share these costs by matching the corporation's fund contributions. This requirement should continue until the existing plants have been decommissioned

  19. Potential for supernova-induced chemical enrichment of protoglobular cluster clouds

    International Nuclear Information System (INIS)

    Dopita, M.A.; Smith, G.H. (Dominion Astrophysical Observatory, Victoria, Canada)


    This paper seeks to explain the large internal abundance variations that are seen in the globular cluster Omega Cen in terms of supernova-induced chemical enrichment that occurred when the cluster was still largely in a gaseous phase and star formation was continuing. Using a simple power-law density model of this protoglobular gas cloud, the conditions under which this can occur have been established analytically. Clouds less massive than about 100,000 solar masses are completely disrupted by supernova explosions in their adiabatic phase. In clouds of greater mass, supernova explosions occurring near the tidal radius tend to lose their hot gas and metals to the intercloud medium. For explosions occurring closer to the mass center the ejecta must be slowed below the escape velocity, and this can only occur in clouds more massive than about 3 x 10 to the 6th solar masses. If this condition is met, then the slow isothermal momentum-conserving shocks generated by the supernova explosions may eventually induce secondary star formation. For such shocks converging on the mass center, it is found that a cloud mass of at least 10 to the 7th solar masses is required for this process to be efficient. From the observed properties of Omega Cen, a primordial mass of order 10 to the 8th solar masses is estimated, which emphasizes the unusual character of this object. 39 references

  20. Cluster analysis in phenotyping a Portuguese population. (United States)

    Loureiro, C C; Sa-Couto, P; Todo-Bom, A; Bousquet, J


    Unbiased cluster analysis using clinical parameters has identified asthma phenotypes. Adding inflammatory biomarkers to this analysis provided a better insight into the disease mechanisms. This approach has not yet been applied to asthmatic Portuguese patients. To identify phenotypes of asthma using cluster analysis in a Portuguese asthmatic population treated in secondary medical care. Consecutive patients with asthma were recruited from the outpatient clinic. Patients were optimally treated according to GINA guidelines and enrolled in the study. Procedures were performed according to a standard evaluation of asthma. Phenotypes were identified by cluster analysis using Ward's clustering method. Of the 72 patients enrolled, 57 had full data and were included for cluster analysis. Distribution was set in 5 clusters described as follows: cluster (C) 1, early onset mild allergic asthma; C2, moderate allergic asthma, with long evolution, female prevalence and mixed inflammation; C3, allergic brittle asthma in young females with early disease onset and no evidence of inflammation; C4, severe asthma in obese females with late disease onset, highly symptomatic despite low Th2 inflammation; C5, severe asthma with chronic airflow obstruction, late disease onset and eosinophilic inflammation. In our study population, the identified clusters were mainly coincident with other larger-scale cluster analysis. Variables such as age at disease onset, obesity, lung function, FeNO (Th2 biomarker) and disease severity were important for cluster distinction. Copyright © 2015. Published by Elsevier España, S.L.U.
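    Ward's clustering method, named in the record above, repeatedly merges the two clusters whose union gives the smallest increase in within-cluster variance. The following is a minimal sketch of that criterion in plain Python. The function name `ward_clusters` and the two-dimensional toy points are assumptions made for illustration; they are not the study's clinical variables or code.

```python
def ward_clusters(points, k):
    """Agglomerate points into k clusters with Ward's minimum-variance rule."""
    clusters = [[p] for p in points]

    def centroid(cluster):
        dim = len(cluster[0])
        return [sum(p[d] for p in cluster) / len(cluster) for d in range(dim)]

    def merge_cost(a, b):
        # Increase in within-cluster sum of squares if a and b are merged.
        ca, cb = centroid(a), centroid(b)
        sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
        return len(a) * len(b) / (len(a) + len(b)) * sq_dist

    while len(clusters) > k:
        pairs = ((i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters)))
        i, j = min(pairs, key=lambda ij: merge_cost(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical toy data: two well-separated groups of three points each.
toy = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
result = ward_clusters(toy, 2)
```

    This naive version recomputes centroids at every step and is meant only to show the merge rule; production work would normally rely on an established statistics library.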

  1. The origin of ICM enrichment in the outskirts of present-day galaxy clusters from cosmological hydrodynamical simulations (United States)

    Biffi, V.; Planelles, S.; Borgani, S.; Rasia, E.; Murante, G.; Fabjan, D.; Gaspari, M.


    The uniformity of the intracluster medium (ICM) enrichment level in the outskirts of nearby galaxy clusters suggests that chemical elements were deposited and widely spread into the intergalactic medium before the cluster formation. This observational evidence is supported by numerical findings from cosmological hydrodynamical simulations, as presented in Biffi et al., including the effect of thermal feedback from active galactic nuclei. Here, we further investigate this picture, by tracing back in time the spatial origin and metallicity evolution of the gas residing at z = 0 in the outskirts of simulated galaxy clusters. In these regions, we find a large distribution of iron abundances, including a component of highly enriched gas, already present at z = 2. At z > 1, the gas in the present-day outskirts was distributed over tens of virial radii from the main cluster and had been already enriched within high-redshift haloes. At z = 2, about 40 per cent of the most Fe-rich gas at z = 0 was not residing in any halo more massive than 10^{11} h^{-1} M_{⊙} in the region and yet its average iron abundance was already 0.4, w.r.t. the solar value by Anders & Grevesse. This confirms that the in situ enrichment of the ICM in the outskirts of present-day clusters does not play a significant role, and its uniform metal abundance is rather the consequence of the accretion of both low-metallicity and pre-enriched (at z > 2) gas, from the diffuse component and through merging substructures. These findings do not depend on the mass of the cluster nor on its core properties.

  2. Mass-invariance of the iron enrichment in the hot haloes of massive ellipticals, groups, and clusters of galaxies (United States)

    Mernier, F.; de Plaa, J.; Werner, N.; Kaastra, J. S.; Raassen, A. J. J.; Gu, L.; Mao, J.; Urdampilleta, I.; Truong, N.; Simionescu, A.


    X-ray measurements find systematically lower Fe abundances in the X-ray emitting haloes pervading groups (kT ≲ 1.7 keV) than in clusters of galaxies. These results have been difficult to reconcile with theoretical predictions. However, models using incomplete atomic data or the assumption of isothermal plasmas may have biased the best fit Fe abundance in groups and giant elliptical galaxies low. In this work, we take advantage of a major update of the atomic code in the spectral fitting package SPEX to re-evaluate the Fe abundance in 43 clusters, groups, and elliptical galaxies (the CHEERS sample) in a self-consistent analysis and within a common radius of 0.1 r_{500}. For the first time, we report a remarkably similar average Fe enrichment in all these systems. Unlike previous results, this strongly suggests that metals are synthesised and transported in these haloes with the same average efficiency across two orders of magnitude in total mass. We show that the previous metallicity measurements in low temperature systems were biased low due to incomplete atomic data in the spectral fitting codes. The reasons for such a code-related Fe bias, also implying previously unconsidered biases in the emission measure and temperature structure, are discussed.

  3. Hierarchical Aligned Cluster Analysis for Temporal Clustering of Human Motion. (United States)

    Zhou, Feng; De la Torre, Fernando; Hodgins, Jessica K


    Temporal segmentation of human motion into plausible motion primitives is central to understanding and building computational models of human motion. Several issues contribute to the challenge of discovering motion primitives: the exponential nature of all possible movement combinations, the variability in the temporal scale of human actions, and the complexity of representing articulated motion. We pose the problem of learning motion primitives as one of temporal clustering, and derive an unsupervised hierarchical bottom-up framework called hierarchical aligned cluster analysis (HACA). HACA finds a partition of a given multidimensional time series into m disjoint segments such that each segment belongs to one of k clusters. HACA combines kernel k-means with the generalized dynamic time alignment kernel to cluster time series data. Moreover, it provides a natural framework to find a low-dimensional embedding for time series. HACA is efficiently optimized with a coordinate descent strategy and dynamic programming. Experimental results on motion capture and video data demonstrate the effectiveness of HACA for segmenting complex motions and as a visualization tool. We also compare the performance of HACA to state-of-the-art algorithms for temporal clustering on data of a honey bee dance. The HACA code is available online.
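    The record above relies on a dynamic time alignment kernel to compare motion segments of different durations. As a simpler stand-in, not HACA itself, the sketch below computes the classic dynamic time warping (DTW) distance with dynamic programming, which illustrates the same idea of aligning sequences that differ in temporal scale. The function name `dtw_distance` and the one-dimensional toy sequences are assumptions for illustration.

```python
def dtw_distance(a, b):
    """Classic DTW distance between two 1-D sequences (absolute-difference cost)."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best alignment cost of a[:i] against b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j],      # a advances, b repeats
                                    cost[i][j - 1],      # b advances, a repeats
                                    cost[i - 1][j - 1])  # both advance
    return cost[n][m]


# Hypothetical example: the second sequence is a time-stretched copy of the first.
original = [0, 1, 2, 1, 0]
stretched = [0, 0, 1, 2, 2, 1, 0]
stretch_cost = dtw_distance(original, stretched)
```

    Because warping absorbs the repeated samples, the stretched copy aligns at zero cost, whereas a sample-by-sample comparison would not even be defined for sequences of unequal length.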

  4. Root cause analysis with enriched process logs

    NARCIS (Netherlands)

    Suriadi, S.; Ouyang, C.; Aalst, van der W.M.P.; Hofstede, ter A.H.M.; La Rosa, M.; Soffer, P.


    In the field of process mining, the use of event logs for the purpose of root cause analysis is increasingly studied. In such an analysis, the availability of attributes/features that may explain the root cause of some phenomena is crucial. Currently, the process of obtaining these attributes from

  5. Magnesium isotopes: a tool to understand self-enrichment in globular clusters (United States)

    Ventura, P.; D'Antona, F.; Imbriani, G.; Di Criscienzo, M.; Dell'Agli, F.; Tailo, M.


    A critical issue in the asymptotic giant branch (AGB) self-enrichment scenario for the formation of multiple populations in globular clusters (GCs) is the inability to reproduce the magnesium isotopic ratios, even though the model can in principle account for the depletion of magnesium. In this work, we analyse how the uncertainties on the various p-capture cross sections affect the results related to the magnesium content of the ejecta of AGB stars. The observed distribution of the magnesium isotopes and of the overall Mg-Al trend in M13 and NGC 6752 are successfully reproduced when the proton-capture rate by ^{25}Mg at temperatures of ~100 MK, in particular the ^{25}Mg(p, γ)^{26}Al^{m} channel, is enhanced by a factor ~3 with respect to the most recent experimental determinations. This assumption also allows us to reproduce the full extent of the Mg spread and the Mg-Si anticorrelation observed in NGC 2419. The uncertainties in the rate of the ^{25}Mg(p, γ)^{26}Al^{m} reaction at the temperatures of interest here leave room for our assumption and we suggest that new experimental measurements are needed to settle this problem. We also discuss the competitive model based on the supermassive star nucleosynthesis.

  6. Robust cluster analysis and variable selection

    CERN Document Server

    Ritter, Gunter


    Clustering remains a vibrant area of research in statistics. Although there are many books on this topic, there are relatively few that are well founded in the theoretical aspects. In Robust Cluster Analysis and Variable Selection, Gunter Ritter presents an overview of the theory and applications of probabilistic clustering and variable selection, synthesizing the key research results of the last 50 years. The author focuses on the robust clustering methods he found to be the most useful on simulated data and real-time applications. The book provides clear guidance for the varying needs of bot

  7. Exact WKB analysis and cluster algebras

    International Nuclear Information System (INIS)

    Iwaki, Kohei; Nakanishi, Tomoki


    We develop the mutation theory in the exact WKB analysis using the framework of cluster algebras. Under a continuous deformation of the potential of the Schrödinger equation on a compact Riemann surface, the Stokes graph may change the topology. We call this phenomenon the mutation of Stokes graphs. Along the mutation of Stokes graphs, the Voros symbols, which are monodromy data of the equation, also mutate due to the Stokes phenomenon. We show that the Voros symbols mutate as variables of a cluster algebra with surface realization. As an application, we obtain the identities of Stokes automorphisms associated with periods of cluster algebras. The paper also includes an extensive introduction of the exact WKB analysis and the surface realization of cluster algebras for nonexperts. This article is part of a special issue of Journal of Physics A: Mathematical and Theoretical devoted to ‘Cluster algebras in mathematical physics’. (paper)

  8. Separate enrichment analysis of pathways for up- and downregulated genes. (United States)

    Hong, Guini; Zhang, Wenjing; Li, Hongdong; Shen, Xiaopei; Guo, Zheng


    Two strategies are often adopted for enrichment analysis of pathways: the analysis of all differentially expressed (DE) genes together or the analysis of up- and downregulated genes separately. However, few studies have examined the rationales of these enrichment analysis strategies. Using both microarray and RNA-seq data, we show that gene pairs with functional links in pathways tended to have positively correlated expression levels, which could result in an imbalance between the up- and downregulated genes in particular pathways. We then show that the imbalance could greatly reduce the statistical power for finding disease-associated pathways through the analysis of all-DE genes. Further, using gene expression profiles from five types of tumours, we illustrate that the separate analysis of up- and downregulated genes could identify more pathways that are really pertinent to phenotypic difference. In conclusion, analysing up- and downregulated genes separately is more powerful than analysing all of the DE genes together.
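    The effect described in the record above can be illustrated with a one-sided hypergeometric (over-representation) test, a common choice for pathway enrichment; the record does not state which test the authors used, so this is an assumption. The function names, gene identifiers and set sizes below are hypothetical toy values, not data from the study.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when drawing n genes from N, of which K belong to the pathway."""
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return hits / total


def enrichment_p(gene_list, pathway, background):
    """Over-representation p-value of a pathway within a gene list."""
    genes = set(gene_list) & set(background)
    members = set(pathway) & set(background)
    return hypergeom_sf(len(genes & members), len(background), len(members), len(genes))


# Hypothetical toy data: the pathway is hit only by upregulated genes.
background = [f"g{i}" for i in range(100)]
pathway = [f"g{i}" for i in range(10)]
up = [f"g{i}" for i in range(6)] + [f"g{i}" for i in range(50, 54)]
down = [f"g{i}" for i in range(60, 70)]

p_up = enrichment_p(up, pathway, background)
p_down = enrichment_p(down, pathway, background)
p_all = enrichment_p(up + down, pathway, background)
```

    In this toy case the pathway's six hits all come from the upregulated list, so pooling in ten unrelated downregulated genes dilutes the signal and `p_all` is larger than `p_up`, which mirrors the loss of power the abstract attributes to imbalanced pathways.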

  9. The mechanism of solute-enriched clusters formation in neutron-irradiated pressure vessel steels: The case of Fe-Cu model alloys

    Energy Technology Data Exchange (ETDEWEB)

    Subbotin, A.V. [Scientific and Production Complex Atomtechnoprom, Moscow 119180 (Russian Federation)]; Panyukov, S.V. [PN Lebedev Physics Institute, Russian Academy of Sciences, Moscow 117924 (Russian Federation)]


    A mechanism of solute-enriched cluster formation in neutron-irradiated pressure vessel steels is proposed and developed for the case of Fe-Cu model alloys. The suggested solute-drag mechanism is analogous to the well-known zone-refining process. We show that the obtained results are in good agreement with available experimental data on the parameters of clusters enriched with the alloying elements. Our model explains why the formation of solute-enriched clusters does not happen in austenitic stainless steels with fcc lattice structure. It also allows quantification of the method for evaluating the neutron irradiation dose in the hardening of RPV steels.

  10. An automated solution enrichment system for uranium analysis

    International Nuclear Information System (INIS)

    Jones, S.A.; Sparks, R.; Sampson, T.; Parker, J.; Horley, E.; Kelly, T.


    An automated Solution Enrichment System (SES) for analysis of uranium and U-235 isotopes in process samples has been developed through a joint effort between Los Alamos National Laboratory and Martin Marietta Energy Systems, Portsmouth Gaseous Diffusion Plant. This device features an advanced robotics system which, in conjunction with stabilized passive gamma-ray and X-ray fluorescence detectors, provides for rapid, non-destructive analyses of process samples for improved special nuclear material accountability and process control

  11. Cluster analysis of obesity and asthma phenotypes.

    Directory of Open Access Journals (Sweden)

    E Rand Sutherland

    Asthma is a heterogeneous disease with variability among patients in characteristics such as lung function, symptoms and control, body weight, markers of inflammation, and responsiveness to glucocorticoids (GC). Cluster analysis of well-characterized cohorts can advance understanding of disease subgroups in asthma and point to unsuspected disease mechanisms. We utilized a hypothesis-free cluster analytical approach to define the contribution of obesity and related variables to asthma phenotype. In a cohort of clinical trial participants (n = 250), minimum-variance hierarchical clustering was used to identify clinical and inflammatory biomarkers important in determining disease cluster membership in mild and moderate persistent asthmatics. In a subset of participants, GC sensitivity was assessed via expression of GC receptor alpha (GCRα) and induction of MAP kinase phosphatase-1 (MKP-1) expression by dexamethasone. Four asthma clusters were identified, with body mass index (BMI, kg/m^{2}) and severity of asthma symptoms (AEQ score) the most significant determinants of cluster membership (F = 57.1, p<0.0001 and F = 44.8, p<0.0001, respectively). Two clusters were composed of predominantly obese individuals; these two obese asthma clusters differed from one another with regard to age of asthma onset, measures of asthma symptoms (AEQ) and control (ACQ), exhaled nitric oxide concentration (F_{ENO}) and airway hyperresponsiveness (methacholine PC_{20}) but were similar with regard to measures of lung function (FEV_{1} (%) and FEV_{1}/FVC), airway eosinophilia, IgE, leptin, adiponectin and C-reactive protein (hsCRP). Members of obese clusters demonstrated evidence of reduced expression of GCRα, a finding which was correlated with a reduced induction of MKP-1 expression by dexamethasone. Obesity is an important determinant of asthma phenotype in adults. There is heterogeneity in expression of clinical and inflammatory biomarkers of asthma across obese individuals

  12. Quantitative mass spectrometric analysis of glycoproteins combined with enrichment methods. (United States)

    Ahn, Yeong Hee; Kim, Jin Young; Yoo, Jong Shin


    Mass spectrometry (MS) has been a core technology for highly sensitive and high-throughput analysis of the enriched glycoproteome, in quantitative assays as well as qualitative profiling of glycoproteins. Because it has been widely recognized that aberrant glycosylation in a glycoprotein may be involved in the progression of a certain disease, the development of efficient analysis tools for aberrant glycoproteins is very important for a deeper understanding of the pathological function of the glycoprotein and for new biomarker development. This review first describes the protein glycosylation-targeting enrichment technologies, mainly employing solid-phase extraction methods such as hydrazide capturing, lectin-specific capturing, and affinity separation techniques based on porous graphitized carbon, hydrophilic interaction chromatography, or immobilized boronic acid. Second, MS-based quantitative analysis strategies coupled with the protein glycosylation-targeting enrichment technologies, using label-free MS, stable isotope labeling, or targeted multiple reaction monitoring (MRM) MS, are summarized with recently published studies. © 2014 The Authors. Mass Spectrometry Reviews Published by Wiley Periodicals, Inc.

  13. Are clusters of dietary patterns and cluster membership stable over time? Results of a longitudinal cluster analysis study. (United States)

    Walthouwer, Michel Jean Louis; Oenema, Anke; Soetens, Katja; Lechner, Lilian; de Vries, Hein


    Developing nutrition education interventions based on clusters of dietary patterns can only be done adequately when it is clear if distinctive clusters of dietary patterns can be derived and reproduced over time, if cluster membership is stable, and if it is predictable which type of people belong to a certain cluster. Hence, this study aimed to: (1) identify clusters of dietary patterns among Dutch adults, (2) test the reproducibility of these clusters and stability of cluster membership over time, and (3) identify sociodemographic predictors of cluster membership and cluster transition. This study had a longitudinal design with online measurements at baseline (N=483) and 6 months follow-up (N=379). Dietary intake was assessed with a validated food frequency questionnaire. A hierarchical cluster analysis was performed, followed by a K-means cluster analysis. Multinomial logistic regression analyses were conducted to identify the sociodemographic predictors of cluster membership and cluster transition. At baseline and follow-up, a comparable three-cluster solution was derived, distinguishing a healthy, moderately healthy, and unhealthy dietary pattern. Male and lower educated participants were significantly more likely to have a less healthy dietary pattern. Further, 251 (66.2%) participants remained in the same cluster, 45 (11.9%) participants changed to an unhealthier cluster, and 83 (21.9%) participants shifted to a healthier cluster. Men and people living alone were significantly more likely to shift toward a less healthy dietary pattern. Distinctive clusters of dietary patterns can be derived. Yet, cluster membership is unstable and only few sociodemographic factors were associated with cluster membership and cluster transition. These findings imply that clusters based on dietary intake may not be suitable as a basis for nutrition education interventions. Copyright © 2014 Elsevier Ltd. All rights reserved.
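
    The cluster-transition figures quoted above can be checked with a few lines of arithmetic. This is only a consistency sketch: the three counts and the follow-up sample size (N=379) come from the abstract, while the dictionary labels are illustrative names chosen here.

```python
# Counts of follow-up participants by cluster transition, as quoted in the abstract.
transitions = {
    "same cluster": 251,
    "unhealthier cluster": 45,
    "healthier cluster": 83,
}

# The three groups should add up to the follow-up sample.
followed_up = sum(transitions.values())

# Percentage share of each transition type, rounded to one decimal place.
shares = {
    label: round(100 * count / followed_up, 1)
    for label, count in transitions.items()
}

for label, share in shares.items():
    print(f"{label}: {share}%")
```

    The shares are computed against the follow-up sample rather than the baseline sample, which is the denominator the reported percentages imply.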

  14. Isotopic analysis of uranium hexafluoride highly enriched in U-235

    International Nuclear Information System (INIS)

    Chaussy, L.; Boyer, R.


    Isotopic analysis of uranium in the form of the hexafluoride by mass spectrometry gives gross results which are not very accurate. Using a linear interpolation method applied to two standards it is possible to correct for this inaccuracy as long as the isotopic concentrations are less than about 10 per cent in U-235. Above this level, the interpolation formula overestimates the results, especially if the enrichment of the analyzed samples is higher than 1.3 with respect to the standards. A formula is proposed for correcting the interpolation equation and for extending its field of application to high values of the enrichment (≅2) and of the concentration. It is shown that by using this correction the results obtained have an accuracy which depends practically only on that of the standards, taking into account the dispersion in the measurements. (authors)

  15. Factor Analysis for Clustered Observations. (United States)

    Longford, N. T.; Muthen, B. O.


    A two-level model for factor analysis is defined, and formulas for a scoring algorithm for this model are derived. A simple noniterative method based on decomposition of total sums of the squares and cross-products is discussed and illustrated with simulated data and data from the Second International Mathematics Study. (SLD)

  16. LISSAT Analysis of a Generic Centrifuge Enrichment Plant

    International Nuclear Information System (INIS)

    Lambert, H; Elayat, H A; O'Connell, W J; Szytel, L; Dreicer, M


    The U.S. Department of Energy (DOE) is interested in developing tools and methods for use in designing and evaluating safeguards systems for current and future plants in the nuclear power fuel cycle. The DOE is engaging several DOE National Laboratories in efforts applied to safeguards for chemical conversion plants and gaseous centrifuge enrichment plants. As part of the development, Lawrence Livermore National Laboratory has developed an integrated safeguards system analysis tool (LISSAT). This tool provides modeling and analysis of facility and safeguards operations, generation of diversion paths, and evaluation of safeguards system effectiveness. The constituent elements of diversion scenarios, including material extraction and concealment measures, are structured using directed graphs (digraphs) and fault trees. Statistical analysis evaluates the effectiveness of measurement verification plans and randomly timed inspections. Time domain simulations analyze significant scenarios, especially those involving alternate time ordering of events or issues of timeliness. Such simulations can provide additional information to the fault tree analysis and can help identify the range of normal operations and, by extension, identify additional plant operational signatures of diversions. LISSAT analyses can be used to compare the diversion-detection probabilities for individual safeguards technologies and to inform overall strategy implementations for present and future plants. Additionally, LISSAT can be the basis for a rigorous cost-effectiveness analysis of safeguards and design options. This paper will describe the results of a LISSAT analysis of a generic centrifuge enrichment plant. The paper will describe the diversion scenarios analyzed and the effectiveness of various safeguards systems alternatives

  17. Enriched gas in clusters and the dynamics of galaxies and clusters: implications for theories of galaxy formation

    International Nuclear Information System (INIS)

    Binney, J.; Silk, J.


    Recent developments in relation to the origin of galaxies are cited: the discovery that the intergalactic medium which seems to pervade rich clusters of galaxies has an iron abundance that lies within an order of magnitude of the solar value; the discovery that elliptical galaxies rotate much more slowly than the models of these galaxies had predicted; and the results of studies of cosmological infall in the context of the formation of galaxies and galaxy clusters, which have shown that the resulting density profile is fairly insensitive to initial conditions. After discussing the implications of these recent observations of X-ray clusters and of the rotation of elliptical galaxies, an attempt is made to construct a picture of the formation of elliptical and spiral galaxies in which galaxies form continuously from redshift z approximately 100 onwards. It is suggested that at a redshift z of roughly 5, a fundamental change occurred in the manner in which the cosmic material fragmented into stellar objects. It seems possible that explanations of a variety of puzzling aspects of galactic evolution, including the formation of Population I disks, the origin of the hot intracluster gas, the mass-to-light ratio stratification of galaxies, and the nature of the galaxy luminosity function, should all be sought in the context of this change of regime. Some remarks are made about gas in poor groups of galaxies and the interaction of disk galaxies with their environments. (U.K.)

  18. Cluster analysis for determining distribution center location (United States)

    Lestari Widaningrum, Dyah; Andika, Aditya; Murphiyanto, Richard Dimas Julian


    Determining the location of distribution facilities is highly important for surviving the high level of competition in today’s business world. Companies can operate multiple distribution centers to mitigate supply chain risk. Thus, new problems arise, namely how many facilities should be provided and where. This study examines a fast-food restaurant brand located in Greater Jakarta. This brand is among the top 5 fast-food restaurant chains based on retail sales. There were three stages in this study: compiling spatial data, cluster analysis, and network analysis. Cluster analysis results are used to consider the location of the additional distribution center. Network analysis results show a more efficient process, with a shorter distance in the distribution process.

  19. Using Enrichment Clusters to Address the Needs of Culturally and Linguistically Diverse Learners (United States)

    Allen, Jennifer K.; Robbins, Margaret A.; Payne, Yolanda Denise; Brown, Katherine Backes


    Using data from teacher interviews, classroom observations, and a professional development workshop, this article explains how one component of the schoolwide enrichment model (SEM) has been implemented at a culturally diverse elementary school serving primarily Latina/o and African American students. Based on a broadened conception of giftedness,…

  20. Changing cluster composition in cluster randomised controlled trials: design and analysis considerations (United States)


    Background There are many methodological challenges in the conduct and analysis of cluster randomised controlled trials, but one that has received little attention is that of post-randomisation changes to cluster composition. To illustrate this, we focus on the issue of cluster merging, considering the impact on the design, analysis and interpretation of trial outcomes. Methods We explored the effects of merging clusters on study power using standard methods of power calculation. We assessed the potential impacts on study findings of both homogeneous cluster merges (involving clusters randomised to the same arm of a trial) and heterogeneous merges (involving clusters randomised to different arms of a trial) by simulation. To determine the impact on bias and precision of treatment effect estimates, we applied standard methods of analysis to different populations under analysis. Results Cluster merging produced a systematic reduction in study power. This effect depended on the number of merges and was most pronounced when variability in cluster size was at its greatest. Simulations demonstrate that the impact on analysis was minimal when cluster merges were homogeneous, with impact on study power being balanced by a change in observed intracluster correlation coefficient (ICC). We found a decrease in study power when cluster merges were heterogeneous, and the estimate of treatment effect was attenuated. Conclusions Examples of cluster merges found in previously published reports of cluster randomised trials were typically homogeneous rather than heterogeneous. Simulations demonstrated that trial findings in such cases would be unbiased. However, simulations also showed that any heterogeneous cluster merges would introduce bias that would be hard to quantify, as well as having negative impacts on the precision of estimates obtained. Further methodological development is warranted to better determine how to analyse such trials appropriately. Interim recommendations
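
    The reported loss of power when clusters merge can be illustrated with a design-effect calculation. The sketch below assumes a commonly used approximation for unequal cluster sizes, DE = 1 + ((cv^2 + 1) * m - 1) * ICC, where m is the mean cluster size and cv its coefficient of variation; the abstract does not say which power formula the authors used, and the function names, cluster sizes and ICC value are hypothetical.

```python
from statistics import mean, pstdev


def design_effect(cluster_sizes, icc):
    """Approximate design effect for unequal cluster sizes (assumed formula)."""
    m_bar = mean(cluster_sizes)
    cv = pstdev(cluster_sizes) / m_bar  # coefficient of variation of cluster size
    return 1 + ((cv ** 2 + 1) * m_bar - 1) * icc


def effective_sample_size(cluster_sizes, icc):
    """Number of independent observations the clustered sample is worth."""
    return sum(cluster_sizes) / design_effect(cluster_sizes, icc)


# Hypothetical trial arm: ten clusters of 20 participants each.
before_merge = [20] * 10
# Two of those clusters merge into one cluster of 40; total participants unchanged.
after_merge = [40] + [20] * 8

icc = 0.05  # illustrative value, held fixed across both scenarios
ess_before = effective_sample_size(before_merge, icc)
ess_after = effective_sample_size(after_merge, icc)
print(f"effective sample size before merge: {ess_before:.1f}")
print(f"effective sample size after merge:  {ess_after:.1f}")
```

    Holding the ICC fixed is a simplification: the abstract notes that for homogeneous merges the observed ICC changes and partly offsets the loss, so this sketch only shows the direction of the size-variability effect.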

  1. Semi-supervised consensus clustering for gene expression data analysis


    Wang, Yunli; Pan, Youlian


    Background Simple clustering methods such as hierarchical clustering and k-means are widely used for gene expression data analysis; but they are unable to deal with noise and high dimensionality associated with the microarray gene expression data. Consensus clustering appears to improve the robustness and quality of clustering results. Incorporating prior knowledge in clustering process (semi-supervised clustering) has been shown to improve the consistency between the data partitioning and do...


    International Nuclear Information System (INIS)

    Maoz, Dan; Sharon, Keren; Gal-Yam, Avishay


    Knowledge of the supernova (SN) delay time distribution (DTD), the SN rate versus time that would follow a hypothetical brief burst of star formation, can shed light on SN progenitors and physics, as well as on the timescales of chemical enrichment in different environments. We compile recent measurements of the Type-Ia SN (SN Ia) rate in galaxy clusters at redshifts from z = 0 out to z = 1.45, just 2 Gyr after cluster star formation at z ∼ 3. We review the plausible range for the observed total iron-to-stellar mass ratio in clusters, based on the latest data and analyses, and use it to constrain the time-integrated number of SN Ia events in clusters. With these data, we recover the DTD of SNe Ia in cluster environments. The DTD is sharply peaked at the shortest time-delay interval we probe, 0 [...] Gyr [...] t^(-1.2±0.3) from t = 400 Myr to a Hubble time can satisfy both constraints. Shallower power laws such as t^(-1/2) cannot, assuming a single DTD, and a single star formation burst (either brief or extended) at high z. This implies that 50%-85% of SNe Ia explode within 1 Gyr of star formation. DTDs from double-degenerate (DD) models, which generically have ∼t^(-1) shapes over a wide range of timescales, match the data, but only if their predictions are scaled up by factors of 5-10. Single-degenerate (SD) DTDs always give poor fits to the data, due to a lack of delayed SNe and overall low numbers of SNe. The observations can also be reproduced with a combination of two SN Ia populations: a prompt SD population of SNe Ia that explodes within a few Gyr of star formation, and produces about 60% of the iron mass in clusters, and a DD population that contributes the events seen at z < 1.5. An alternative scenario of a single, prompt, SN Ia population, but a composite star formation history in clusters, consisting of a burst at high z, followed by a constant star formation rate, can reproduce the SN rates, but is at odds with direct measurements of star formation in clusters at 0 < z

  3. Cluster-enriched Yang-Baxter equation from SUSY gauge theories (United States)

    Yamazaki, Masahito


    We propose a new generalization of the Yang-Baxter equation, where the R-matrix depends on cluster y-variables in addition to the spectral parameters. We point out that we can construct solutions to this new equation from the recently found correspondence between Yang-Baxter equations and supersymmetric gauge theories. The S^2 partition function of a certain 2d N=(2,2) quiver gauge theory gives an R-matrix, whereas its FI parameters can be identified with the cluster y-variables.

  4. Preliminary uranium enrichment analysis results using cadmium zinc telluride detectors

    International Nuclear Information System (INIS)

    Lavietes, A.D.; McQuaid, J.H.; Paulus, T.J.


    Lawrence Livermore National Laboratory (LLNL) and EG&G ORTEC have jointly developed a portable ambient-temperature detection system that can be used in a number of application scenarios. The detection system uses a planar cadmium zinc telluride (CZT) detector with custom-designed detector support electronics developed at LLNL and is based on the recently released MicroNOMAD multichannel analyzer (MCA) produced by ORTEC. Spectral analysis is performed using software developed at LLNL that was originally designed for use with high-purity germanium (HPGe) detector systems. In one application, the CZT detection system determines uranium enrichments ranging from less than 3% to over 75% to within accuracies of 20%. The analysis was performed using sample sizes of 200 g or larger and acquisition times of 30 min. The authors have demonstrated the capabilities of this system by analyzing the spectra gathered by the CZT detection system from uranium sources of several enrichments. These experiments demonstrate that current CZT detectors can, in some cases, approach performance criteria that were previously the exclusive domain of larger HPGe detector systems.

  5. Globular Cluster Formation at High Density: A Model for Elemental Enrichment with Fast Recycling of Massive-star Debris

    Energy Technology Data Exchange (ETDEWEB)

    Elmegreen, Bruce G. [IBM Research Division, T.J. Watson Research Center, 1101 Kitchawan Road, Yorktown Heights, NY 10598 (United States)]


    The self-enrichment of massive star clusters by p-processed elements is shown to increase significantly with increasing gas density as a result of enhanced star formation rates and stellar scatterings compared to the lifetime of a massive star. Considering the type of cloud core where a globular cluster (GC) might have formed, we follow the evolution and enrichment of the gas and the time dependence of stellar mass. A key assumption is that interactions between massive stars are important at high density, including interactions between massive stars and massive-star binaries that can shred stellar envelopes. Massive-star interactions should also scatter low-mass stars out of the cluster. Reasonable agreement with the observations is obtained for a cloud-core mass of ∼4 × 10^6 M⊙ and a density of ∼2 × 10^6 cm^−3. The results depend primarily on a few dimensionless parameters, including, most importantly, the ratio of the gas consumption time to the lifetime of a massive star, which has to be low, ∼10%, and the efficiency of scattering low-mass stars per unit dynamical time, which has to be relatively large, such as a few percent. Also for these conditions, the velocity dispersions of embedded GCs should be comparable to the high gas dispersions of galaxies at that time, so that stellar ejection by multistar interactions could cause low-mass stars to leave a dwarf galaxy host altogether. This could solve the problem of missing first-generation stars in the halos of Fornax and WLM.


    Directory of Open Access Journals (Sweden)

    Jana Halčinová


    The aim of the present article is to show the possibility of using the methods of cluster analysis in the classification of stocks of finished products. Cluster analysis creates groups (clusters) of finished products according to similarity in demand, i.e. customer requirements for each product. The manner of sorting stocks of finished products by clusters is described with a practical example. The resulting clusters are incorporated into the draft layout of the distribution warehouse.

  7. Advanced analysis of forest fire clustering (United States)

    Kanevski, Mikhail; Pereira, Mario; Golay, Jean


    Analysis of point pattern clustering is an important topic in spatial statistics and for many applications: biodiversity, epidemiology, natural hazards, geomarketing, etc. There are several fundamental approaches used to quantify spatial data clustering using topological, statistical and fractal measures. In the present research, the recently introduced multi-point Morisita index (mMI) is applied to study the spatial clustering of forest fires in Portugal. The data set consists of more than 30000 fire events covering the time period from 1975 to 2013. The distribution of forest fires is very complex and highly variable in space. mMI is a multi-point extension of the classical two-point Morisita index. In essence, mMI is estimated by covering the region under study by a grid and by computing how many times more likely it is that m points selected at random will be from the same grid cell than it would be in the case of a complete random Poisson process. By changing the number of grid cells (size of the grid cells), mMI characterizes the scaling properties of spatial clustering. From mMI, the data intrinsic dimension (fractal dimension) of the point distribution can be estimated as well. In this study, the mMI of forest fires is compared with the mMI of random patterns (RPs) generated within the validity domain defined as the forest area of Portugal. It turns out that the forest fires are highly clustered inside the validity domain in comparison with the RPs. Moreover, they demonstrate different scaling properties at different spatial scales. The results obtained from the mMI analysis are also compared with those of fractal measures of clustering - box counting and sand box counting approaches. REFERENCES Golay J., Kanevski M., Vega Orozco C., Leuenberger M., 2014: The multipoint Morisita index for the analysis of spatial patterns. Physica A, 406, 191-202. Golay J., Kanevski M. 2015: A new estimator of intrinsic dimension based on the multipoint Morisita index

  8. Cluster Analysis in Rapeseed (Brassica Napus L.)

    International Nuclear Information System (INIS)

    Mahasi, J.M


    With a widening edible oil deficit, Kenya has become increasingly dependent on imported edible oils. Many oilseed crops (e.g. sunflower, soya beans, rapeseed/mustard, sesame, groundnuts etc.) can be grown in Kenya, but oilseed rape is preferred because it is very high yielding (1.5-4.0 tons/ha) with an oil content of 42-46%. Other uses include fitting into various cropping systems as relay/intercrops, rotational crops, trap crops and fodder. It is soft seeded, hence oil extraction is relatively easy. The meal is high in protein and very useful in livestock supplementation. Rapeseed can be straight combined using adjusted wheat combines. The priority is to expand domestic oilseed production, hence the need to introduce improved rapeseed germplasm from other countries. The success of any crop improvement programme depends on the extent of genetic diversity in the material. Hence, it is essential to understand the adaptation of introduced genotypes and the similarities, if any, among them. Evaluation trials were carried out on 17 rapeseed genotypes (nine of Canadian origin and eight of European origin) grown at 4 locations, namely Endebess, Njoro, Timau and Mau Narok, in three years (1992, 1993 and 1994). Results for 1993 were discarded due to severe drought. An analysis of variance was carried out only on seed yields and the treatments were found to be significantly different. Cluster analysis was then carried out on mean seed yields and, based on this analysis, only one major group exists within the material. In 1992, varieties 2, 3, 8 and 9 didn't fall in the same cluster as the rest. Variety 8 was the only one not classified with the rest of the Canadian varieties. Three European varieties (2, 3 and 9) were however not classified with the others. In 1994, varieties 10 and 6 didn't fall in the major cluster. Of these two, variety 10 is of Canadian origin. Varieties were more similar in 1994 than 1992 due to favorable weather. It is evident that genotypes from different geographical

  9. Tweets clustering using latent semantic analysis (United States)

    Rasidi, Norsuhaili Mahamed; Bakar, Sakhinah Abu; Razak, Fatimah Abdul


    Social media are becoming overloaded with information due to the increasing number of information feeds. Unlike other social media, Twitter allows users to broadcast a short message called a "tweet". In this study, we extract tweets related to MH370 over a certain period of time. In this paper, we present an overview of our approach to tweet clustering for analyzing users' responses to the tragedy of MH370. The tweets were clustered based on the frequency of terms obtained from the classification process. The method we used for the text classification is Latent Semantic Analysis. As a result, there are two types of tweets responding to the MH370 tragedy: emotional and non-emotional. We show some of our initial results to demonstrate the effectiveness of our approach.

  10. CytoCluster: A Cytoscape Plugin for Cluster Analysis and Visualization of Biological Networks. (United States)

    Li, Min; Li, Dongyan; Tang, Yu; Wu, Fangxiang; Wang, Jianxin


    Nowadays, cluster analysis of biological networks has become one of the most important approaches to identifying functional modules as well as predicting protein complexes and network biomarkers. Furthermore, the visualization of clustering results is crucial to display the structure of biological networks. Here we present CytoCluster, a cytoscape plugin integrating six clustering algorithms, HC-PIN (Hierarchical Clustering algorithm in Protein Interaction Networks), OH-PIN (identifying Overlapping and Hierarchical modules in Protein Interaction Networks), IPCA (Identifying Protein Complex Algorithm), ClusterONE (Clustering with Overlapping Neighborhood Expansion), DCU (Detecting Complexes based on Uncertain graph model), IPC-MCE (Identifying Protein Complexes based on Maximal Complex Extension), and BinGO (the Biological networks Gene Ontology) function. Users can select different clustering algorithms according to their requirements. The main function of these six clustering algorithms is to detect protein complexes or functional modules. In addition, BinGO is used to determine which Gene Ontology (GO) categories are statistically overrepresented in a set of genes or a subgraph of a biological network. CytoCluster can be easily expanded, so that more clustering algorithms and functions can be added to this plugin. Since it was created in July 2013, CytoCluster has been downloaded more than 9700 times in the Cytoscape App store and has already been applied to the analysis of different biological networks. CytoCluster is available from

  11. Multisource Images Analysis Using Collaborative Clustering

    Directory of Open Access Journals (Sweden)

    Pierre Gançarski


    The development of very high-resolution (VHR) satellite imagery has produced a huge amount of data. The multiplication of satellites which embed different types of sensors provides a lot of heterogeneous images. Consequently, the image analyst often has many different images available, representing the same area of the Earth's surface. These images can be from different dates, produced by different sensors, or even at different resolutions. The lack of machine learning tools that use all these representations in one overall process constrains the analyst to a sequential analysis of these various images. In order to use all the information available simultaneously, we propose a framework where different algorithms can use different views of the scene. Each one works on a different remotely sensed image and, thus, produces different and useful information. These algorithms work together in a collaborative way through an automatic and mutual refinement of their results, so that all the results have almost the same number of clusters, which are statistically similar. Finally, a unique result is produced, representing a consensus among the information obtained by each clustering method on its own image. The unified result and the complementarity of the single results (i.e., the agreement between the clustering methods as well as the disagreement) lead to a better understanding of the scene. The experiments carried out on multispectral remote sensing images have shown that this method is efficient at extracting relevant information and improving scene understanding.

  12. Constructing storyboards based on hierarchical clustering analysis (United States)

    Hasebe, Satoshi; Sami, Mustafa M.; Muramatsu, Shogo; Kikuchi, Hisakazu


    There is a growing need for quick previews of video content, both to improve the accessibility of video archives and to reduce network traffic. In this paper, a storyboard that contains a user-specified number of keyframes is produced from a given video sequence. It is based on hierarchical cluster analysis of feature vectors that are derived from wavelet coefficients of video frames. Consistent use of extracted feature vectors is the key to avoiding repetition of computationally intensive parsing of the same video sequence. Experimental results suggest that a significant reduction in computational time is gained by this strategy.
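
    The idea of deriving a fixed number of keyframes from per-frame feature vectors can be sketched as follows. The abstract does not state the linkage rule or how a keyframe is picked within a cluster, so this sketch assumes average linkage and takes the frame closest to each cluster centroid; the function names are hypothetical, and the short 2-D vectors stand in for wavelet-derived features.

```python
from math import dist


def average_linkage(a, b, features):
    """Mean pairwise Euclidean distance between two clusters of frame indices."""
    return sum(dist(features[i], features[j]) for i in a for j in b) / (len(a) * len(b))


def pick_keyframes(features, n_keyframes):
    """Agglomerate frames until n_keyframes clusters remain, one keyframe each."""
    clusters = [[i] for i in range(len(features))]  # start with one frame per cluster

    while len(clusters) > n_keyframes:
        # Find the closest pair of clusters under average linkage.
        _, p, q = min(
            (average_linkage(clusters[p], clusters[q], features), p, q)
            for p in range(len(clusters))
            for q in range(p + 1, len(clusters))
        )
        merged = clusters.pop(q)  # q > p, so index p is still valid after the pop
        clusters[p].extend(merged)

    keyframes = []
    for members in clusters:
        # Centroid of the cluster, computed component by component.
        centroid = [sum(col) / len(members) for col in zip(*(features[i] for i in members))]
        # Representative frame: the member closest to the centroid.
        keyframes.append(min(members, key=lambda i: dist(features[i], centroid)))
    return sorted(keyframes)


# Seven toy frames forming three visually distinct shots.
frame_features = [
    [0.0, 0.0], [0.1, 0.0], [0.2, 0.0],   # shot 1
    [5.0, 5.0], [5.1, 5.0], [5.2, 5.0],   # shot 2
    [9.0, 0.0],                           # shot 3 (single frame)
]
print(pick_keyframes(frame_features, n_keyframes=3))
```

    Because the feature vectors are computed once and only reused here, asking for a different number of keyframes means rerunning the merge loop, not reparsing the video, which is the saving the abstract points to.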

  13. Cluster Analysis of Maize Inbred Lines

    Directory of Open Access Journals (Sweden)

    Jiban Shrestha


    The determination of diversity among inbred lines is important for heterosis breeding. Sixty maize inbred lines were evaluated for eight agro-morphological traits during the winter season of 2011 to analyze their genetic diversity. Clustering was done by the average linkage method. The inbred lines were grouped into six clusters. Inbred lines grouped into cluster II had taller plants with the maximum number of leaves. Cluster III was characterized by shorter plants with the minimum number of leaves. The inbred lines categorized into cluster V flowered early, whereas those in cluster VI flowered late. The inbred lines grouped into cluster III were characterized by a higher value of anthesis silking interval (ASI) and those of cluster VI had a lower value of ASI. These results showed that inbred lines from widely divergent clusters can be utilized in hybrid breeding programmes.

  14. Designing and analysis study of uranium enrichment with gas centrifuge

    International Nuclear Information System (INIS)

    Tsunetoshi Kai


    This note concerns a design and analysis study of uranium enrichment with a gas centrifuge. First, a one-dimensional model is presented and a conventional analytical method is applied to grasp the general idea of centrifuge performance. Second, a two-dimensional numerical method is adopted to describe the diffusion phenomena under the assumption of simple flow patterns. Parametric surveys are made on the dimensions of a centrifuge rotor, the gas feed, withdrawal and circulation system, and operation variables such as feed flow rate, cut and so on. Third, full numerical solutions are obtained for the flow and diffusion equations in static state, using a modified version of the Newton method without neglecting any non-linear term. The numerical results are compared with the experimental data obtained by Beams et al. and Zippe, and found to be in good agreement. Further, the theoretical pressure and separative power are compared respectively with experimental ones on a comparatively recent centrifuge. The results reveal that the characteristics of the separation performance of a centrifuge can be fully described by the present method. Some inevitable problems are tackled regarding UF6 gas isotope separation by centrifugation. To examine the influence of the extraneous light gas, the diffusion equations for a ternary mixture are solved, and the flow field of a binary mixture with a large mass difference is obtained by simultaneously solving the Navier-Stokes equations and the diffusion equation for the binary case. Since the gas in the interior region of the rotor is so rarefied that the Navier-Stokes equations cease to be valid, the Burnett equations are solved for gas flow in a rotating cylinder. Considering that the uranium recovered at a reprocessing plant includes 236U besides 235U and 238U, the concentration distributions of the ternary gas isotopes are determined and a value function is defined for the evaluation of separative work for the multi-component mixture.
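
    As background for the terms "value function" and "separative work" used above, the standard two-component case can be sketched as follows. This is the textbook binary formulation, V(x) = (2x - 1) ln(x / (1 - x)), and not the multi-component value function the note defines, which the abstract does not spell out; the function names and the assay values in the example are illustrative assumptions.

```python
from math import log


def value_function(x):
    """Standard binary value function for an isotope fraction x (0 < x < 1)."""
    return (2 * x - 1) * log(x / (1 - x))


def separative_work(product_kg, x_product, x_feed, x_tails):
    """Separative work (in kg SWU) to make product_kg at assay x_product."""
    # Mass balance on total uranium and on the light isotope fixes feed and tails.
    feed_kg = product_kg * (x_product - x_tails) / (x_feed - x_tails)
    tails_kg = feed_kg - product_kg
    return (
        product_kg * value_function(x_product)
        + tails_kg * value_function(x_tails)
        - feed_kg * value_function(x_feed)
    )


# Illustrative assays: natural feed, 5% product, 0.3% tails.
swu = separative_work(product_kg=1.0, x_product=0.05, x_feed=0.00711, x_tails=0.003)
print(f"separative work per kg of product: {swu:.2f} SWU")
```

    The binary function depends on a single fraction, which is why a mixture containing 236U alongside 235U and 238U needs the generalized value function mentioned in the note.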

  15. Gag induces the coalescence of clustered lipid rafts and tetraspanin-enriched microdomains at HIV-1 assembly sites on the plasma membrane. (United States)

    Hogue, Ian B; Grover, Jonathan R; Soheilian, Ferri; Nagashima, Kunio; Ono, Akira


    The HIV-1 structural protein Gag associates with two types of plasma membrane microdomains, lipid rafts and tetraspanin-enriched microdomains (TEMs), both of which have been proposed to be platforms for HIV-1 assembly. However, a variety of studies have demonstrated that lipid rafts and TEMs are distinct microdomains in the absence of HIV-1 infection. To measure the impact of Gag on microdomain behaviors, we took advantage of two assays: an antibody-mediated copatching assay and a Förster resonance energy transfer (FRET) assay that measures the clustering of microdomain markers in live cells without antibody-mediated patching. We found that lipid rafts and TEMs copatched and clustered to a greater extent in the presence of membrane-bound Gag in both assays, suggesting that Gag induces the coalescence of lipid rafts and TEMs. Substitutions in membrane binding motifs of Gag revealed that, while Gag membrane binding is necessary to induce coalescence of lipid rafts and TEMs, either acylation of Gag or binding of phosphatidylinositol-(4,5)-bisphosphate is sufficient. Finally, a Gag derivative that is defective in inducing membrane curvature appeared less able to induce lipid raft and TEM coalescence. A higher-resolution analysis of assembly sites by correlative fluorescence and scanning electron microscopy showed that coalescence of clustered lipid rafts and TEMs occurs predominantly at completed cell surface virus-like particles, whereas a transmembrane raft marker protein appeared to associate with punctate Gag fluorescence even in the absence of cell surface particles. Together, these results suggest that different membrane microdomain components are recruited in a stepwise manner during assembly.

  16. Gag Induces the Coalescence of Clustered Lipid Rafts and Tetraspanin-Enriched Microdomains at HIV-1 Assembly Sites on the Plasma Membrane (United States)

    Hogue, Ian B.; Grover, Jonathan R.; Soheilian, Ferri; Nagashima, Kunio; Ono, Akira


    The HIV-1 structural protein Gag associates with two types of plasma membrane microdomains, lipid rafts and tetraspanin-enriched microdomains (TEMs), both of which have been proposed to be platforms for HIV-1 assembly. However, a variety of studies have demonstrated that lipid rafts and TEMs are distinct microdomains in the absence of HIV-1 infection. To measure the impact of Gag on microdomain behaviors, we took advantage of two assays: an antibody-mediated copatching assay and a Förster resonance energy transfer (FRET) assay that measures the clustering of microdomain markers in live cells without antibody-mediated patching. We found that lipid rafts and TEMs copatched and clustered to a greater extent in the presence of membrane-bound Gag in both assays, suggesting that Gag induces the coalescence of lipid rafts and TEMs. Substitutions in membrane binding motifs of Gag revealed that, while Gag membrane binding is necessary to induce coalescence of lipid rafts and TEMs, either acylation of Gag or binding of phosphatidylinositol-(4,5)-bisphosphate is sufficient. Finally, a Gag derivative that is defective in inducing membrane curvature appeared less able to induce lipid raft and TEM coalescence. A higher-resolution analysis of assembly sites by correlative fluorescence and scanning electron microscopy showed that coalescence of clustered lipid rafts and TEMs occurs predominately at completed cell surface virus-like particles, whereas a transmembrane raft marker protein appeared to associate with punctate Gag fluorescence even in the absence of cell surface particles. Together, these results suggest that different membrane microdomain components are recruited in a stepwise manner during assembly. PMID:21813604

  17. Using Star Clusters as Tracers of Star Formation and Chemical Evolution: The Chemical Enrichment History of the Large Magellanic Cloud (United States)

    Chilingarian, Igor V.; Asa’d, Randa


    The star formation (SFH) and chemical enrichment (CEH) histories of Local Group galaxies are traditionally studied by analyzing their resolved stellar populations in the form of color–magnitude diagrams obtained with the Hubble Space Telescope. Star clusters can be studied in integrated light using ground-based telescopes to much larger distances. They represent snapshots of the chemical evolution of their host galaxy at different ages. Here we present a simple theoretical framework for the chemical evolution based on the instantaneous recycling approximation (IRA) model. We infer a CEH from an SFH and vice versa using observational data. We also present a more advanced model for the evolution of individual chemical elements that takes into account the contribution of supernovae type Ia. We demonstrate that ages, iron, and α-element abundances of 15 star clusters derived from the fitting of their integrated optical spectra reliably trace the CEH of the Large Magellanic Cloud obtained from resolved stellar populations in the age range 40 Myr age–metallicity relation. Moreover, the present-day total gas mass of the LMC estimated by the IRA model ($6.2 \times 10^{8}\,M_{\odot}$) matches within uncertainties the observed H I mass corrected for the presence of molecular gas ($(5.8 \pm 0.5) \times 10^{8}\,M_{\odot}$). We briefly discuss how our approach can be used to study SFHs of galaxies as distant as 10 Mpc at the level of detail that is currently available only in a handful of nearby Milky Way satellites.

  18. Cluster analysis of word frequency dynamics (United States)

    Maslennikova, Yu S.; Bochkarev, V. V.; Belashova, I. A.


    This paper describes the analysis and modelling of word usage frequency time series. In a previous study, an assumption was put forward that all word usage frequencies have uniform dynamics approaching the shape of a Gaussian function. This assumption can be checked using the frequency dictionaries of the Google Books Ngram database. This database includes 5.2 million books published between 1500 and 2008. The corpus contains over 500 billion words in American English, British English, French, German, Spanish, Russian, Hebrew, and Chinese. We clustered time series of word usage frequencies using a Kohonen neural network. The similarity between input vectors was estimated using several algorithms. As a result of the neural network training procedure, more than ten different forms of time series were found. They describe the dynamics of word usage frequencies from birth to death of individual words. Different groups of word forms were found to have different dynamics of word usage frequency variations.

  19. Cluster analysis of word frequency dynamics

    International Nuclear Information System (INIS)

    Maslennikova, Yu S; Bochkarev, V V; Belashova, I A


    This paper describes the analysis and modelling of word usage frequency time series. In a previous study, an assumption was put forward that all word usage frequencies have uniform dynamics approaching the shape of a Gaussian function. This assumption can be checked using the frequency dictionaries of the Google Books Ngram database. This database includes 5.2 million books published between 1500 and 2008. The corpus contains over 500 billion words in American English, British English, French, German, Spanish, Russian, Hebrew, and Chinese. We clustered time series of word usage frequencies using a Kohonen neural network. The similarity between input vectors was estimated using several algorithms. As a result of the neural network training procedure, more than ten different forms of time series were found. They describe the dynamics of word usage frequencies from birth to death of individual words. Different groups of word forms were found to have different dynamics of word usage frequency variations.

  20. From virtual clustering analysis to self-consistent clustering analysis: a mathematical study (United States)

    Tang, Shaoqiang; Zhang, Lei; Liu, Wing Kam


    In this paper, we propose a new homogenization algorithm, virtual clustering analysis (VCA), and provide a mathematical framework for the recently proposed self-consistent clustering analysis (SCA) (Liu et al. in Comput Methods Appl Mech Eng 306:319-341, 2016). In the mathematical theory, we clarify the key assumptions and ideas of VCA and SCA, and derive the continuous and discrete Lippmann-Schwinger equations. Based on a key postulation of "once response similarly, always response similarly", clustering is performed in an offline stage by machine learning techniques (k-means and SOM), and facilitates substantial reduction of computational complexity in an online predictive stage. The clear mathematical setup allows for the first time a convergence study of clustering refinement in one space dimension. Convergence is proved rigorously, and found to be of second order from numerical investigations. Furthermore, we propose to suitably enlarge the domain in VCA, such that the boundary terms may be neglected in the Lippmann-Schwinger equation, by virtue of Saint-Venant's principle. In contrast, they were not obtained in the original SCA paper, and we discover these terms may well be responsible for the numerical dependency on the choice of reference material property. Since VCA enhances the accuracy by overcoming the modeling error, and reduces the numerical cost by avoiding an outer loop iteration for attaining the material property consistency in SCA, its efficiency is expected to be even higher than that of the recently proposed SCA algorithm.


    Energy Technology Data Exchange (ETDEWEB)

    Lim, Dongwook; Han, Sang-Il; Lee, Young-Wook; Roh, Dong-Goo [Center for Galaxy Evolution Research, Yonsei University, Seoul 120-749 (Korea, Republic of)]; Sohn, Young-Jong [Department of Astronomy, Yonsei University, Seoul 120-749 (Korea, Republic of)]; Chun, Sang-Hyun [Yonsei University Observatory, Seoul 120-749 (Korea, Republic of)]; Lee, Jae-Woo [Department of Astronomy and Space Science, Sejong University, Seoul 143-747 (Korea, Republic of)]; Johnson, Christian I. [Harvard-Smithsonian Center for Astrophysics, 60 Garden Street, MS-15, Cambridge, MA 02138 (United States)]


    There is increasing evidence for the presence of multiple red giant branches (RGBs) in the color-magnitude diagrams of massive globular clusters (GCs). In order to investigate the origin of this split on the RGB, we have performed new narrow-band Ca photometry and low-resolution spectroscopy for M22, NGC 1851, and NGC 288. We find significant differences (more than 4σ) in calcium abundance from the spectroscopic HK' index for M22 and NGC 1851. We also find more than 8σ differences in CN-band strength between the Ca-strong and Ca-weak subpopulations for these GCs. For NGC 288, however, a large difference is detected only in the CN strength. The calcium abundances of RGB stars in this GC are identical to within the errors. This is consistent with the conclusion from our new Ca photometry where the RGB splits are confirmed in M22 and NGC 1851, but not in NGC 288. We also find interesting differences in the CN-CH correlations among these GCs. While CN and CH are anti-correlated in NGC 288, they show a positive correlation in M22. NGC 1851, however, shows no difference in CH between the two groups of stars with different CN strengths. We suggest that all of these systematic differences would be best explained by how strongly Type II supernovae enrichment has contributed to the chemical evolution of these GCs.

  2. Clustering based gene expression feature selection method: A computational approach to enrich the classifier efficiency of differentially expressed genes

    KAUST Repository

    Abusamra, Heba


    The high-dimension, low-sample-size nature of gene expression data makes the classification task more challenging. Therefore, feature (gene) selection becomes an apparent need. Selecting meaningful and relevant genes for a classifier not only decreases the computational time and cost but also improves the classification performance. However, most existing feature selection methods suffer from several problems, such as lack of robustness and validation issues. Here, we present a new feature selection technique that takes advantage of clustering both samples and genes. Materials and methods: We used a leukemia gene expression dataset [1]. The effectiveness of the selected features was evaluated by four different classification methods: support vector machines, k-nearest neighbor, random forest, and linear discriminant analysis. The method evaluates the importance and relevance of each gene cluster by summing the expression levels of the genes that belong to the cluster. A gene cluster is considered important if it satisfies conditions that depend on thresholds and percentages; otherwise it is eliminated. Results: Initial analysis identified 7120 differentially expressed genes of leukemia (Fig. 15a); after applying our feature selection methodology, we ended up with 1117 specific genes discriminating two classes of leukemia (Fig. 15b). Further applying the same method with a more stringent higher positive and lower negative threshold condition reduced the number to 58 genes, which were tested to evaluate the effectiveness of the method (Fig. 15c). The results of the four classification methods are summarized in Table 11. Conclusions: The feature selection method gave good results with minimum classification error. Our heat-map result shows a distinct pattern of refined genes discriminating between two classes of leukemia.

  3. An analysis of hospital brand mark clusters. (United States)

    Vollmers, Stacy M; Miller, Darryl W; Kilic, Ozcan


    This study analyzed brand mark clusters (i.e., various types of brand marks displayed in combination) used by hospitals in the United States. The brand marks were assessed against several normative criteria for creating brand marks that are memorable and that elicit positive affect. Overall, results show a reasonably high level of adherence to many of these normative criteria. Many of the clusters exhibited pictorial elements that reflected benefits and that were conceptually consistent with the verbal content of the cluster. Also, many clusters featured icons that were balanced and moderately complex. However, only a few contained interactive imagery or taglines communicating benefits.

  4. Smartness and Italian Cities. A Cluster Analysis

    Directory of Open Access Journals (Sweden)

    Flavio Boscacci


    Smart cities have recently been recognized as the most pleasing and attractive places to live in; as a result, both scholars and policy-makers pay close attention to this topic. Specifically, urban “smartness” has been identified by many characteristics that can be grouped into six dimensions (Giffinger et al. 2007): smart Economy (competitiveness), smart People (social and human capital), smart Governance (participation), smart Mobility (both ICTs and transport), smart Environment (natural resources), and smart Living (quality of life). According to this analytical framework, the present paper investigates the relation between urban attractiveness and the “smart” characteristics in the 103 Italian NUTS3 province capitals in the year 2011. To this aim, descriptive statistics are followed by a regression analysis (OLS), where the dependent variable measuring urban attractiveness is proxied by housing market prices. In addition, a Cluster Analysis (CA) has been developed in order to find differences and commonalities among the province capitals. The OLS results indicate that living, people and economy are the key drivers for achieving better urban attractiveness. Environment, instead, continues to play a minor role. Besides, the CA groups the province capitals a

  5. Taxonomical analysis of the Cancer cluster of galaxies

    International Nuclear Information System (INIS)

    Perea, J.; Olmo, A. del; Moles, M.


    A description is presented of the Cancer cluster of galaxies, based on a taxonomical analysis in (α, δ, V_r) space. Earlier results by previous authors on the lack of dynamical entity of the cluster are confirmed. The present analysis points out the existence of a binary structure in the most populated region of the complex. (author)

  6. Using Cluster Analysis for Data Mining in Educational Technology Research (United States)

    Antonenko, Pavlo D.; Toy, Serkan; Niederhauser, Dale S.


    Cluster analysis is a group of statistical methods that has great potential for analyzing the vast amounts of web server-log data to understand student learning from hyperlinked information resources. In this methodological paper we provide an introduction to cluster analysis for educational technology researchers and illustrate its use through…

  7. Simultaneous Two-Way Clustering of Multiple Correspondence Analysis (United States)

    Hwang, Heungsun; Dillon, William R.


    A 2-way clustering approach to multiple correspondence analysis is proposed to account for cluster-level heterogeneity of both respondents and variable categories in multivariate categorical data. Specifically, in the proposed method, multiple correspondence analysis is combined with k-means in a unified framework in which "k"-means is…

  8. Redefining the Breast Cancer Exosome Proteome by Tandem Mass Tag Quantitative Proteomics and Multivariate Cluster Analysis. (United States)

    Clark, David J; Fondrie, William E; Liao, Zhongping; Hanson, Phyllis I; Fulton, Amy; Mao, Li; Yang, Austin J


    Exosomes are microvesicles of endocytic origin constitutively released by multiple cell types into the extracellular environment. With evidence that exosomes can be detected in the blood of patients with various malignancies, the development of a platform that uses exosomes as a diagnostic tool has been proposed. However, it has been difficult to truly define the exosome proteome due to the challenge of discerning contaminant proteins that may be identified via mass spectrometry using various exosome enrichment strategies. To better define the exosome proteome in breast cancer, we combined a Tandem-Mass-Tag (TMT) quantitative proteomics approach with Support Vector Machine (SVM) cluster analysis of three conditioned media derived fractions corresponding to a 10 000g cellular debris pellet, a 100 000g crude exosome pellet, and an Optiprep enriched exosome pellet. The quantitative analysis identified 2 179 proteins in all three fractions, with known exosomal cargo proteins displaying at least a 2-fold enrichment in the exosome fraction based on the TMT protein ratios. Employing SVM cluster analysis allowed for the classification of 251 proteins as "true" exosomal cargo proteins. This study provides a robust and vigorous framework for the future development of using exosomes as a potential multiprotein marker phenotyping tool that could be useful in breast cancer diagnosis and monitoring disease progression.

  9. Cluster analysis of activity-time series in motor learning

    DEFF Research Database (Denmark)

    Balslev, Daniela; Nielsen, Finn Å; Futiger, Sally A


    Neuroimaging studies of learning focus on brain areas where the activity changes as a function of time. To circumvent the difficult problem of model selection, we used a data-driven analytic tool, cluster analysis, which extracts representative temporal and spatial patterns from the voxel......-time series. The optimal number of clusters was chosen using a cross-validated likelihood method, which highlights the clustering pattern that generalizes best over the subjects. Data were acquired with PET at different time points during practice of a visuomotor task. The results from cluster analysis show...

  10. Development for analysis system of rods enrichment of nuclear fuels

    International Nuclear Information System (INIS)

    Rojas C, E.L.


    The nuclear industry is strongly regulated all over the world, and quality assurance is important in every nuclear installation or process related to it. Nuclear fuel manufacture is no exception. ININ was committed to manufacture four nuclear fuel bundles for the CFE nuclear power station at Laguna Verde, Veracruz, under General Electric specifications and fulfilling all the requirements of this industry. One of the quality control requisites in nuclear fuel manufacture deals with the enrichment of the pellets inside the fuel bundle rods. To achieve the quality demanded in this aspect, the system described in this work was developed. With this system, developed at ININ, it is possible to detect enrichment spikes from 0.4 % upward in a column of pellets with a 95 % confidence interval and to identify enrichment differences greater than 0.2 % e between homogeneous segments, also with a 95 % confidence interval. ININ delivered the four nuclear fuel bundles to CFE and these were introduced in the core of the nuclear reactor of Unit 1 in the fifth cycle. They are now producing energy and have shown correct mechanical performance and neutronic behavior. (Author)

  11. Two-Way Regularized Fuzzy Clustering of Multiple Correspondence Analysis. (United States)

    Kim, Sunmee; Choi, Ji Yeh; Hwang, Heungsun


    Multiple correspondence analysis (MCA) is a useful tool for investigating the interrelationships among dummy-coded categorical variables. MCA has been combined with clustering methods to examine whether there exist heterogeneous subclusters of a population, which exhibit cluster-level heterogeneity. These combined approaches aim to classify either observations only (one-way clustering of MCA) or both observations and variable categories (two-way clustering of MCA). The latter approach is favored because its solutions are easier to interpret by providing explicitly which subgroup of observations is associated with which subset of variable categories. Nonetheless, the two-way approach has been built on hard classification that assumes observations and/or variable categories to belong to only one cluster. To relax this assumption, we propose two-way fuzzy clustering of MCA. Specifically, we combine MCA with fuzzy k-means simultaneously to classify a subgroup of observations and a subset of variable categories into a common cluster, while allowing both observations and variable categories to belong partially to multiple clusters. Importantly, we adopt regularized fuzzy k-means, thereby enabling us to decide the degree of fuzziness in cluster memberships automatically. We evaluate the performance of the proposed approach through the analysis of simulated and real data, in comparison with existing two-way clustering approaches.

  12. The smart cluster method. Adaptive earthquake cluster identification and analysis in strong seismic regions (United States)

    Schaefer, Andreas M.; Daniell, James E.; Wenzel, Friedemann


    Earthquake clustering is an essential part of almost any statistical analysis of spatial and temporal properties of seismic activity. The nature of earthquake clusters and subsequent declustering of earthquake catalogues plays a crucial role in determining the magnitude-dependent earthquake return period and its respective spatial variation for probabilistic seismic hazard assessment. This study introduces the Smart Cluster Method (SCM), a new methodology to identify earthquake clusters, which uses an adaptive point process for spatio-temporal cluster identification. It utilises the magnitude-dependent spatio-temporal earthquake density to adjust the search properties, subsequently analyses the identified clusters to determine directional variation and adjusts its search space with respect to directional properties. In the case of rapid subsequent ruptures like the 1992 Landers sequence or the 2010-2011 Darfield-Christchurch sequence, a reclassification procedure is applied to disassemble subsequent ruptures using near-field searches, nearest neighbour classification and temporal splitting. The method is capable of identifying and classifying earthquake clusters in space and time. It has been tested and validated using earthquake data from California and New Zealand. A total of more than 1500 clusters have been found in both regions since 1980 with $M_{\min} = 2.0$. Utilising the knowledge of cluster classification, the method has been adjusted to provide an earthquake declustering algorithm, which has been compared to existing methods. Its performance is comparable to established methodologies. The analysis of earthquake clustering statistics led to various new and updated correlation functions, e.g. for ratios between mainshock and strongest aftershock and general aftershock activity metrics.

  13. Allergen Sensitization Pattern by Sex: A Cluster Analysis in Korea. (United States)

    Ohn, Jungyoon; Paik, Seung Hwan; Doh, Eun Jin; Park, Hyun-Sun; Yoon, Hyun-Sun; Cho, Soyun


    Allergens tend to sensitize simultaneously. The etiology of this phenomenon has been suggested to be allergen cross-reactivity or concurrent exposure. However, little is known about specific allergen sensitization patterns. The aim of this study was to investigate the allergen sensitization characteristics according to gender. The multiple allergen simultaneous test (MAST) is widely used as a screening tool for detecting allergen sensitization in dermatologic clinics. We retrospectively reviewed the medical records of patients with MAST results between 2008 and 2014 in our Department of Dermatology. A cluster analysis was performed to elucidate the allergen-specific immunoglobulin (Ig)E cluster pattern. The results of MAST (39 allergen-specific IgEs) from 4,360 cases were analyzed. By cluster analysis, the 39 items were grouped into 8 clusters. Each cluster had characteristic features. Compared with the female group, the male group tended to be sensitized more frequently to all tested allergens, except for the fungus allergen cluster. The cluster and comparative analysis results demonstrate that allergen sensitization is clustered, manifesting allergen similarity or co-exposure. Only the fungus cluster allergens tended to sensitize the female group more frequently than the male group.

  14. Knowledge Enrichment Analysis for Human Tissue-Specific Genes Uncover New Biological Insights

    Directory of Open Access Journals (Sweden)

    Gong Xiu-Jun


    The expression and regulation of genes in different tissues are fundamental questions to be answered in biology. Knowledge enrichment analysis for tissue-specific (TS) and housekeeping (HK) genes may help identify their roles in biological processes or diseases and yield new biological insights. In this paper, we performed knowledge enrichment analysis for 17,343 genes in 84 human tissues using Gene Set Enrichment Analysis (GSEA) and Hypergeometric Analysis (HA) against three biological ontologies: Gene Ontology (GO), KEGG pathways and Disease Ontology (DO), respectively. The analysis results demonstrated that the functions of most gene groups are consistent with their tissue origins. Meanwhile, three interesting new associations for HK genes and the skeletal muscle tissue genes were found. Firstly, hypergeometric analysis against the KEGG database for HK genes disclosed that three disease terms (Parkinson’s disease, Huntington’s disease, Alzheimer’s disease) are intensively enriched. Secondly, hypergeometric analysis against the KEGG database for skeletal muscle tissue genes shows that two cardiac diseases, “Hypertrophic cardiomyopathy (HCM)” and “Arrhythmogenic right ventricular cardiomyopathy (ARVC)”, are heavily enriched, although they are considered to have no relationship with skeletal functions. Thirdly, “Prostate cancer” is intensively enriched in hypergeometric analysis against the Disease Ontology (DO) for the skeletal muscle tissue genes, which is a highly unexpected phenomenon.
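
    The hypergeometric analysis named in this abstract is commonly implemented as a one-sided over-representation test. The sketch below shows that generic test only; it is not the authors' pipeline, and the function name and the counts in the usage lines (term size, gene-list size, overlap) are hypothetical. Only the background size of 17,343 genes comes from the abstract.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, overlap):
    """One-sided enrichment p-value P(X >= overlap).

    population: number of background genes
    annotated:  background genes annotated with the term (e.g. a KEGG pathway)
    sample:     size of the gene list being tested (e.g. a tissue-specific set)
    overlap:    genes in the list that carry the annotation
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the hypergeometric probability mass from `overlap` up to the maximum.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: a 120-gene term, a 300-gene list, 12 genes shared.
p_value = hypergeom_enrichment_p(17343, 120, 300, 12)
print(f"enrichment p-value: {p_value:.3g}")
```

    In practice the p-values for all tested terms would also be corrected for multiple testing; the abstract does not say which correction, if any, was applied.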

  15. Identification and functional analysis of endothelial tip cell-enriched genes. (United States)

    del Toro, Raquel; Prahst, Claudia; Mathivet, Thomas; Siegfried, Geraldine; Kaminker, Joshua S; Larrivee, Bruno; Breant, Christiane; Duarte, Antonio; Takakura, Nobuyuki; Fukamizu, Akiyoshi; Penninger, Josef; Eichmann, Anne


    Sprouting of developing blood vessels is mediated by specialized motile endothelial cells localized at the tips of growing capillaries. Following behind the tip cells, endothelial stalk cells form the capillary lumen and proliferate. Expression of the Notch ligand Delta-like-4 (Dll4) in tip cells suppresses tip cell fate in neighboring stalk cells via Notch signaling. In DLL4(+/-) mouse mutants, most retinal endothelial cells display morphologic features of tip cells. We hypothesized that these mouse mutants could be used to isolate tip cells and so to determine their genetic repertoire. Using transcriptome analysis of retinal endothelial cells isolated from DLL4(+/-) and wild-type mice, we identified 3 clusters of tip cell-enriched genes, encoding extracellular matrix degrading enzymes, basement membrane components, and secreted molecules. Secreted molecules endothelial-specific molecule 1, angiopoietin 2, and apelin bind to cognate receptors on endothelial stalk cells. Knockout mice and zebrafish morpholino knockdown of apelin showed delayed angiogenesis and reduced proliferation of stalk cells expressing the apelin receptor APJ. Thus, tip cells may regulate angiogenesis via matrix remodeling, production of basement membrane, and release of secreted molecules, some of which regulate stalk cell behavior.

  16. Phytotoxicity and Plant Productivity Analysis of Tar-Enriched Biochars (United States)

    Keller, M. L.; Masiello, C. A.; Dugan, B.; Rudgers, J. A.; Capareda, S. C.


    Biochar is one of the three by-products obtained by the pyrolysis of organic material, the other two being syngas and bio-oil. The pyrolysis of biomass has generated a great amount of interest in recent years as all three by-products can be put toward beneficial uses. As part of a larger project designed to evaluate the hydrologic impact of biochar soil amendment, we generated a biochar through fast pyrolysis (less than 2 minutes) of sorghum stock at 600°C. In the initial biochar production run, the char bin was not purged with nitrogen. This inadvertent change in pyrolysis conditions produced a fast-pyrolysis biochar enriched with tars. We chose not to discard this batch, however, and instead used it to test the impact of tar-enriched biochars on plants. A suite of phytotoxicity tests were run to assess the effects of tar-rich biochar on plant germination and plant productivity. We designed the experiment to test for negative effects, using an organic carbon and nutrient-rich, greenhouse- optimized potting medium instead of soil. We used Black Seeded Simpson lettuce (Lactuca sativa) as the test organism. We found that even when tars are present within biochar, biochar amendment up to 10% by weight caused increased lettuce germination rates and increased biomass productivity. In this presentation, we will report the statistical significance of our germination and biomass data, as well as present preliminary data on how biochar amendment affects soil hydrologic properties.

  17. Clustering of users of digital libraries through log file analysis

    Directory of Open Access Journals (Sweden)

    Juan Antonio Martínez-Comeche


    This study analyzes how users perform information retrieval tasks when introducing queries to the Hispanic Digital Library. Clusters of users are differentiated based on their distinct information behavior. The study used the log files collected by the server over a year and different possible clustering algorithms are compared. The k-means algorithm is found to be a suitable clustering method for the analysis of large log files from digital libraries. In the case of the Hispanic Digital Library the results show three clusters of users and the characteristic information behavior of each group is described.
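
    As a rough illustration of the k-means step this abstract refers to, the sketch below runs Lloyd's algorithm on per-user feature vectors. The abstract does not list the features extracted from the log files, so the two used here (queries per session, mean query length in words) and all data values are hypothetical; the choice of three clusters mirrors the result reported above.

```python
import random


def kmeans(points, k, iterations=100, seed=0):
    """Plain Lloyd's k-means on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise from k distinct data points
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        new_centroids = [
            tuple(sum(dim) / len(members) for dim in zip(*members))
            if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
        if new_centroids == centroids:  # converged
            break
        centroids = new_centroids
    return centroids, clusters


# Hypothetical per-user features: (queries per session, mean query length).
users = [(1, 2), (2, 2), (1, 3), (8, 2), (9, 3), (8, 3), (4, 7), (5, 8), (4, 8)]
centroids, clusters = kmeans(users, k=3)
for centre, members in zip(centroids, clusters):
    print(centre, len(members))
```

    Lloyd's algorithm only finds a local optimum, so on real log data several random restarts (different `seed` values) would normally be compared.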

  18. Analysis of Network Clustering Algorithms and Cluster Quality Metrics at Scale. (United States)

    Emmons, Scott; Kobourov, Stephen; Gallant, Mike; Börner, Katy


    Notions of community quality underlie the clustering of networks. While studies surrounding network clustering are increasingly common, a precise understanding of the relationship between different cluster quality metrics is unknown. In this paper, we examine the relationship between stand-alone cluster quality metrics and information recovery metrics through a rigorous analysis of four widely-used network clustering algorithms: Louvain, Infomap, label propagation, and smart local moving. We consider the stand-alone quality metrics of modularity, conductance, and coverage, and we consider the information recovery metrics of adjusted Rand score, normalized mutual information, and a variant of normalized mutual information used in previous work. Our study includes both synthetic graphs and empirical data sets of sizes varying from 1,000 to 1,000,000 nodes. We find significant differences among the results of the different cluster quality metrics. For example, clustering algorithms can return a value of 0.4 out of 1 on modularity but score 0 out of 1 on information recovery. We find conductance, though imperfect, to be the stand-alone quality metric that best indicates performance on the information recovery metrics. Additionally, our study shows that the variant of normalized mutual information used in previous work cannot be assumed to differ only slightly from traditional normalized mutual information. Smart local moving is the overall best performing algorithm in our study, but discrepancies between cluster evaluation metrics prevent us from declaring it an absolutely superior algorithm. Interestingly, Louvain performed better than Infomap in nearly all the tests in our study, contradicting the results of previous work in which Infomap was superior to Louvain. We find that although label propagation performs poorly when clusters are less clearly defined, it scales efficiently and accurately to large graphs with well-defined clusters.
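
As a rough illustration of two of the stand-alone quality metrics named above, the sketch below computes modularity and conductance for an undirected, unweighted toy graph. The graph, the partition and the function names are assumptions made for illustration; the abstract does not describe how the study computed these metrics.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman modularity: sum over communities of L_c/m - (d_c / 2m)^2."""
    m = len(edges)
    degree = defaultdict(int)
    internal = defaultdict(int)  # edges with both ends in the same community
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    volume = defaultdict(int)  # summed degree per community
    for node, d in degree.items():
        volume[community[node]] += d
    return sum(internal[c] / m - (volume[c] / (2 * m)) ** 2 for c in volume)


def conductance(edges, nodes):
    """Edges leaving `nodes`, divided by the smaller side's total degree."""
    inside = set(nodes)
    cut = vol_in = vol_out = 0
    for u, v in edges:
        for end in (u, v):
            if end in inside:
                vol_in += 1
            else:
                vol_out += 1
        if (u in inside) != (v in inside):
            cut += 1
    return cut / min(vol_in, vol_out)


# Toy graph: two triangles joined by one bridge edge (2, 3).
EDGES = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
PARTITION = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}

print(modularity(EDGES, PARTITION))   # 5/14 for this toy graph
print(conductance(EDGES, [0, 1, 2]))  # 1/7 for this toy graph
```

Higher modularity and lower conductance both point to a better-separated community, which is why the two can be compared as quality indicators.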

  19. Pathway enrichment analysis approach based on topological structure and updated annotation of pathway. (United States)

    Yang, Qian; Wang, Shuyuan; Dai, Enyu; Zhou, Shunheng; Liu, Dianming; Liu, Haizhou; Meng, Qianqian; Jiang, Bin; Jiang, Wei


    Pathway enrichment analysis has been widely used to identify cancer risk pathways, and contributes to elucidating the mechanism of tumorigenesis. However, most of the existing approaches use the outdated pathway information and neglect the complex gene interactions in pathway. Here, we first reviewed the existing widely used pathway enrichment analysis approaches briefly, and then, we proposed a novel topology-based pathway enrichment analysis (TPEA) method, which integrated topological properties and global upstream/downstream positions of genes in pathways. We compared TPEA with four widely used pathway enrichment analysis tools, including database for annotation, visualization and integrated discovery (DAVID), gene set enrichment analysis (GSEA), centrality-based pathway enrichment (CePa) and signaling pathway impact analysis (SPIA), through analyzing six gene expression profiles of three tumor types (colorectal cancer, thyroid cancer and endometrial cancer). As a result, we identified several well-known cancer risk pathways that could not be obtained by the existing tools, and the results of TPEA were more stable than that of the other tools in analyzing different data sets of the same cancer. Ultimately, we developed an R package to implement TPEA, which could online update KEGG pathway information and is available at the Comprehensive R Archive Network (CRAN): © The Author 2017. Published by Oxford University Press. All rights reserved.



    Monika Raghuvanshi*, Rahul Patel


    In a forensic analysis, large numbers of files are examined. Much of the information is in unstructured format, so it is quite a difficult task for computer forensics to perform such analysis. That is why forensic analysis of documents within a limited period of time requires a special approach such as document clustering. This paper reviews different document clustering algorithms and methodologies, for example K-means, K-medoid, single link, complete link, and average link, in accordance...

  1. Merging Galaxy Clusters: Analysis of Simulated Analogs (United States)

    Nguyen, Jayke; Wittman, David; Cornell, Hunter


    The nature of dark matter can be better constrained by observing merging galaxy clusters. However, uncertainty in the viewing angle leads to uncertainty in dynamical quantities such as 3-d velocities, 3-d separations, and time since pericenter. The classic timing argument links these quantities via equations of motion, but neglects effects of nonzero impact parameter (i.e. it assumes velocities are parallel to the separation vector), dynamical friction, substructure, and larger-scale environment. We present a new approach using n-body cosmological simulations that naturally incorporate these effects. By uniformly sampling viewing angles about simulated cluster analogs, we see projected merger parameters in the many possible configurations of a given cluster. We select comparable simulated analogs and evaluate the likelihood of particular merger parameters as a function of viewing angle. We present viewing angle constraints for a sample of observed mergers including the Bullet cluster and El Gordo, and show that the separation vectors are closer to the plane of the sky than previously reported.

  2. Analysis of Aspects of Innovation in a Brazilian Cluster

    Directory of Open Access Journals (Sweden)

    Adriana Valélia Saraceni


    Full Text Available Innovation through clustering has become very important given the increased significance of interaction in the concepts of innovation and the learning process. This study aims to identify what a case analysis of the innovation process in a cluster contributes to the learning process. The study is developed in two stages. First, we used a preliminary case study examining a cluster innovation analysis and its Innovation Index, in order to explore a combined body of theory and practice. The second stage explores the learning process concept. Both stages allowed us to build a theoretical model for the development of the learning process in clusters. The main result of the model development is a mechanism for implementing improvements in clusters when case studies are applied.

  3. A clustering analysis of lipoprotein diameters in the metabolic syndrome

    Directory of Open Access Journals (Sweden)

    Frazier-Wood Alexis C


    Full Text Available Abstract Background: The presence of smaller low-density lipoproteins (LDL) has been associated with atherosclerosis risk, and the insulin resistance (IR) underlying the metabolic syndrome (MetS). In addition, some research has supported the association of very low-, low- and high-density lipoprotein (VLDL, LDL, HDL) particle diameters with components of the metabolic syndrome (MetS), although this has been the focus of less research. We aimed to explore the relationship of VLDL, LDL and HDL diameters to MetS and its features, and by clustering individuals by their diameters of VLDL, LDL and HDL particles, to capture information across all three fractions of lipoprotein into a unified phenotype. Methods: We used nuclear magnetic resonance spectroscopy measurements on fasting plasma samples from a general population sample of 1,036 adults (mean ± SD, 48.8 ± 16.2 y of age). Using latent class analysis, the sample was grouped by the diameter of their fasting lipoproteins, and mixed effects models tested whether the distribution of MetS components varied across the groups. Results: Eight discrete groups were identified. Two groups (N = 251) were enriched with individuals meeting criteria for the MetS, and were characterized by the smallest LDL/HDL diameters. Of those two groups, one was additionally distinguished by large VLDL, and had significantly higher blood pressure, fasting glucose, triglycerides, and waist circumference (WC; P Conclusions: While small LDL diameters remain associated with IR and the MetS, the occurrence of these in conjunction with a shift to overall larger VLDL diameter may identify those with the highest fasting glucose, TG and WC within the MetS. If replicated, the association of this phenotype with more severe IR-features indicates that it may contribute to identifying those most at risk for incident type II diabetes and cardiometabolic disease.

  4. A Flocking Based algorithm for Document Clustering Analysis

    Energy Technology Data Exchange (ETDEWEB)

    Cui, Xiaohui [ORNL]; Gao, Jinzhu [ORNL]; Potok, Thomas E [ORNL]


    Social animals or insects in nature often exhibit a form of emergent collective behavior known as flocking. In this paper, we present a novel Flocking based approach for document clustering analysis. Our Flocking clustering algorithm uses stochastic and heuristic principles discovered from observing bird flocks or fish schools. Unlike other partition clustering algorithms such as K-means, the Flocking based algorithm does not require initial partitional seeds. The algorithm generates a clustering of a given set of data through the embedding of the high-dimensional data items on a two-dimensional grid for easy clustering result retrieval and visualization. Inspired by the self-organized behavior of bird flocks, we represent each document object with a flock boid. The simple local rules followed by each flock boid result in the entire document flock generating complex global behaviors, which eventually result in a clustering of the documents. We evaluate the efficiency of our algorithm with both a synthetic dataset and a real document collection that includes 100 news articles collected from the Internet. Our results show that the Flocking clustering algorithm achieves better performance compared to the K-means and the Ant clustering algorithm for real document clustering.

  5. Reproducibility of Cognitive Profiles in Psychosis Using Cluster Analysis. (United States)

    Lewandowski, Kathryn E; Baker, Justin T; McCarthy, Julie M; Norris, Lesley A; Öngür, Dost


    Cognitive dysfunction is a core symptom dimension that cuts across the psychoses. Recent findings support classification of patients along the cognitive dimension using cluster analysis; however, data-derived groupings may be highly determined by sampling characteristics and the measures used to derive the clusters, and so their interpretability must be established. We examined cognitive clusters in a cross-diagnostic sample of patients with psychosis and associations with clinical and functional outcomes. We then compared our findings to a previous report of cognitive clusters in a separate sample using a different cognitive battery. Participants with affective or non-affective psychosis (n=120) and healthy controls (n=31) were administered the MATRICS Consensus Cognitive Battery, and clinical and community functioning assessments. Cluster analyses were performed on cognitive variables, and clusters were compared on demographic, cognitive, and clinical measures. Results were compared to findings from our previous report. A four-cluster solution provided a good fit to the data; profiles included a neuropsychologically normal cluster, a globally impaired cluster, and two clusters of mixed profiles. Cognitive burden was associated with symptom severity and poorer community functioning. The patterns of cognitive performance by cluster were highly consistent with our previous findings. We found evidence of four cognitive subgroups of patients with psychosis, with cognitive profiles that map closely to those produced in our previous work. Clusters were associated with clinical and community variables and a measure of premorbid functioning, suggesting that they reflect meaningful groupings: replicable, and related to clinical presentation and functional outcomes. (JINS, 2018, 24, 382-390).

  6. Global Analysis of miRNA Gene Clusters and Gene Families Reveals Dynamic and Coordinated Expression

    Directory of Open Access Journals (Sweden)

    Li Guo


    Full Text Available To further understand the potential expression relationships of miRNAs in miRNA gene clusters and gene families, a global analysis was performed in 4 paired tumor (breast cancer) and adjacent normal tissue samples using deep sequencing datasets. The compositions of miRNA gene clusters and families are not random, and clustered and homologous miRNAs may have close relationships with overlapped miRNA species. Members in the miRNA group always had various expression levels, and even some showed larger expression divergence. Despite the dynamic expression as well as individual difference, these miRNAs always indicated consistent or similar deregulation patterns. The consistent deregulation expression may contribute to dynamic and coordinated interaction between different miRNAs in regulatory network. Further, we found that those clustered or homologous miRNAs that were also identified as sense and antisense miRNAs showed larger expression divergence. miRNA gene clusters and families indicated important biological roles, and the specific distribution and expression further enrich and ensure the flexible and robust regulatory network.

  7. Cluster analysis of typhoid cases in Kota Bharu, Kelantan, Malaysia

    Directory of Open Access Journals (Sweden)

    Nazarudin Safian


    Full Text Available Typhoid fever is still a major public health problem globally as well as in Malaysia. This study was done to identify the spatial epidemiology of typhoid fever in the Kota Bharu District of Malaysia as a first step to developing more advanced analysis of the whole country. The main characteristic of the epidemiological pattern that interested us was whether typhoid cases occurred in clusters or whether they were evenly distributed throughout the area. We also wanted to know at what spatial distances they were clustered. All confirmed typhoid cases that were reported to the Kota Bharu District Health Department from the year 2001 to June of 2005 were taken as the samples. From the home address of the cases, the location of the house was traced and a coordinate was taken using handheld GPS devices. Spatial statistical analysis was done to determine the distribution of typhoid cases, whether clustered, random or dispersed. The spatial statistical analysis was done using CrimeStat III software to determine whether typhoid cases occur in clusters, and later on to determine at what distances it clustered. From 736 cases involved in the study there was significant clustering for cases occurring in the years 2001, 2002, 2003 and 2005. There was no significant clustering in year 2004. Typhoid clustering also occurred strongly for distances up to 6 km. This study shows that typhoid cases occur in clusters, and this method could be applicable to describe spatial epidemiology for a specific area. (Med J Indones 2008; 17: 175-82) Keywords: typhoid, clustering, spatial epidemiology, GIS

  8. Effects of Group Size and Lack of Sphericity on the Recovery of Clusters in K-Means Cluster Analysis (United States)

    de Craen, Saskia; Commandeur, Jacques J. F.; Frank, Laurence E.; Heiser, Willem J.


    K-means cluster analysis is known for its tendency to produce spherical and equally sized clusters. To assess the magnitude of these effects, a simulation study was conducted, in which populations were created with varying departures from sphericity and group sizes. An analysis of the recovery of clusters in the samples taken from these…

  9. Enrichr: interactive and collaborative HTML5 gene list enrichment analysis tool. (United States)

    Chen, Edward Y; Tan, Christopher M; Kou, Yan; Duan, Qiaonan; Wang, Zichen; Meirelles, Gabriela Vaz; Clark, Neil R; Ma'ayan, Avi


    System-wide profiling of genes and proteins in mammalian cells produces lists of differentially expressed genes/proteins that need to be further analyzed for their collective functions in order to extract new knowledge. Once unbiased lists of genes or proteins are generated from such experiments, these lists are used as input for computing enrichment with existing lists created from prior knowledge organized into gene-set libraries. While many enrichment analysis tools and gene-set library databases have been developed, there is still room for improvement. Here, we present Enrichr, an integrative web-based and mobile software application that includes new gene-set libraries, an alternative approach to rank enriched terms, and various interactive visualization approaches to display enrichment results using the JavaScript library, Data Driven Documents (D3). The software can also be embedded into any tool that performs gene list analysis. We applied Enrichr to analyze nine cancer cell lines by comparing their enrichment signatures to the enrichment signatures of matched normal tissues. We observed a common pattern of upregulation of the polycomb group PRC2 and enrichment for the histone mark H3K27me3 in many cancer cell lines, as well as alterations in Toll-like receptor and interleukin signaling in K562 cells when compared with normal myeloid CD33+ cells. Such analyses provide global visualization of critical differences between normal tissues and cancer cell lines but can be applied to many other scenarios. Enrichr is an easy to use intuitive enrichment analysis web-based tool providing various types of visualization summaries of collective functions of gene lists. Enrichr is open source and freely available online at:
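
The general idea of computing enrichment of an input gene list against a gene-set library can be sketched with the standard hypergeometric over-representation test. This is a generic textbook test, not necessarily the ranking approach Enrichr uses (the abstract only says it offers an alternative one); the gene identifiers and function names below are hypothetical.

```python
from math import comb


def enrichment_p_value(universe_size, set_size, list_size, overlap):
    """P(X >= overlap) when list_size genes are drawn without replacement
    from a universe that contains set_size members of the gene set."""
    upper = min(set_size, list_size)
    total = comb(universe_size, list_size)
    tail = sum(
        comb(set_size, k) * comb(universe_size - set_size, list_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


def enrich(gene_list, gene_set, universe):
    """Overlap size and one-sided p-value for one gene set."""
    background = set(universe)
    genes = set(gene_list) & background
    members = set(gene_set) & background
    overlap = len(genes & members)
    p = enrichment_p_value(len(background), len(members), len(genes), overlap)
    return overlap, p


# Hypothetical identifiers: a 10-gene universe, a 4-gene set, a 3-gene input list.
UNIVERSE = [f"g{i}" for i in range(10)]
GENE_SET = ["g0", "g1", "g2", "g3"]
GENE_LIST = ["g0", "g1", "g9"]

OVERLAP, P_VALUE = enrich(GENE_LIST, GENE_SET, UNIVERSE)
print(OVERLAP, P_VALUE)
```

When many gene sets are tested against the same list, the resulting p-values would normally be corrected for multiple testing before terms are ranked.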

  10. Using cluster analysis to organize and explore regional GPS velocities (United States)

    Simpson, Robert W.; Thatcher, Wayne; Savage, James C.


    Cluster analysis offers a simple visual exploratory tool for the initial investigation of regional Global Positioning System (GPS) velocity observations, which are providing increasingly precise mappings of actively deforming continental lithosphere. The deformation fields from dense regional GPS networks can often be concisely described in terms of relatively coherent blocks bounded by active faults, although the choice of blocks, their number and size, can be subjective and is often guided by the distribution of known faults. To illustrate our method, we apply cluster analysis to GPS velocities from the San Francisco Bay Region, California, to search for spatially coherent patterns of deformation, including evidence of block-like behavior. The clustering process identifies four robust groupings of velocities that we identify with four crustal blocks. Although the analysis uses no prior geologic information other than the GPS velocities, the cluster/block boundaries track three major faults, both locked and creeping.

  11. Analysis of the performance of fuel cells PWR with a single enrichment and radial distribution of enrichments

    International Nuclear Information System (INIS)

    Vargas, S.; Gonzalez, J. A.; Alonso, G.; Del Valle, E.; Xolocostli M, J. V.


    One of the main challenges in the design of fuel assemblies is the efficient use of uranium, achieving homogeneous burnup of the fuel rods as well as the maximum possible burnup at discharge. For PWR-type assemblies, the current choice is fuel assemblies with a single radial enrichment. The present work aims to show the reason for this decision by comparing the neutronic performance of two fuel cells with the same average enrichment, one with a radial distribution of enrichments and the other with a single enrichment equal to the average. The results of the present study, covering the behavior of the neutron flux as well as the power distribution across the assembly, support the choice of a single radial enrichment. (Author)

  12. A Novel Divisive Hierarchical Clustering Algorithm for Geospatial Analysis

    Directory of Open Access Journals (Sweden)

    Shaoning Li


    Full Text Available In the fields of geographic information systems (GIS) and remote sensing (RS), the clustering algorithm has been widely used for image segmentation, pattern recognition, and cartographic generalization. Although clustering analysis plays a key role in geospatial modelling, traditional clustering methods are limited in computational complexity, noise resistance and robustness. Furthermore, traditional methods are more focused on the adjacent spatial context, which makes it hard for the clustering methods to be applied to multi-density discrete objects. In this paper, a new method, cell-dividing hierarchical clustering (CDHC), is proposed based on convex hull retraction. The main steps are as follows. First, a convex hull structure is constructed to describe the global spatial context of geospatial objects. Then, the retracting structure of each borderline is established in sequence by setting the initial parameter. The objects are split into two clusters (i.e., “sub-clusters”) if the retracting structure intersects with the borderlines. Finally, clusters are repeatedly split and the initial parameter is updated until the termination condition is satisfied. The experimental results show that CDHC separates the multi-density objects from noise sufficiently and also reduces complexity compared to the traditional agglomerative hierarchical clustering algorithm.

  13. A Distributed Flocking Approach for Information Stream Clustering Analysis

    Energy Technology Data Exchange (ETDEWEB)

    Cui, Xiaohui [ORNL]; Potok, Thomas E [ORNL]


    Intelligence analysts are currently overwhelmed by the volume of information streams generated every day. There is a lack of comprehensive tools that can analyze these information streams in real time. Document clustering analysis plays an important role in improving the accuracy of information retrieval. However, most clustering technologies can only be applied to static document collections because they normally require a large amount of computational resources and a long time to produce accurate results. It is very difficult to cluster dynamically changing text information streams on an individual computer. Our earlier research resulted in a dynamic reactive flock clustering algorithm that can continually refine the clustering result and quickly react to changes in document contents. This characteristic makes the algorithm suitable for cluster analysis of dynamically changing document information, such as text information streams. Because of the decentralized character of this algorithm, a distributed approach is a very natural way to increase its clustering speed. In this paper, we present a distributed multi-agent flocking approach for text information stream clustering and discuss the decentralized architectures and communication schemes for load balancing and status information synchronization in this approach.

  14. Cluster analysis of clinical data identifies fibromyalgia subgroups.

    Directory of Open Access Journals (Sweden)

    Elisa Docampo

    Full Text Available INTRODUCTION: Fibromyalgia (FM) is mainly characterized by widespread pain and multiple accompanying symptoms, which hinder FM assessment and management. In order to reduce FM heterogeneity we classified clinical data into simplified dimensions that were used to define FM subgroups. MATERIAL AND METHODS: 48 variables were evaluated in 1,446 Spanish FM cases fulfilling 1990 ACR FM criteria. A partitioning analysis was performed to find groups of variables similar to each other. Similarities between variables were identified and the variables were grouped into dimensions. This was performed in a subset of 559 patients, and cross-validated in the remaining 887 patients. For each sample and dimension, a composite index was obtained based on the weights of the variables included in the dimension. Finally, a clustering procedure was applied to the indexes, resulting in FM subgroups. RESULTS: Variables clustered into three independent dimensions: "symptomatology", "comorbidities" and "clinical scales". Only the first two dimensions were considered for the construction of FM subgroups. Resulting scores classified FM samples into three subgroups: low symptomatology and comorbidities (Cluster 1), high symptomatology and comorbidities (Cluster 2), and high symptomatology but low comorbidities (Cluster 3), showing differences in measures of disease severity. CONCLUSIONS: We have identified three subgroups of FM samples in a large cohort of FM by clustering clinical data. Our analysis stresses the importance of family and personal history of FM comorbidities. Also, the resulting patient clusters could indicate different forms of the disease, relevant to future research, and might have an impact on clinical assessment.

  15. Clustering Trajectories by Relevant Parts for Air Traffic Analysis. (United States)

    Andrienko, Gennady; Andrienko, Natalia; Fuchs, Georg; Garcia, Jose Manuel Cordero


    Clustering of trajectories of moving objects by similarity is an important technique in movement analysis. Existing distance functions assess the similarity between trajectories based on properties of the trajectory points or segments. The properties may include the spatial positions, times, and thematic attributes. There may be a need to focus the analysis on certain parts of trajectories, i.e., points and segments that have particular properties. According to the analysis focus, the analyst may need to cluster trajectories by similarity of their relevant parts only. Throughout the analysis process, the focus may change, and different parts of trajectories may become relevant. We propose an analytical workflow in which interactive filtering tools are used to attach relevance flags to elements of trajectories, clustering is done using a distance function that ignores irrelevant elements, and the resulting clusters are summarized for further analysis. We demonstrate how this workflow can be useful for different analysis tasks in three case studies with real data from the domain of air traffic. We propose a suite of generic techniques and visualization guidelines to support movement data analysis by means of relevance-aware trajectory clustering.

  16. Cluster analysis of Southeastern U.S. climate stations (United States)

    Stooksbury, D. E.; Michaels, P. J.


    A two-step cluster analysis of 449 Southeastern climate stations is used to objectively determine general climate clusters (groups of climate stations) for eight southeastern states. The purpose is objectively to define regions of climatic homogeneity that should perform more robustly in subsequent climatic impact models. This type of analysis has been successfully used in many related climate research problems including the determination of corn/climate districts in Iowa (Ortiz-Valdez, 1985) and the classification of synoptic climate types (Davis, 1988). These general climate clusters may be more appropriate for climate research than the standard climate divisions (CD) groupings of climate stations, which are modifications of the agro-economic United States Department of Agriculture crop reporting districts. Unlike the CD's, these objectively determined climate clusters are not restricted by state borders and thus have reduced multicollinearity which makes them more appropriate for the study of the impact of climate and climatic change.

  17. Socioeconomic effects of the DOE Gas Centrifuge Enrichment Plant. Volume 1: methodology and analysis

    International Nuclear Information System (INIS)


    The socioeconomic effects of the Gas Centrifuge Enrichment Plant being built in Portsmouth, Ohio were studied. Chapters are devoted to labor force, housing, population changes, economic impact, method for analysis of services, analysis of service impacts, schools, and local government finance.

  18. Development of small scale cluster computer for numerical analysis (United States)

    Zulkifli, N. H. N.; Sapit, A.; Mohammed, A. N.


    In this study, two personal computers were successfully networked together to form a small-scale cluster. Each computer has a multicore processor with four cores, giving the cluster eight processors. The cluster runs an Ubuntu 14.04 Linux environment with an MPI implementation (MPICH2). Two main tests were conducted on the cluster: a communication test and a performance test. The communication test, which verified that the computers can pass the required information without any problem, used a simple MPI Hello program written in C. A performance test was also carried out to show that the cluster's calculation performance is much better than that of a single-CPU computer. In this performance test, the same code was run four times, using a single node, 2 processors, 4 processors, and 8 processors. The results show that the time required to solve the problem decreases as processors are added: the time required for the calculation halves when the number of processors is doubled. To conclude, we successfully developed a small-scale cluster computer from common hardware that is capable of higher computing power than a single-CPU processor, which can benefit research that requires high computing power, especially numerical analysis such as finite element analysis, computational fluid dynamics, and computational physics analysis.
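
The reported scaling (run time halving each time the processor count doubles) corresponds to ideal linear speedup. A small sketch of the usual speedup and parallel-efficiency calculation follows; the timings are made-up placeholder values chosen to show ideal halving and are not measurements from the study.

```python
def speedup(t_serial, t_parallel):
    """How many times faster the parallel run is than the serial run."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, processors):
    """Speedup per processor; 1.0 means ideal linear scaling."""
    return speedup(t_serial, t_parallel) / processors


# Placeholder wall-clock times in seconds, keyed by processor count (assumed).
TIMINGS = {1: 80.0, 2: 40.0, 4: 20.0, 8: 10.0}

for procs, seconds in TIMINGS.items():
    s = speedup(TIMINGS[1], seconds)
    e = efficiency(TIMINGS[1], seconds, procs)
    print(f"{procs} processors: speedup {s:.1f}, efficiency {e:.2f}")
```

Real measurements usually fall below an efficiency of 1.0 once communication between nodes starts to matter, so the same calculation is a quick way to check how close a cluster comes to the ideal.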

  19. Source-driven noise analysis measurements with neptunium metal reflected by high enriched uranium

    International Nuclear Information System (INIS)

    Valentine, Timothy E.; Mattingly, John K.


    Subcritical noise analysis measurements have been performed with a neptunium (237Np) sphere reflected by highly enriched uranium. These measurements were performed at the Los Alamos Critical Experiment Facility in December 2002 to provide an estimate of the subcriticality of 237Np reflected by various amounts of high-enriched uranium. This paper provides a description of the measurements and presents some preliminary results of the analysis of the measurements. The measured and calculated spectral ratios differ by 15% whereas the 'interpreted' and calculated k_eff values differ by approximately 1%. (author)

  20. Predicting healthcare outcomes in prematurely born infants using cluster analysis. (United States)

    MacBean, Victoria; Lunt, Alan; Drysdale, Simon B; Yarzi, Muska N; Rafferty, Gerrard F; Greenough, Anne


    Prematurely born infants are at high risk of respiratory morbidity following neonatal unit discharge, though prediction of outcomes is challenging. We have tested the hypothesis that cluster analysis would identify discrete groups of prematurely born infants with differing respiratory outcomes during infancy. A total of 168 infants (median (IQR) gestational age 33 (31-34) weeks) were recruited in the neonatal period from consecutive births in a tertiary neonatal unit. The baseline characteristics of the infants were used to classify them into hierarchical agglomerative clusters. Rates of viral lower respiratory tract infections (LRTIs) were recorded for 151 infants in the first year after birth. Infants could be classified according to birth weight and duration of neonatal invasive mechanical ventilation (MV) into three clusters. Cluster one (MV ≤5 days) had few LRTIs. Clusters two and three (both MV ≥6 days, but BW ≥ or < 882 g, respectively) had significantly higher LRTI rates. Cluster two had a higher proportion of infants experiencing respiratory syncytial virus LRTIs (P = 0.01) and cluster three a higher proportion of rhinovirus LRTIs (P < 0.001). CONCLUSIONS: Readily available clinical data allowed classification of prematurely born infants into one of three distinct groups with differing subsequent respiratory morbidity in infancy. © 2018 Wiley Periodicals, Inc.
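
Hierarchical agglomerative clustering, the technique named above, can be sketched as follows: start with one cluster per observation and repeatedly merge the two closest clusters until the desired number remains. Average linkage and Euclidean distance are assumptions here, since the abstract does not state the linkage or distance used, and the data points are arbitrary toy values rather than infant characteristics.

```python
from itertools import combinations
from math import dist


def average_linkage(a, b, points):
    """Mean pairwise distance between the members of two clusters."""
    return sum(dist(points[i], points[j]) for i in a for j in b) / (len(a) * len(b))


def agglomerate(points, n_clusters):
    """Merge the closest pair of clusters until n_clusters remain.

    Returns a list of clusters, each a list of indices into `points`.
    """
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        x, y = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: average_linkage(
                clusters[pair[0]], clusters[pair[1]], points
            ),
        )
        clusters[x] = clusters[x] + clusters[y]  # x < y, so deleting y is safe
        del clusters[y]
    return clusters


# Toy two-feature observations (assumed to be on comparable scales).
POINTS = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0), (10, 1)]

GROUPS = agglomerate(POINTS, 3)
print(GROUPS)
```

With real clinical variables measured in different units (for example grams and days), the features would normally be standardised first so that no single variable dominates the distance.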

  1. Modeling and Analysis Methods for an On-line Enrichment Monitor

    Energy Technology Data Exchange (ETDEWEB)

    Smith, Leon E.; Jarman, Kenneth D.; Wittman, Richard S.; Zalavadia, Mital A.; March-Leuba, Jose A.


    The International Atomic Energy Agency (IAEA) has developed an On-Line Enrichment Monitor (OLEM) as one possible component in a new generation of safeguards measures for uranium enrichment plants. The OLEM measures 235U emissions from the UF6 gas flowing through a unit header pipe using NaI(Tl) spectrometers, and corrects for gas density changes using pressure and temperature sensors in order to determine the enrichment of the gas as a function of time. In parallel with the OLEM instrument development, a Virtual OLEM (VOLEM) software tool has been developed that is capable of producing synthetic gamma-ray, pressure, and temperature data representative of a wide range of enrichment plant operating conditions. VOLEM complements instrument development activities and allows the study of OLEM for scenarios that will be difficult or impossible to evaluate empirically. Uses of VOLEM include: investigation of hardware design options; inter-comparison of candidate gamma-ray spectral analysis and enrichment estimation algorithms; uncertainty budget analysis and performance prediction for typical and atypical operational scenarios; and testing of the OLEM data acquisition, analysis and reporting software. This paper describes the technical foundations of VOLEM and illustrates how it can be used. An overview of the nominal instrument design and deployment scenario for OLEM is provided, with emphasis on the key online-assay measurement challenge: accurately determining the portion of the total 235U signal that comes from a background that includes solid uranium deposits on the piping walls. Monte Carlo modeling tools, data analysis algorithms and uncertainty quantification methods are described. VOLEM is then used to quantitatively explore the uncertainty budgets and predicted instrument performance for a plausible range of typical plant operating parameters, and one set of candidate analysis algorithms. Additionally, a series of VOLEM case studies illustrates how an online

  2. Comprehensive assessment of sequence variation within the copy number variable defensin cluster on 8p23 by target enriched in-depth 454 sequencing

    Directory of Open Access Journals (Sweden)

    Zhang Xinmin


    Background: In highly copy number variable (CNV) regions such as the human defensin gene locus, comprehensive assessment of sequence variations is challenging. PCR approaches are practically restricted to tiny fractions, and next-generation sequencing (NGS) of whole individual genomes, e.g. by the 1000 Genomes Project, is limited by the affordable sequence depth. Combining target enrichment with NGS may represent a feasible approach. Results: As a proof of principle, we enriched a ~850 kb section comprising the CNV defensin gene cluster DEFB, the invariable DEFA part and 11 control regions from two genomes by sequence capture and sequenced it by 454 technology. 6,651 differences to the human reference genome were found. Comparison to HapMap genotypes revealed sensitivities and specificities in the range of 94% to 99% for the identification of variations. Using error probabilities for rigorous filtering revealed 2,886 unique single nucleotide variations (SNVs), including 358 putative novel ones. DEFB CN determinations by haplotype ratios were in agreement with alternative methods. Conclusion: Although currently labor-intensive and costly, target-enriched NGS provides a powerful tool for the comprehensive assessment of SNVs in highly polymorphic CNV regions of individual genomes. Furthermore, it reveals considerable amounts of putative novel variations and simultaneously allows CN estimation.

  3. Cluster analysis of radionuclide concentrations in beach sand

    NARCIS (Netherlands)

    de Meijer, R.J.; James, I.; Jennings, P.J.; Keoyers, J.E.

    This paper presents a method in which natural radionuclide concentrations of beach sand minerals are traced along a stretch of coast by cluster analysis. This analysis yields two groups of mineral deposit with different origins. The method deviates from standard methods of following dispersal of

  4. Principal Component Clustering Approach to Teaching Quality Discriminant Analysis (United States)

    Xian, Sidong; Xia, Haibo; Yin, Yubo; Zhai, Zhansheng; Shang, Yan


    Teaching quality is the lifeline of higher education. Many universities have made effective progress in evaluating teaching quality. In this paper, we establish a students' evaluation of teaching (SET) discriminant analysis model and algorithm based on principal component clustering analysis. Additionally, we classify the SET…

  5. Characterizing Suicide in Toronto: An Observational Study and Cluster Analysis (United States)

    Sinyor, Mark; Schaffer, Ayal; Streiner, David L


    Objective: To determine whether people who have died from suicide in a large epidemiologic sample form clusters based on demographic, clinical, and psychosocial factors. Method: We conducted a coroner’s chart review for 2886 people who died in Toronto, Ontario, from 1998 to 2010, and whose death was ruled as suicide by the Office of the Chief Coroner of Ontario. A cluster analysis using known suicide risk factors was performed to determine whether suicide deaths separate into distinct groups. Clusters were compared according to person- and suicide-specific factors. Results: Five clusters emerged. Cluster 1 had the highest proportion of females and nonviolent methods, and all had depression and a past suicide attempt. Cluster 2 had the highest proportion of people with a recent stressor and violent suicide methods, and all were married. Cluster 3 had mostly males between the ages of 20 and 64, and all had either experienced recent stressors, suffered from mental illness, or had a history of substance abuse. Cluster 4 had the youngest people and the highest proportion of deaths by jumping from height, few were married, and nearly one-half had bipolar disorder or schizophrenia. Cluster 5 had all unmarried people with no prior suicide attempts, and were the least likely to have an identified mental illness and most likely to leave a suicide note. Conclusions: People who die from suicide assort into different patterns of demographic, clinical, and death-specific characteristics. Identifying and studying subgroups of suicides may advance our understanding of the heterogeneous nature of suicide and help to inform development of more targeted suicide prevention strategies. PMID:24444321

  6. Pattern recognition in menstrual bleeding diaries by statistical cluster analysis

    Directory of Open Access Journals (Sweden)

    Wessel Jens


    Background: The aim of this paper is to empirically identify a treatment-independent statistical method to describe clinically relevant bleeding patterns by using bleeding diaries of clinical studies on various sex hormone containing drugs. Methods: We used four cluster analysis methods (single, average and complete linkage, and the method of Ward) for pattern recognition in menstrual bleeding diaries. The optimal number of clusters was determined using the semi-partial R2, the cubic cluster criterion, the pseudo-F statistic and the pseudo-t2 statistic. Finally, the interpretability of the results from a gynecological point of view was assessed. Results: The method of Ward yielded distinct clusters of the bleeding diaries. The other methods successively chained the observations into one cluster. The optimal number of distinctive bleeding patterns was six. We found two desirable and four undesirable bleeding patterns. Cyclic and non-cyclic bleeding patterns were well separated. Conclusion: Using this cluster analysis with the method of Ward, medications and devices having an impact on bleeding can be easily compared and categorized.
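
    The four linkage rules named in this abstract differ only in how the distance from a newly merged cluster to every other cluster is updated. The following is a minimal sketch, not the authors' implementation: it assumes a naive agglomerative loop with the standard Lance-Williams update formulas, and the function name `agglomerate` and its parameters are hypothetical.

```python
from itertools import combinations


def agglomerate(points, n_clusters, method="ward"):
    """Naive agglomerative clustering; returns clusters as lists of point indices.

    Hypothetical helper for illustration. Ward linkage works on squared
    Euclidean distances, the other linkages on plain Euclidean distances.
    """
    if method not in ("single", "complete", "average", "ward"):
        raise ValueError("unknown linkage method: " + method)
    if n_clusters < 1:
        raise ValueError("n_clusters must be at least 1")

    members = {i: [i] for i in range(len(points))}
    dist = {}
    for i, j in combinations(range(len(points)), 2):
        sq = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
        dist[frozenset((i, j))] = sq if method == "ward" else sq ** 0.5

    next_id = len(points)
    while len(members) > n_clusters:
        # Merge the closest pair of clusters.
        pair = min(dist, key=dist.get)
        i, j = tuple(pair)
        d_ij = dist.pop(pair)
        n_i, n_j = len(members[i]), len(members[j])
        merged = members.pop(i) + members.pop(j)

        # Lance-Williams update of the distance from each cluster k to the merge.
        for k in members:
            d_ki = dist.pop(frozenset((k, i)))
            d_kj = dist.pop(frozenset((k, j)))
            n_k = len(members[k])
            if method == "single":
                new = min(d_ki, d_kj)
            elif method == "complete":
                new = max(d_ki, d_kj)
            elif method == "average":
                new = (n_i * d_ki + n_j * d_kj) / (n_i + n_j)
            else:  # ward
                new = ((n_i + n_k) * d_ki + (n_j + n_k) * d_kj - n_k * d_ij) / (
                    n_i + n_j + n_k
                )
            dist[frozenset((k, next_id))] = new

        members[next_id] = merged
        next_id += 1

    return sorted(sorted(m) for m in members.values())
```

    Running the same data through `method="single"` and `method="ward"` and comparing cluster sizes is one way to look for the chaining behaviour the abstract reports; whether chaining appears depends on the data.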

  7. Technology Clusters Exploration for Patent Portfolio through Patent Abstract Analysis

    Directory of Open Access Journals (Sweden)

    Gabjo Kim


    This study explores technology clusters through patent analysis. The aim of exploring technology clusters is to grasp competitors' levels of sustainable research and development (R&D) and establish a sustainable strategy for entering an industry. To achieve this, we first grouped the patent documents with similar technologies by applying affinity propagation (AP) clustering, which is effective for grouping large amounts of data. Next, in order to define the technology clusters, we adopted the term frequency-inverse document frequency (TF-IDF) weight, which lists the terms in order of importance. We collected the patent data of Korean electric car companies from the United States Patent and Trademark Office (USPTO) to verify our proposed methodology. As a result, our proposed methodology presents more detailed information on the Korean electric car industry than previous studies.
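
    The TF-IDF step can be illustrated with a small sketch. It assumes that the abstracts of each technology cluster are concatenated into one document, that tokens are split on whitespace, and that the weight is relative term frequency times the natural log of inverse document frequency; the abstract does not state the authors' exact variant, and `tfidf_rank` is a hypothetical name.

```python
import math
from collections import Counter


def tfidf_rank(documents):
    """Return, for each document, its terms sorted by descending TF-IDF weight.

    Hypothetical helper: TF is the relative term frequency within a document,
    IDF is ln(N / document frequency). Ties are broken alphabetically.
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))

    ranked = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights = {
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
        ranked.append(sorted(weights, key=lambda t: (-weights[t], t)))
    return ranked


# Toy stand-ins for three clusters of patent abstracts (invented text).
clusters = [
    "battery charging battery cell",
    "motor control motor torque",
    "battery motor vehicle",
]
top_terms = [terms[:2] for terms in tfidf_rank(clusters)]
```

    Terms that occur in every cluster receive a weight of zero under this variant, which is what pushes cluster-specific vocabulary to the top of each list.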

  8. An enhanced search algorithm for the charged fuel enrichment in equilibrium cycle analysis of REBUS-3

    International Nuclear Information System (INIS)

    Park, Tongkyu; Yang, Won Sik; Kim, Sang-Ji


    Highlights: • An enhanced search algorithm for charged fuel enrichment was developed for equilibrium cycle analysis with REBUS-3. • The new search algorithm is not sensitive to the user-specified initial guesses. • The new algorithm reduces the computational time by a factor of 2–3. - Abstract: This paper presents an enhanced search algorithm for the charged fuel enrichment in equilibrium cycle analysis of REBUS-3. The current enrichment search algorithm of REBUS-3 takes a large number of iterations to yield a converged solution or even terminates without a converged solution when the user-specified initial guesses are far from the solution. To resolve the convergence problem and to reduce the computational time, an enhanced search algorithm was developed. The enhanced algorithm is based on the idea of minimizing the number of enrichment estimates by allowing drastic enrichment changes and by optimizing the current search algorithm of REBUS-3. Three equilibrium cycle problems with recycling, without recycling and of high discharge burnup were defined and a series of sensitivity analyses were performed with a wide range of user-specified initial guesses. Test results showed that the enhanced search algorithm is able to produce a converged solution regardless of the initial guesses. In addition, it was able to reduce the number of flux calculations by a factor of 2.9, 1.8, and 1.7 for equilibrium cycle problems with recycling, without recycling, and of high discharge burnup, respectively, compared to the current search algorithm.


    Directory of Open Access Journals (Sweden)

    Roman Shchur


    A SWOT analysis of the threats and benefits of the innovation development strategy of the Ivano-Frankivsk region was conducted in the context of financial support. A methodical approach was identified for determining the potential of public-private partnerships as a tool for financing innovative economic development. A cluster analysis of the possibilities for forming public-private partnerships in a particular region was carried out. An optimal set of problem areas that require urgent solutions and financial security was defined on the basis of the cluster approach. This will help to form practical recommendations for an effective financial mechanism in the regions of Ukraine. Key words: the mechanism of innovation development financial provision, innovation development, public-private partnerships, cluster analysis, innovative development strategy.

  10. Cluster-based analysis of multi-model climate ensembles (United States)

    Hyde, Richard; Hossaini, Ryan; Leeson, Amber A.


    Clustering - the automated grouping of similar data - can provide powerful and unique insight into large and complex data sets, in a fast and computationally efficient manner. While clustering has been used in a variety of fields (from medical image processing to economics), its application within atmospheric science has been fairly limited to date, and the potential benefits of the application of advanced clustering techniques to climate data (both model output and observations) have yet to be fully realised. In this paper, we explore the specific application of clustering to a multi-model climate ensemble. We hypothesise that clustering techniques can provide (a) a flexible, data-driven method of testing model-observation agreement and (b) a mechanism with which to identify model development priorities. We focus our analysis on chemistry-climate model (CCM) output of tropospheric ozone - an important greenhouse gas - from the recent Atmospheric Chemistry and Climate Model Intercomparison Project (ACCMIP). Tropospheric column ozone from the ACCMIP ensemble was clustered using the Data Density based Clustering (DDC) algorithm. We find that a multi-model mean (MMM) calculated using members of the most-populous cluster identified at each location offers a reduction of up to ~20% in the global absolute mean bias between the MMM and an observed satellite-based tropospheric ozone climatology, with respect to a simple, all-model MMM. On a spatial basis, the bias is reduced at ~62% of all locations, with the largest bias reductions occurring in the Northern Hemisphere - where ozone concentrations are relatively large. However, the bias is unchanged at 9% of all locations and increases at 29%, particularly in the Southern Hemisphere. The latter demonstrates that although cluster-based subsampling acts to remove outlier model data, such data may in fact be closer to observed values in some locations. We further demonstrate that clustering can provide a viable and

  11. Application of biclustering of gene expression data and gene set enrichment analysis methods to identify potentially disease causing nanomaterials

    Directory of Open Access Journals (Sweden)

    Andrew Williams


    previously defined, functionally relevant gene sets, the present study also identified two novel gene sets: a gene set associated with pulmonary fibrosis and a gene set associated with ROS, underlining the advantage of using a data-driven approach to identify novel, functionally related gene sets. The results can be used in future gene set enrichment analysis studies involving NMs or as features for clustering and classifying NMs of diverse properties.

  12. Systematic enrichment analysis of gene expression profiling studies identifies consensus pathways implicated in colorectal cancer development

    Directory of Open Access Journals (Sweden)

    Jesús Lascorz


    Background: A large number of gene expression profiling (GEP) studies on colorectal carcinogenesis have been performed, but no reliable gene signature has been identified so far due to the lack of reproducibility in the reported genes. There is growing evidence that functionally related genes, rather than individual genes, contribute to the etiology of complex traits. We used, as a novel approach, pathway enrichment tools to define functionally related genes that are consistently up- or down-regulated in colorectal carcinogenesis. Materials and Methods: We started the analysis with 242 unique annotated genes that had been reported by any of three recent meta-analyses covering GEP studies on genes differentially expressed in carcinoma vs normal mucosa. Most of these genes (218, 91.9%) had been reported in at least three GEP studies. These 242 genes were submitted to bioinformatic analysis using a total of nine tools to detect enrichment of Gene Ontology (GO) categories or Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. As a final consistency criterion, the pathway categories had to be enriched by several tools to be taken into consideration. Results: Our pathway-based enrichment analysis identified the categories of ribosomal protein constituents, extracellular matrix receptor interaction, carbonic anhydrase isozymes, and a general category related to inflammation and cellular response as significantly and consistently overrepresented entities. Conclusions: We triaged the genes covered by the published GEP literature on colorectal carcinogenesis and subjected them to multiple enrichment tools in order to identify the consistently enriched gene categories. These turned out to have known functional relationships to cancer development and thus deserve further investigation.
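
    Many over-representation tools score a category with a one-sided hypergeometric test. The abstract does not say which statistic each of the nine tools applies, so the following is only a sketch of that common approach; the function name and the counts in the usage line are hypothetical.

```python
from math import comb


def enrichment_p_value(n_universe, n_in_category, n_selected, n_overlap):
    """One-sided hypergeometric p-value P(X >= n_overlap).

    X is the number of category genes among n_selected genes drawn without
    replacement from a universe of n_universe genes, n_in_category of which
    belong to the category. Hypothetical helper for illustration.
    """
    upper = min(n_in_category, n_selected)
    total = comb(n_universe, n_selected)
    tail = sum(
        comb(n_in_category, k) * comb(n_universe - n_in_category, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 242 selected genes, a 150-gene pathway, 8 genes shared,
# and an assumed annotated universe of 20,000 genes.
p = enrichment_p_value(20000, 150, 242, 8)
```

    When many categories are tested, the resulting p-values would normally be corrected for multiple testing before any category is called enriched.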

  13. Application of microarray analysis on computer cluster and cloud platforms. (United States)

    Bernau, C; Boulesteix, A-L; Knaus, J


    Analysis of recent high-dimensional biological data tends to be computationally intensive as many common approaches such as resampling or permutation tests require the basic statistical analysis to be repeated many times. A crucial advantage of these methods is that they can be easily parallelized due to the computational independence of the resampling or permutation iterations, which has induced many statistics departments to establish their own computer clusters. An alternative is to rent computing resources in the cloud, e.g. at Amazon Web Services. In this article we analyze whether a selection of statistical projects, recently implemented at our department, can be efficiently realized on these cloud resources. Moreover, we illustrate an opportunity to combine computer cluster and cloud resources. In order to compare the efficiency of computer cluster and cloud implementations and their respective parallelizations we use microarray analysis procedures and compare their runtimes on the different platforms. Amazon Web Services provide various instance types which meet the particular needs of the different statistical projects we analyzed in this paper. Moreover, the network capacity is sufficient and the parallelization is comparable in efficiency to standard computer cluster implementations. Our results suggest that many statistical projects can be efficiently realized on cloud resources. It is important to mention, however, that workflows can change substantially as a result of a shift from computer cluster to cloud computing.
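
    The computational independence mentioned above can be made concrete with a small permutation test split into independently seeded chunks. This is a sketch under stated assumptions, not the authors' code: the function names are hypothetical, the test statistic is the absolute difference in group means, and a thread pool stands in for the cluster or cloud workers. In CPython, threads will not speed up this CPU-bound loop, so a process pool or separate machines would be needed for a real gain.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def mean(values):
    return sum(values) / len(values)


def count_extreme(task):
    """Run one chunk of permutations; return how many are at least as extreme."""
    group_a, group_b, n_perm, seed = task
    rng = random.Random(seed)  # each chunk owns its random stream
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if abs(mean(pooled[:n_a]) - mean(pooled[n_a:])) >= observed:
            hits += 1
    return hits


def permutation_p_value(group_a, group_b, n_perm=2000, n_chunks=4, seed=0):
    """Two-sided permutation p-value with the work split into independent chunks."""
    per_chunk = n_perm // n_chunks
    tasks = [(group_a, group_b, per_chunk, seed + c) for c in range(n_chunks)]
    with ThreadPoolExecutor(max_workers=n_chunks) as pool:
        hits = sum(pool.map(count_extreme, tasks))
    total = per_chunk * n_chunks
    # Add one to numerator and denominator so the estimate is never exactly zero.
    return (hits + 1) / (total + 1)
```

    Because each chunk needs only the data and a seed, the chunks share no state and their counts are simply summed at the end.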

  14. Analysis of the production of U3O8 powder for low enrichment fuel plates

    International Nuclear Information System (INIS)

    Boero, N.L.; Celora, J.; Parodi, C.A.; Ponieman, G.; Kellner, M.; Marajofsky, A.


    The processes used in the production of U3O8 powder for low-enrichment fuel plates for research reactor fuel elements are described. The analysis of the efficiency of each batch is focused on the relationship between milling and sieving times and the morphology of the product in each production step. (Author)

  15. Cluster Analysis as an Analytical Tool of Population Policy

    Directory of Open Access Journals (Sweden)

    Oksana Mikhaylovna Shubat


    The predicted negative trends in Russian demography (falling birth rates, population decline) make it more pressing to strengthen family and population policy measures. Our research purpose is to identify groups of Russian regions with similar characteristics in the family sphere using cluster analysis. The findings should make an important contribution to the field of family policy. We used hierarchical cluster analysis based on the Ward method and the Euclidean distance to segment Russian regions. Clustering is based on four variables that allow the family institution in a region to be assessed. The authors used data from the Federal State Statistics Service for 2010 to 2015. Clustering and profiling of each segment made it possible to form a model of Russian regions according to the features of the family institution in these regions. The authors revealed four clusters grouping regions with similar problems in the family sphere. This segmentation makes it possible to develop the most relevant family policy measures for each group of regions. Thus, the analysis has shown a high degree of differentiation of the family institution across the regions. This suggests that a unified approach to solving population problems is far from effective. To achieve greater results in the implementation of family policy, a differentiated approach is needed. Methods of multidimensional data classification can be successfully applied as a relevant analytical toolkit. Further research could adapt multidimensional classification methods to the analysis of population problems in Russian regions. In particular, the algorithms of nonparametric cluster analysis may be of relevance in future studies.

  16. Automated analysis of organic particles using cluster SIMS

    Energy Technology Data Exchange (ETDEWEB)

    Gillen, Greg; Zeissler, Cindy; Mahoney, Christine; Lindstrom, Abigail; Fletcher, Robert; Chi, Peter; Verkouteren, Jennifer; Bright, David; Lareau, Richard T.; Boldman, Mike


    Cluster primary ion bombardment combined with secondary ion imaging is used on an ion microscope secondary ion mass spectrometer for the spatially resolved analysis of organic particles on various surfaces. Compared to the use of monoatomic primary ion beam bombardment, the use of a cluster primary ion beam (SF5+ or C8-) provides significant improvement in molecular ion yields and a reduction in beam-induced degradation of the analyte molecules. These characteristics of cluster bombardment, along with automated sample stage control and custom image analysis software, are utilized to rapidly characterize the spatial distribution of trace explosive particles, narcotics and inkjet-printed microarrays on a variety of surfaces.

  17. Assessment of surface water quality using hierarchical cluster analysis

    Directory of Open Access Journals (Sweden)

    Dheeraj Kumar Dabgerwal


    This study was carried out to assess the physicochemical quality of the river Varuna in Varanasi, India. Water samples were collected from 10 sites during January-June 2015. Pearson correlation analysis was used to assess the direction and strength of the relationships between physicochemical parameters. Hierarchical cluster analysis was also performed to determine the sources of pollution in the river Varuna. The results showed quite high values of DO, nitrate, BOD, COD and total alkalinity, above the BIS permissible limits. The results of the correlation analysis identified pH, electrical conductivity, total alkalinity and nitrate as key water parameters that influence the concentration of other water parameters. Cluster analysis identified three major clusters of sampling sites out of the total of 10 sites, according to the similarity in water quality. This study illustrated the usefulness of correlation and cluster analysis for getting better information about river water quality. International Journal of Environment Vol. 5 (1), 2016, pp. 32-44

  18. application of single-linkage clustering method in the analysis of ...

    African Journals Online (AJOL)


    ANALYSIS OF GROWTH RATE OF GROSS DOMESTIC PRODUCT. (GDP) AT ... The end result of the algorithm is a tree of clusters called a dendrogram, which shows how the clusters are ..... Number of cluster sum from from observations of ...

  19. Cluster Analysis of Clinical Data Identifies Fibromyalgia Subgroups (United States)

    Docampo, Elisa; Collado, Antonio; Escaramís, Geòrgia; Carbonell, Jordi; Rivera, Javier; Vidal, Javier; Alegre, José


    Introduction: Fibromyalgia (FM) is mainly characterized by widespread pain and multiple accompanying symptoms, which hinder FM assessment and management. In order to reduce FM heterogeneity we classified clinical data into simplified dimensions that were used to define FM subgroups. Material and Methods: 48 variables were evaluated in 1,446 Spanish FM cases fulfilling 1990 ACR FM criteria. A partitioning analysis was performed to find groups of variables similar to each other. Similarities between variables were identified and the variables were grouped into dimensions. This was performed in a subset of 559 patients, and cross-validated in the remaining 887 patients. For each sample and dimension, a composite index was obtained based on the weights of the variables included in the dimension. Finally, a clustering procedure was applied to the indexes, resulting in FM subgroups. Results: Variables clustered into three independent dimensions: “symptomatology”, “comorbidities” and “clinical scales”. Only the first two dimensions were considered for the construction of FM subgroups. Resulting scores classified FM samples into three subgroups: low symptomatology and comorbidities (Cluster 1), high symptomatology and comorbidities (Cluster 2), and high symptomatology but low comorbidities (Cluster 3), showing differences in measures of disease severity. Conclusions: We have identified three subgroups of FM samples in a large cohort of FM by clustering clinical data. Our analysis stresses the importance of family and personal history of FM comorbidities. Also, the resulting patient clusters could indicate different forms of the disease, relevant to future research, and might have an impact on clinical assessment. PMID:24098674

  20. Transcriptional analysis of ESAT-6 cluster 3 in Mycobacterium smegmatis

    Directory of Open Access Journals (Sweden)

    Riccardi Giovanna


    Background: The ESAT-6 (early secreted antigenic target, 6 kDa) family collects small mycobacterial proteins secreted by Mycobacterium tuberculosis, particularly in the early phase of growth. There are 23 ESAT-6 family members in M. tuberculosis H37Rv. In a previous work, we identified the Zur-dependent regulation of five proteins of the ESAT-6/CFP-10 family (esxG, esxH, esxQ, esxR, and esxS). esxG and esxH are part of ESAT-6 cluster 3, whose expression was already known to be induced by iron starvation. Results: In this research, we performed EMSA experiments and transcriptional analysis of ESAT-6 cluster 3 in Mycobacterium smegmatis (msmeg0615-msmeg0625) and M. tuberculosis. In contrast to what we had observed in M. tuberculosis, we found that in M. smegmatis ESAT-6 cluster 3 responds only to iron and not to zinc. In both organisms we identified an internal promoter, a finding which suggests the presence of two transcriptional units and, by consequence, a differential expression of cluster 3 genes. We compared the expression of msmeg0615 and msmeg0620 in different growth and stress conditions by means of relative quantitative PCR. The expression of the msmeg0615 and msmeg0620 genes was essentially similar; they appeared to be repressed in most of the tested conditions, with the exception of acid stress (pH 4.2), where msmeg0615 was induced about 4-fold, while msmeg0620 was repressed. Analysis revealed that in acid stress conditions the M. tuberculosis rv0282 gene was also induced 3-fold, while rv0287 induction was almost insignificant. Conclusion: In contrast with what has been reported for M. tuberculosis, our results suggest that in M. smegmatis only IdeR-dependent regulation is retained, while zinc has no effect on gene expression. The role of cluster 3 in M. tuberculosis virulence is still to be defined; however, iron- and zinc-dependent expression strongly suggests that cluster 3 is highly expressed in the infective process, and that the cluster

  1. Graph analysis of cell clusters forming vascular networks (United States)

    Alves, A. P.; Mesquita, O. N.; Gómez-Gardeñes, J.; Agero, U.


    This manuscript describes the experimental observation of vasculogenesis in chick embryos by means of network analysis. The formation of the vascular network was observed in the area opaca of embryos from 40 to 55 h of development. In the area opaca, endothelial cell clusters self-organize as a primitive and approximately regular network of capillaries. The process was observed by bright-field microscopy in control embryos and in embryos treated with Bevacizumab (Avastin), an antibody that inhibits the signalling of the vascular endothelial growth factor (VEGF). The sequences of images of the vascular growth were thresholded and used to quantify the forming network in control and Avastin-treated embryos. This characterization is made by measuring vessel density, the number of cell clusters and the largest cluster density. From the original images, the topology of the vascular network was extracted and characterized by means of the usual network metrics, such as the degree distribution, average clustering coefficient, average shortest path length and assortativity, among others. This analysis makes it possible to monitor how the largest connected cluster of the vascular network evolves in time and provides quantitative evidence of the disruptive effects that Avastin has on the tree structure of vascular networks.
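
    Three of the network metrics listed above can be computed directly from an edge list. The sketch below assumes an undirected, unweighted graph with integer node labels and no self-loops; the function names and the toy edge list are hypothetical and are not taken from the paper.

```python
from collections import Counter, deque


def build_adjacency(edges):
    """Adjacency sets of an undirected graph given as (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degree_distribution(adj):
    """Map each degree value to the number of nodes having that degree."""
    return Counter(len(neighbours) for neighbours in adj.values())


def average_clustering(adj):
    """Mean local clustering coefficient; nodes of degree < 2 contribute 0."""
    total = 0.0
    for neighbours in adj.values():
        k = len(neighbours)
        if k < 2:
            continue
        links = sum(1 for u in neighbours for v in neighbours if u < v and v in adj[u])
        total += 2 * links / (k * (k - 1))
    return total / len(adj)


def largest_component_size(adj):
    """Number of nodes in the largest connected component (breadth-first search)."""
    seen, best = set(), 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:
            node = queue.popleft()
            size += 1
            for neighbour in adj[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        best = max(best, size)
    return best


# Toy network: a triangle with one dangling vessel, plus a detached segment.
toy = build_adjacency([(0, 1), (1, 2), (0, 2), (2, 3), (4, 5)])
```

    Tracking `largest_component_size` across successive frames is one simple way to follow how the largest connected cluster evolves in time.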

  2. clusters

    Indian Academy of Sciences (India)


    Sep 27, 2017 ... Author for correspondence ( MS received 15 ... lic clusters using density functional theory (DFT)-GGA of the DMOL3 package. ... In the process of geometric optimization, convergence thresholds ..... and Postgraduate Research & Practice Innovation Program of. Jiangsu Province ...

  3. clusters

    Indian Academy of Sciences (India)

    environmental as well as technical problems during fuel gas utilization. ... adsorption on some alloys of Pd, namely PdAu, PdAg ... ried out on small neutral and charged Au24,26,27, Cu,28 ... study of Zanti et al.29 on Pdn (n = 1–9) clusters.

  4. Cluster Analysis of International Information and Social Development. (United States)

    Lau, Jesus


    Analyzes information activities in relation to socioeconomic characteristics in low, middle, and highly developed economies for the years 1960 and 1977 through the use of cluster analysis. Results of data from 31 countries suggest that information development is achieved mainly by countries that have also achieved social development. (26…

  5. Making Sense of Cluster Analysis: Revelations from Pakistani Science Classes (United States)

    Pell, Tony; Hargreaves, Linda


    Cluster analysis has been applied to quantitative data in educational research over several decades and has been a feature of Maurice Galton's research in primary and secondary classrooms. It has offered potentially useful insights for teaching, yet its implications for practice are rarely implemented. It has been subject also to negative…

  6. Cluster analysis for validated climatology stations using precipitation in Mexico

    NARCIS (Netherlands)

    Bravo Cabrera, J. L.; Azpra-Romero, E.; Zarraluqui-Such, V.; Gay-García, C.; Estrada Porrúa, F.


    Annual average of daily precipitation was used to group climatological stations into clusters using the k-means procedure and principal component analysis with varimax rotation. After a careful selection of the stations deployed in Mexico since 1950, we selected 349 characterized by having 35 to 40
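
    The k-means procedure mentioned in this record can be sketched with Lloyd's algorithm. This is an illustration only, assuming each station is represented by a tuple of numeric features, with random initial centroids drawn from the data; the principal component step with varimax rotation is not reproduced, and the function names and station values are hypothetical.

```python
import random


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, n_iter=100, seed=0):
    """Lloyd's algorithm on a list of tuples; returns (labels, centroids).

    Hypothetical helper. Initial centroids are k distinct points sampled from
    the data; an empty cluster keeps its previous centroid.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids


# Invented annual-mean daily precipitation values (mm) for six stations.
stations = [(1.0,), (1.2,), (0.8,), (9.0,), (9.5,), (8.5,)]
labels, centroids = kmeans(stations, k=2)
```

    The result of k-means depends on the initial centroids, so in practice the procedure is usually repeated from several seeds and the partition with the lowest within-cluster sum of squares is kept.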

  7. A Cluster Analysis of Personality Style in Adults with ADHD (United States)

    Robin, Arthur L.; Tzelepis, Angela; Bedway, Marquita


    Objective: The purpose of this study was to use hierarchical linear cluster analysis to examine the normative personality styles of adults with ADHD. Method: A total of 311 adults with ADHD completed the Millon Index of Personality Styles, which consists of 24 scales assessing motivating aims, cognitive modes, and interpersonal behaviors. Results:…

  8. Characterization of population exposure to organochlorines: A cluster analysis application

    NARCIS (Netherlands)

    R.M. Guimarães (Raphael Mendonça); S. Asmus (Sven); A. Burdorf (Alex)


    This study aimed to show the results from a cluster analysis application in the characterization of population exposure to organochlorines through variables related to time and exposure dose. Characteristics of 354 subjects in a population exposed to organochlorine pesticides residues

  9. Robustness in cluster analysis in the presence of anomalous observations

    NARCIS (Netherlands)

    Zhuk, EE

    Cluster analysis of multivariate observations in the presence of "outliers" (anomalous observations) in a sample is studied. The expected (mean) fraction of erroneous decisions for the decision rule is computed analytically by minimizing the intraclass scatter. A robust decision rule (stable to

  10. Language Learner Motivational Types: A Cluster Analysis Study (United States)

    Papi, Mostafa; Teimouri, Yasser


    The study aimed to identify different second language (L2) learner motivational types drawing on the framework of the L2 motivational self system. A total of 1,278 secondary school students learning English in Iran completed a questionnaire survey. Cluster analysis yielded five different groups based on the strength of different variables within…

  11. Cluster analysis as a prediction tool for pregnancy outcomes. (United States)

    Banjari, Ines; Kenjerić, Daniela; Šolić, Krešimir; Mandić, Milena L


    Considering the specific physiological changes during gestation, and thinking of pregnancy as a "critical window", classifying pregnant women in early pregnancy can be considered crucial. The paper demonstrates the use of cluster analysis, a method from intelligent data mining. Cluster analysis is a statistical method that makes it possible to group individuals based on sets of identifying variables. The method was chosen to determine whether pregnant women can be classified in early pregnancy, and to analyze unknown correlations between different variables so that certain outcomes can be predicted. A total of 222 pregnant women from two general obstetric offices were recruited. The main focus was on characteristics of these women: their age, pre-pregnancy body mass index (BMI) and haemoglobin value. Cluster analysis achieved a 94.1% classification accuracy rate, with three branches or groups of pregnant women showing statistically significant correlations with pregnancy outcomes. The results show that pregnant women of both older age and higher pre-pregnancy BMI have a significantly higher incidence of delivering a baby of higher birth weight, but they gain significantly less weight during pregnancy. Their babies are also longer, and these women have a significantly higher probability of complications during pregnancy (gestosis) and of induced or caesarean delivery. We can conclude that the cluster analysis method can appropriately classify pregnant women in early pregnancy to predict certain outcomes.

  12. Performance Analysis of Unsupervised Clustering Methods for Brain Tumor Segmentation

    Directory of Open Access Journals (Sweden)

    Tushar H Jaware


    Medical image processing is the most challenging and emerging field of neuroscience. The ultimate goal of medical image analysis in brain MRI is to extract important clinical features that would improve methods of diagnosis and treatment of disease. This paper focuses on methods to detect and extract brain tumours from brain MR images. MATLAB is used to design a software tool for locating brain tumours, based on unsupervised clustering methods. The K-means clustering algorithm is implemented and tested on a database of 30 images. A performance evaluation of the unsupervised clustering methods is presented.
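
    The K-means step mentioned in this record can be sketched as follows. This is a minimal illustration in Python rather than the authors' MATLAB tool: the function name `kmeans_1d`, the deterministic initialisation and the grey-level values are all assumptions made for the example, and real MR images would need preprocessing that the abstract does not describe.

```python
def kmeans_1d(values, k, iterations=50):
    """Plain k-means (Lloyd's algorithm) on scalar intensities."""
    ordered = sorted(values)
    # Deterministic start (an assumption): spread the centres across the sorted data.
    centres = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centre.
        labels = [min(range(k), key=lambda c: abs(v - centres[c])) for v in values]
        # Update step: each centre moves to the mean of its members.
        new_centres = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centres.append(sum(members) / len(members) if members else centres[c])
        if new_centres == centres:
            break  # converged: no centre moved
        centres = new_centres
    return centres, labels


# Hypothetical grey levels: dark background, mid-grey tissue, bright region.
pixels = [12, 15, 11, 90, 95, 88, 200, 210, 205]
centres, labels = kmeans_1d(pixels, k=3)
```

    In a segmentation setting the label of each pixel would be mapped back to its image position, and the cluster with the brightest centre would be one candidate for the region of interest.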

  13. Identifying clinical course patterns in SMS data using cluster analysis

    DEFF Research Database (Denmark)

    Kent, Peter; Kongsted, Alice


    ABSTRACT: BACKGROUND: Recently, there has been interest in using the short message service (SMS or text messaging), to gather frequent information on the clinical course of individual patients. One possible role for identifying clinical course patterns is to assist in exploring clinically important...... showed that clinical course patterns can be identified by cluster analysis using all SMS time points as cluster variables. This method is simple, intuitive and does not require a high level of statistical skill. However, there are alternative ways of managing SMS data and many different methods...

  14. Outcome-Driven Cluster Analysis with Application to Microarray Data.

    Directory of Open Access Journals (Sweden)

    Jessie J Hsu

    One goal of cluster analysis is to sort characteristics into groups (clusters) so that those in the same group are more highly correlated to each other than they are to those in other groups. An example is the search for groups of genes whose expression of RNA is correlated in a population of patients. These genes would be of greater interest if their common level of RNA expression were additionally predictive of the clinical outcome. This issue arose in the context of a study of trauma patients on whom RNA samples were available. The question of interest was whether there were groups of genes that were behaving similarly, and whether each gene in the cluster would have a similar effect on who would recover. For this, we develop an algorithm to simultaneously assign characteristics (genes) into groups of highly correlated genes that have the same effect on the outcome (recovery). We propose a random effects model where the genes within each group (cluster) equal the sum of a random effect, specific to the observation and cluster, and an independent error term. The outcome variable is a linear combination of the random effects of each cluster. To fit the model, we implement a Markov chain Monte Carlo algorithm based on the likelihood of the observed data. We evaluate the effect of including outcome in the model through simulation studies and describe a strategy for prediction. These methods are applied to trauma data from the Inflammation and Host Response to Injury research program, revealing a clustering of the genes that are informed by the recovery outcome.
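
    One way to write down the model described in this record is sketched below. The notation is an assumption introduced here for illustration (observation i, gene j, cluster assignment k(j), K clusters); the abstract does not give the distributions of the terms, nor whether the outcome equation includes an intercept, a residual or a link function.

```latex
\begin{align}
  x_{ij} &= u_{i,k(j)} + \varepsilon_{ij}
    && \text{(expression of gene $j$ in observation $i$)} \\
  y_i    &= \sum_{k=1}^{K} \beta_k \, u_{ik}
    && \text{(outcome as a linear combination of cluster effects)}
\end{align}
```

    Here u_{ik} is the random effect shared by all genes assigned to cluster k in observation i, and ε_{ij} is the independent error term. Genes in the same cluster are correlated through the shared u_{ik} and enter the outcome through the same coefficient β_k.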

  15. Measurement system analysis (MSA) of the isotopic ratio for uranium isotope enrichment process control

    Energy Technology Data Exchange (ETDEWEB)

    Medeiros, Josue C. de; Barbosa, Rodrigo A.; Carnaval, Joao Paulo R. [Industrias Nucleares do Brasil (INB), Rezende, RJ (Brazil)]


    Currently, one of the stages in nuclear fuel cycle development is the process of uranium isotope enrichment, which will provide the amount of low enriched uranium for nuclear fuel production to supply 100% of Angra 1 and 20% of Angra 2 demand. Determination of the isotopic ratio n(²³⁵U)/n(²³⁸U) in uranium hexafluoride (UF₆, used as process gas) is essential in order to control the enrichment process of isotopic separation by gaseous centrifugation cascades. The uranium hexafluoride process is performed by continuously feeding gas into a separation unit that uses the centrifugal force principle, establishing a density gradient in a gas containing components of different molecular weights. The elemental separation effect occurs in a single ultracentrifuge and results in a partial separation of the feed into two fractions: one enriched (product) and another depleted (waste) in the desired isotope (²³⁵UF₆). Industrias Nucleares do Brasil (INB) has used quadrupole mass spectrometry (QMS) by electron impact (EI) to perform isotopic ratio n(²³⁵U)/n(²³⁸U) analysis in the process. Decisions on adjustments and changes to the input variables are based on the results of these analyses. A study of stability, bias and linearity determination has been performed in order to evaluate the applied method, variations and systematic errors in the measurement system. Minitab 15 was the software used for the analyses above. (author)

  16. High-dimensional cluster analysis with the Masked EM Algorithm (United States)

    Kadir, Shabnam N.; Goodman, Dan F. M.; Harris, Kenneth D.


    Cluster analysis faces two problems in high dimensions: first, the “curse of dimensionality” that can lead to overfitting and poor generalization performance; and second, the sheer time taken for conventional algorithms to process large amounts of high-dimensional data. We describe a solution to these problems, designed for the application of “spike sorting” for next-generation high channel-count neural probes. In this problem, only a small subset of features provide information about the cluster membership of any one data vector, but this informative feature subset is not the same for all data points, rendering classical feature selection ineffective. We introduce a “Masked EM” algorithm that allows accurate and time-efficient clustering of up to millions of points in thousands of dimensions. We demonstrate its applicability to synthetic data, and to real-world high-channel-count spike sorting data. PMID:25149694

  17. A cluster analysis investigation of workaholism as a syndrome. (United States)

    Aziz, Shahnaz; Zickar, Michael J


    Workaholism has been conceptualized as a syndrome although there have been few tests that explicitly consider its syndrome status. The authors analyzed a three-dimensional scale of workaholism developed by Spence and Robbins (1992) using cluster analysis. The authors identified three clusters of individuals, one of which corresponded to Spence and Robbins's profile of the workaholic (high work involvement, high drive to work, low work enjoyment). Consistent with previously conjectured relations with workaholism, individuals in the workaholic cluster were more likely to label themselves as workaholics, more likely to have acquaintances label them as workaholics, and more likely to have lower life satisfaction and higher work-life imbalance. The importance of considering workaholism as a syndrome and the implications for effective interventions are discussed. Copyright 2006 APA.

  18. Cosmological analysis of galaxy clusters surveys in X-rays

    International Nuclear Information System (INIS)

    Clerc, N.


    Clusters of galaxies are the most massive objects in equilibrium in our Universe. Their study allows cosmological scenarios of structure formation to be tested with precision, bringing constraints complementary to those stemming from the cosmological background radiation, supernovae or galaxies. They are identified through the X-ray emission of their heated gas, thus facilitating their mapping at different epochs of the Universe. This report presents two surveys of galaxy clusters detected in X-rays and puts forward a method for their cosmological interpretation. Thanks to its multi-wavelength coverage extending over 10 sq. deg. and after one decade of expertise, the XMM-LSS allows a systematic census of clusters in a large volume of the Universe. In the framework of this survey, the first part of this report describes the techniques developed for the purpose of characterizing the detected objects. A particular emphasis is placed on the most distant ones (z ≥ 1) through the complementarity of observations in X-ray, optical and infrared bands. Then the X-CLASS survey is fully described. Based on XMM archival data, it provides a new catalogue of 800 clusters detected in X-rays. A cosmological analysis of this survey is performed thanks to 'CR-HR' diagrams. This new method self-consistently includes selection effects and scaling relations and provides a means to bypass the computation of individual cluster masses. Propositions are made for applying this method to future surveys such as XMM-XXL and eRosita. (author) [fr

  19. Cluster analysis by optimal decomposition of induced fuzzy sets

    Energy Technology Data Exchange (ETDEWEB)

    Backer, E


    Nonsupervised pattern recognition is addressed and the concept of fuzzy sets is explored in order to provide the investigator (data analyst) additional information supplied by the pattern class membership values apart from the classical pattern class assignments. The basic ideas behind the pattern recognition problem, the clustering problem, and the concept of fuzzy sets in cluster analysis are discussed, and a brief review of the literature of the fuzzy cluster analysis is given. Some mathematical aspects of fuzzy set theory are briefly discussed; in particular, a measure of fuzziness is suggested. The optimization-clustering problem is characterized. Then the fundamental idea behind affinity decomposition is considered. Next, further analysis takes place with respect to the partitioning-characterization functions. The iterative optimization procedure is then addressed. The reclassification function is investigated and convergence properties are examined. Finally, several experiments in support of the method suggested are described. Four object data sets serve as appropriate test cases. 120 references, 70 figures, 11 tables. (RWR)

  20. Integrated Information Technology Framework for Analysis of Data from Enrichment Plants to Support the Safeguards Mission

    International Nuclear Information System (INIS)

    Marr, Clifton T.; Thurman, David A.; Jorgensen, Bruce V.


    Many examples of software architectures exist that support process monitoring and analysis applications which could be applied to enrichment plants in a fashion that supports the Safeguards Mission. Pacific Northwest National Laboratory (PNNL) has developed mature solutions that will provide the framework to support online statistical analysis of enrichment plants and the entire nuclear fuel cycle. Most recently, PNNL has developed a refined architecture and supporting tools that address many of the common problems analysis and modeling environments experience: pipelining, handling large data volumes, and real-time performance. We propose that the architecture and tools may be successfully used in furthering the goals of nuclear material control and accountability, both as an aid to processing plant owners and as comprehensive monitoring for oversight teams.

  1. Safeguarding uranium enrichment facilities. Review and analysis of the status of safeguards technology for uranium enrichment facilities

    International Nuclear Information System (INIS)


    The objective of this paper is to examine critically the diversion potential at uranium enrichment facilities and to outline a basic safeguards strategy which counters all identified hazards as completely as possible yet with a minimum of non-essential redundancy. Where existing technology does not appear to be adequate for effective safeguards, the limitations are examined, and suggestions for further R and D effort are made. Parts of this report are generally applicable to all currently known enrichment processes, while other parts are specifically directed toward facilities based on the gas centrifuge process. It is hoped that additional sections discussing a safeguards strategy for gas diffusion facilities can be added later. It should be emphasized that this is a technical report, and does not reflect any legal positions. The safeguards strategy and subsequent inspection procedures are intended as guidelines, not as negotiating positions

  2. DGA Clustering and Analysis: Mastering Modern, Evolving Threats, DGALab

    Directory of Open Access Journals (Sweden)

    Alexander Chailytko


    Domain Generation Algorithms (DGA) are a basic building block used in almost all modern malware. Malware researchers have attempted to tackle the DGA problem with various tools and techniques, with varying degrees of success. We present a complex solution to populate DGA feed using reversed DGAs, third-party feeds, and a smart DGA extraction and clustering based on emulation of a large number of samples. Smart DGA extraction requires no reverse engineering and works regardless of the DGA type or initialization vector, while enabling a cluster-based analysis. Our method also automatically allows analysis of the whole malware family, specific campaign, etc. We present our system and demonstrate its abilities on more than 20 malware families. This includes showing connections between different campaigns, as well as comparing results. Most importantly, we discuss how to utilize the outcome of the analysis to create smarter protections against similar malware.

  3. Analysis of RXTE data on Clusters of Galaxies (United States)

    Petrosian, Vahe


    This grant provided support for the reduction, analysis and interpretation of hard X-ray (HXR, for short) observations of the cluster of galaxies RXJ0658-5557 scheduled for the week of August 23, 2002 under the RXTE Cycle 7 program (PI Vahe Petrosian, Obs. ID 70165). The goal of the observation was to search for and characterize the shape of the HXR component beyond the well established thermal soft X-ray (SXR) component. Such hard components have been detected in several nearby clusters. The planned observation of a relatively distant cluster would provide information on the characteristics of this radiation at a different epoch in the evolution of the universe and shed light on its origin. We (Petrosian, 2001) have argued that thermal bremsstrahlung, as proposed earlier, cannot be the mechanism for the production of the HXRs and that the most likely mechanism is Compton upscattering of the cosmic microwave radiation by relativistic electrons, which are known to be present in the clusters and to be responsible for the observed radio emission. Based on this picture we estimated that this cluster, in spite of its relatively large distance, will have an HXR signal comparable to the other nearby ones. The proposed RXTE observations were carried out and the data have been analyzed. We detect a hard X-ray tail in the spectrum of this cluster with a flux very nearly equal to our predicted value. This has strengthened the case for the Compton scattering model. We intend the data obtained via this observation to be part of a larger data set. We have identified other clusters of galaxies (in archival RXTE and other instrument data sets) with sufficiently high quality data where we can search for and measure (or at least put meaningful limits on) the strength of the hard component. With these studies we expect to clarify the mechanism for acceleration of particles in the intercluster medium and provide guidance for future observations of this intriguing phenomenon by instrument

  4. Electrochemical and genomic analysis of novel electroactive isolates obtained via potentiostatic enrichment from tropical sediment (United States)

    Doyle, Lucinda E.; Yung, Pui Yi; Mitra, Sumitra D.; Wuertz, Stefan; Williams, Rohan B. H.; Lauro, Federico M.; Marsili, Enrico


    Enrichment of electrochemically-active microorganisms (EAM) to date has mostly relied on microbial fuel cells fed with wastewater. This study aims to enrich novel EAM by exposing tropical sediment, not frequently reported in the literature, to sustained anodic potentials. Voltamperometric techniques and electrochemical impedance spectroscopy, performed over a wide range of potentials, characterise extracellular electron transfer (EET) over time. Applied potential is found to affect biofilm electrochemical signature. Geobacter metallireducens is heavily enriched on the electrodes, as determined by metagenomic and metatranscriptomic analysis, in the first report of the species in a lactate-fed system. Two novel isolates are grown in pure culture from the enrichment, identified by 16S rRNA gene sequencing as Aeromonas and Enterobacter, respectively. The names proposed are Aeromonas sp. CL-1 and Enterobacter sp. EA-1. Both isolates are capable of EET on carbon felt and screen-printed carbon electrodes without the addition of exogenous redox mediators. Enterobacter sp. EA-1 can also perform mediated electron transfer using the soluble redox mediator 2-hydroxy-1,4-naphthoquinone (HNQ). Both isolates are able to use acetate and lactate as electron donors. This work outlines a comprehensive methodology for characterising novel EAM from unconventional inocula.

  5. Constraints on the Parental Melts of Enriched Shergottites from Image Analysis and High Pressure Experiments (United States)

    Collinet, M.; Medard, E.; Devouard, B.; Peslier, A.


    Martian basalts can be classified in at least two geochemically different families: enriched and depleted shergottites. Enriched shergottites are characterized by higher incompatible element concentrations and initial Sr-87/Sr-86 and lower initial Nd-143/Nd-144 and Hf-176/Hf-177 than depleted shergottites [e.g. 1, 2]. It is now generally admitted that shergottites result from the melting of at least two distinct mantle reservoirs [e.g. 2, 3]. Some of the olivine-phyric shergottites (either depleted or enriched), the most magnesian Martian basalts, could represent primitive melts, which are of considerable interest to constrain mantle sources. Two depleted olivine-phyric shergottites, Yamato (Y) 980459 and Northwest Africa (NWA) 5789, are in equilibrium with their most magnesian olivine (Fig. 1) and their bulk rock compositions are inferred to represent primitive melts [4, 5]. Larkman Nunatak (LAR) 06319 [3, 6, 7] and NWA 1068 [8], the most magnesian enriched basalts, have bulk Mg# that are too high to be in equilibrium with their olivine megacryst cores. Parental melt compositions have been estimated by subtracting the most magnesian olivine from the bulk rock composition, assuming that olivine megacrysts have partially accumulated [3, 9]. However, because this technique does not account for the actual petrography of these meteorites, we used image analysis to study the history of these rocks, reconstruct their parent magma and understand the nature of the olivine megacrysts.

  6. Mobility in Europe: Recent Trends from a Cluster Analysis

    Directory of Open Access Journals (Sweden)

    Ioana Manafi


    During the past decade, Europe was confronted with major changes and events offering large opportunities for mobility. The EU enlargement process, the EU policies regarding youth, the economic crisis affecting national economies on different levels, political instabilities in some European countries, high rates of unemployment or the increasing number of refugees are only a few of the factors influencing net migration in Europe. Based on a set of socio-economic indicators for EU/EFTA countries and cluster analysis, the paper provides an overview of regional differences across European countries, related to migration magnitude in the identified clusters. The obtained clusters are in accordance with previous studies in migration, and appear stable during the period of 2005-2013, with only some exceptions. The analysis revealed three country clusters: EU/EFTA center-receiving countries, EU/EFTA periphery-sending countries and EU/EFTA outlier countries, the names suggesting not only the geographical position within Europe, but also the trends in net migration flows over the years. The results thus provide evidence for the persistence of a movement from periphery to center countries, which is correlated with recent flows of mobility in Europe.

  7. Full text clustering and relationship network analysis of biomedical publications.

    Directory of Open Access Journals (Sweden)

    Renchu Guan

    Rapid developments in the biomedical sciences have increased the demand for automatic clustering of biomedical publications. In contrast to current approaches to text clustering, which focus exclusively on the contents of abstracts, a novel method is proposed for clustering and analysis of complete biomedical article texts. To reduce dimensionality, Cosine Coefficient is used on a sub-space of only two vectors, instead of computing the Euclidean distance within the space of all vectors. Then a strategy and algorithm is introduced for Semi-supervised Affinity Propagation (SSAP) to improve analysis efficiency, using biomedical journal names as an evaluation background. Experimental results show that by avoiding high-dimensional sparse matrix computations, SSAP outperforms conventional k-means methods and improves upon the standard Affinity Propagation algorithm. In constructing a directed relationship network and distribution matrix for the clustering results, it can be noted that overlaps in scope and interests among BioMed publications can be easily identified, providing a valuable analytical tool for editors, authors and readers.
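
    The idea of comparing two documents only in the sub-space spanned by their own terms can be sketched as below. This is a hedged illustration, not the authors' implementation: the function name `cosine_coefficient`, the whitespace tokenisation and the raw term-frequency weighting are assumptions, since the abstract does not specify them.

```python
import math
from collections import Counter


def cosine_coefficient(text_a, text_b):
    """Cosine similarity of two texts using only the terms they contain."""
    # Term-frequency vectors restricted to each document's own vocabulary,
    # so no corpus-wide sparse matrix is ever built.
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    # Only shared terms contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as dissimilar to everything
    return dot / (norm_a * norm_b)


# Hypothetical snippets standing in for full article texts.
same = cosine_coefficient("gene cluster analysis", "gene cluster analysis")
disjoint = cosine_coefficient("gene expression", "protein folding")
```

    Identical texts score close to 1 and texts with no shared terms score 0. A pairwise similarity of this kind is the sort of input that Affinity Propagation style algorithms take.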

  8. The Productivity Analysis of Chennai Automotive Industry Cluster (United States)

    Bhaskaran, E.


    Chennai, also called the Detroit of India, is India's second fastest growing auto market and exports auto components and vehicles to the US, Germany, Japan and Brazil. For inclusive growth and sustainable development, 250 auto component industries in Ambattur, Thirumalisai and Thirumudivakkam Industrial Estates located in Chennai have adopted the Cluster Development Approach called Automotive Component Cluster. The objective is to study the Value Chain, Correlation and Data Envelopment Analysis by determining technical efficiency, peer weights, and input and output slacks of 100 auto component industries in three estates. The methodology adopted is Data Envelopment Analysis with the Output Oriented Banker Charnes Cooper model, taking net worth, fixed assets and employment as inputs and gross output as the output. The non-zero values represent the weights for efficient clusters. The higher slack obtained reveals the excess net worth, fixed assets and employment, and the shortage in gross output. To conclude, the variables are highly correlated and the inefficient industries should increase their gross output or decrease their fixed assets or employment. Moreover, for sustainable development, the cluster should strengthen infrastructure, technology, procurement, production and marketing interrelationships to decrease costs and to increase productivity and efficiency to compete in the indigenous and export markets.
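
    For reference, the textbook output-oriented Banker-Charnes-Cooper (BCC) envelopment problem for one unit under evaluation (index 0) is sketched below. The symbols are assumptions chosen here: x_{ij} is input i (net worth, fixed assets, employment) of unit j, y_j is its gross output, and n is the number of units. The abstract does not state the exact specification the author solved.

```latex
\begin{aligned}
\max_{\phi,\;\lambda}\quad & \phi \\
\text{s.t.}\quad
  & \sum_{j=1}^{n} \lambda_j \, x_{ij} \le x_{i0}, \qquad i = 1, 2, 3, \\
  & \sum_{j=1}^{n} \lambda_j \, y_{j} \ge \phi \, y_{0}, \\
  & \sum_{j=1}^{n} \lambda_j = 1, \qquad \lambda_j \ge 0 .
\end{aligned}
```

    Under this formulation the technical efficiency of the unit is 1/φ*, so φ* = 1 marks an efficient unit. The non-zero λ_j are the peer weights, and the gaps left in the input and output constraints at the optimum are the input and output slacks.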

  9. Sirenomelia in Argentina: Prevalence, geographic clusters and temporal trends analysis. (United States)

    Groisman, Boris; Liascovich, Rosa; Gili, Juan Antonio; Barbero, Pablo; Bidondo, María Paz


    Sirenomelia is a severe malformation of the lower body characterized by a single medial lower limb and a variable combination of visceral abnormalities. Given that sirenomelia is a very rare birth defect, epidemiological studies are scarce. The aim of this study is to evaluate prevalence, geographic clusters and time trends of sirenomelia in Argentina, using data from the National Network of Congenital Anomalies of Argentina (RENAC) from November 2009 until December 2014. This is a descriptive study using data from the RENAC, a hospital-based surveillance system for newborns affected with major morphological congenital anomalies. We calculated sirenomelia prevalence throughout the period, searched for geographical clusters, and evaluated time trends. The prevalence of confirmed cases of sirenomelia throughout the period was 2.35 per 100,000 births. Cluster analysis showed no statistically significant geographical aggregates. Time-trends analysis showed that the prevalence was higher in years 2009 to 2010. The observed prevalence was higher than that observed in previous epidemiological studies in other geographic regions. We observed a likely real increase in the initial period of our study. We used strict diagnostic criteria, excluding cases that only had clinical diagnosis of sirenomelia. Therefore, real prevalence could be even higher. This study did not show any geographic clusters. Because etiology of sirenomelia has not yet been established, studies of epidemiological features of this defect may contribute to define its causes. Birth Defects Research (Part A) 106:604-611, 2016. © 2016 Wiley Periodicals, Inc.
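
    The prevalence figure quoted in this record is a rate per 100,000 births, which can be illustrated with a short calculation. The case and birth counts below are hypothetical values chosen only so that the arithmetic reproduces a rate of 2.35; they are not taken from the study, and the function name `prevalence_per_100k` is likewise an assumption.

```python
def prevalence_per_100k(cases, births):
    """Birth prevalence expressed per 100,000 births."""
    if births <= 0:
        raise ValueError("births must be positive")
    return cases / births * 100_000


# Hypothetical counts, NOT from the RENAC data set.
example_cases = 47
example_births = 2_000_000
rate = prevalence_per_100k(example_cases, example_births)
```

    With these made-up counts the rate is about 2.35 per 100,000 births.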

  10. Transcriptional analysis of exopolysaccharides biosynthesis gene clusters in Lactobacillus plantarum. (United States)

    Vastano, Valeria; Perrone, Filomena; Marasco, Rosangela; Sacco, Margherita; Muscariello, Lidia


    Exopolysaccharides (EPS) from lactic acid bacteria contribute to specific rheology and texture of fermented milk products and find applications also in non-dairy foods and in therapeutics. Recently, four clusters of genes (cps) associated with surface polysaccharide production have been identified in Lactobacillus plantarum WCFS1, a probiotic and food-associated lactobacillus. These clusters are involved in cell surface architecture and probably in release and/or exposure of immunomodulating bacterial molecules. Here we show a transcriptional analysis of these clusters. Indeed, RT-PCR experiments revealed that the cps loci are organized in five operons. Moreover, by reverse transcription-qPCR analysis performed on L. plantarum WCFS1 (wild type) and WCFS1-2 (ΔccpA), we demonstrated that expression of three cps clusters is under the control of the global regulator CcpA. These results, together with the identification of putative CcpA target sequences (catabolite responsive element CRE) in the regulatory region of four out of five transcriptional units, strongly suggest for the first time a role of the master regulator CcpA in EPS gene transcription among lactobacilli.

  11. Full text clustering and relationship network analysis of biomedical publications. (United States)

    Guan, Renchu; Yang, Chen; Marchese, Maurizio; Liang, Yanchun; Shi, Xiaohu


    Rapid developments in the biomedical sciences have increased the demand for automatic clustering of biomedical publications. In contrast to current approaches to text clustering, which focus exclusively on the contents of abstracts, a novel method is proposed for clustering and analysis of complete biomedical article texts. To reduce dimensionality, Cosine Coefficient is used on a sub-space of only two vectors, instead of computing the Euclidean distance within the space of all vectors. Then a strategy and algorithm is introduced for Semi-supervised Affinity Propagation (SSAP) to improve analysis efficiency, using biomedical journal names as an evaluation background. Experimental results show that by avoiding high-dimensional sparse matrix computations, SSAP outperforms conventional k-means methods and improves upon the standard Affinity Propagation algorithm. In constructing a directed relationship network and distribution matrix for the clustering results, it can be noted that overlaps in scope and interests among BioMed publications can be easily identified, providing a valuable analytical tool for editors, authors and readers.

  12. Latent cluster analysis of ALS phenotypes identifies prognostically differing groups.

    Directory of Open Access Journals (Sweden)

    Jeban Ganesalingam


    Amyotrophic lateral sclerosis (ALS) is a degenerative disease predominantly affecting motor neurons and manifesting as several different phenotypes. Whether these phenotypes correspond to different underlying disease processes is unknown. We used latent cluster analysis to identify groupings of clinical variables in an objective and unbiased way to improve phenotyping for clinical and research purposes. Latent class cluster analysis was applied to a large database consisting of 1467 records of people with ALS, using discrete variables which can be readily determined at the first clinic appointment. The model was tested for clinical relevance by survival analysis of the phenotypic groupings using the Kaplan-Meier method. The best model generated five distinct phenotypic classes that strongly predicted survival (p<0.0001). Eight variables were used for the latent class analysis, but a good estimate of the classification could be obtained using just two variables: site of first symptoms (bulbar or limb) and time from symptom onset to diagnosis (p<0.00001). The five phenotypic classes identified using latent cluster analysis can predict prognosis. They could be used to stratify patients recruited into clinical trials and to generate more homogeneous disease groups for genetic, proteomic and risk factor research.
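
    The Kaplan-Meier method used here to compare the phenotypic classes can be sketched for a single group as follows. This is a minimal product-limit estimator under stated assumptions: the function name `kaplan_meier`, the follow-up times and the censoring flags are invented for illustration and do not come from the study.

```python
def kaplan_meier(durations, observed):
    """Return [(time, survival probability)] at each time with an event."""
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(set(durations)):
        # Events (deaths) and total records leaving the risk set at time t.
        deaths = sum(1 for d, e in zip(durations, observed) if d == t and e)
        leaving = sum(1 for d in durations if d == t)
        if deaths:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Censored records leave the risk set without changing the estimate.
        at_risk -= leaving
    return curve


# Hypothetical follow-up in months; False marks a censored record.
months = [6, 12, 12, 20, 30]
observed = [True, True, False, True, False]
curve = kaplan_meier(months, observed)
```

    Running the estimator once per phenotypic class yields one survival curve per class. Testing whether the curves differ, as the reported p-values imply, requires a separate test such as the log-rank test, which is not shown.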

  13. The Quantitative Analysis of Chennai Automotive Industry Cluster (United States)

    Bhaskaran, Ethirajan


    Chennai is also called the Detroit of India because its automotive industry produces over 40% of India's vehicles and components. During 2001-2002, the Automotive Component Industries (ACI) in the Ambattur, Thirumalizai and Thirumudivakkam Industrial Estates, Chennai, faced problems with infrastructure, technology, procurement, production and marketing. The objective is to study the quantitative performance of the Chennai Automotive Industry Cluster before (2001-2002) and after the CDA (2008-2009). The methodology adopted is collection of primary data from 100 ACI using a quantitative questionnaire and analysis using Correlation Analysis (CA), Regression Analysis (RA), the Friedman Test (FMT), and the Kruskal-Wallis Test (KWT). The CA computed for the different sets of variables reveals a high degree of relationship between the variables studied. The RA models constructed establish a strong relationship between the dependent variable and a host of independent variables. The models proposed here reveal the approximate relationship in a closer form. The KWT shows no significant difference between the three location clusters with respect to Net Profit, Production Cost, Marketing Costs, Procurement Costs and Gross Output. This supports the view that each location has contributed uniformly to the development of the automobile component cluster. The FMT shows no significant difference between industrial units in respect of costs such as Production, Infrastructure, Technology, Marketing and Net Profit. To conclude, the Automotive Industries have fully utilized the Physical Infrastructure and Centralised Facilities by adopting the CDA and are now exporting their products to North America, South America, Europe, Australia, Africa and Asia. The value chain analysis models have been implemented in all the cluster units. This Cluster Development Approach (CDA) model can be implemented in industries of under developed and developing countries for cost reduction and productivity

  14. Applications of Cluster Analysis to the Creation of Perfectionism Profiles: A Comparison of two Clustering Approaches

    Directory of Open Access Journals (Sweden)

    Jocelyn H Bolin


    Full Text Available Although traditional clustering methods (e.g., K-means) have been shown to be useful in the social sciences it is often difficult for such methods to handle situations where clusters in the population overlap or are ambiguous. Fuzzy clustering, a method already recognized in many disciplines, provides a more flexible alternative to these traditional clustering methods. Fuzzy clustering differs from other traditional clustering methods in that it allows for a case to belong to multiple clusters simultaneously. Unfortunately, fuzzy clustering techniques remain relatively unused in the social and behavioral sciences. The purpose of this paper is to introduce fuzzy clustering to these audiences who are currently relatively unfamiliar with the technique. In order to demonstrate the advantages associated with this method, cluster solutions of a common perfectionism measure were created using both fuzzy clustering and K-means clustering, and the results compared. Results of these analyses reveal that different cluster solutions are found by the two methods, and the similarity between the different clustering solutions depends on the amount of cluster overlap allowed for in fuzzy clustering.

  15. Applications of cluster analysis to the creation of perfectionism profiles: a comparison of two clustering approaches. (United States)

    Bolin, Jocelyn H; Edwards, Julianne M; Finch, W Holmes; Cassady, Jerrell C


    Although traditional clustering methods (e.g., K-means) have been shown to be useful in the social sciences it is often difficult for such methods to handle situations where clusters in the population overlap or are ambiguous. Fuzzy clustering, a method already recognized in many disciplines, provides a more flexible alternative to these traditional clustering methods. Fuzzy clustering differs from other traditional clustering methods in that it allows for a case to belong to multiple clusters simultaneously. Unfortunately, fuzzy clustering techniques remain relatively unused in the social and behavioral sciences. The purpose of this paper is to introduce fuzzy clustering to these audiences who are currently relatively unfamiliar with the technique. In order to demonstrate the advantages associated with this method, cluster solutions of a common perfectionism measure were created using both fuzzy clustering and K-means clustering, and the results compared. Results of these analyses reveal that different cluster solutions are found by the two methods, and the similarity between the different clustering solutions depends on the amount of cluster overlap allowed for in fuzzy clustering.
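    To make the contrast between the two approaches concrete, the sketch below computes the membership of one case in every cluster using the standard fuzzy c-means update, next to a K-means style hard assignment. The abstract does not say which fuzzy algorithm or settings were used, so the fuzzy c-means formula, the fuzzifier `m` (which controls how much overlap is allowed) and the function names are assumptions for illustration.

```python
import math


def fuzzy_memberships(point, centers, m=2.0):
    """Fuzzy c-means membership of one point in each cluster (values sum to 1)."""
    dists = [math.dist(point, c) for c in centers]
    # A point lying exactly on a center belongs fully to that cluster.
    if 0.0 in dists:
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)  # m > 1; larger m gives softer, more overlapping clusters
    return [1.0 / sum((d_i / d_k) ** power for d_k in dists) for d_i in dists]


def hard_assignment(point, centers):
    """K-means style assignment: index of the single nearest center."""
    return min(range(len(centers)), key=lambda k: math.dist(point, centers[k]))
```

    With centers at (0, 0) and (4, 0), the point (1, 0) is assigned wholly to the first cluster by `hard_assignment`, while `fuzzy_memberships` with `m=2` splits it 0.9 to 0.1, so the case belongs to both clusters simultaneously.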

  16. An integrated microfluidic analysis microsystems with bacterial capture enrichment and in-situ impedance detection (United States)

    Liu, Hai-Tao; Wen, Zhi-Yu; Xu, Yi; Shang, Zheng-Guo; Peng, Jin-Lan; Tian, Peng


    In this paper, an integrated microfluidic analysis microsystem with bacterial capture enrichment and in-situ impedance detection is proposed, based on the microfluidic chip dielectrophoresis technique and the electrochemical impedance detection principle. The microsystem includes a microfluidic chip, a main control module, a drive and control module, a signal detection and processing module, and a result display unit. The main control module produces the work sequence of the impedance detection system parts and provides data communication functions. The drive and control circuit generates an AC signal with adjustable amplitude and frequency, which is applied to the foodborne pathogen impedance analysis microsystem to realize capture enrichment and impedance detection. The signal detection and processing circuit translates the current signal into the impedance of the bacteria and transfers it to a computer, where the final detection result is displayed. The experimental sample was prepared by adding an Escherichia coli standard sample to chicken sample solution, and the samples were tested on the dielectrophoresis chip capture enrichment and in-situ impedance detection microsystem with micro-array electrode microfluidic chips. The experiments show that the Escherichia coli detection limit of the microsystem is 5 × 10^4 CFU/mL and the detection time is within 6 min under the optimized operating conditions of 10 V detection voltage and 500 kHz detection frequency. The integrated microfluidic analysis microsystem lays a solid foundation for rapid real-time in-situ detection of bacteria.

  17. Statistical analysis of the spatial distribution of galaxies and clusters

    International Nuclear Information System (INIS)

    Cappi, Alberto


    This thesis deals with the analysis of the distribution of galaxies and clusters, describing some observational problems and statistical results. The first chapter gives a theoretical introduction, aiming to describe the framework of the formation of structures, tracing the history of the Universe from the Planck time, t_p = 10^-43 s and temperature corresponding to 10^19 GeV, to the present epoch. The most usual statistical tools and models of the galaxy distribution, with their advantages and limitations, are described in chapter two. A study of the main observed properties of galaxy clustering, together with a detailed statistical analysis of the effects of selecting galaxies according to apparent magnitude or diameter, is reported in chapter three. Chapter four delineates some properties of groups of galaxies, explaining the reasons for discrepant results on group distributions. Chapter five is a study of the distribution of galaxy clusters with different statistical tools, such as correlations, percolation, the void probability function and counts in cells; the same scale-invariant behaviour as for galaxies is found. Chapter six describes our finding that rich galaxy clusters too belong to the fundamental plane of elliptical galaxies, and gives a discussion of its possible implications. Finally, chapter seven reviews the possibilities offered by multi-slit and multi-fibre spectrographs, and I present some observational work on nearby and distant galaxy clusters. In particular, I show the opportunities offered by ongoing surveys of galaxies coupled with multi-object fibre spectrographs, focusing on the ESO Key Programme A galaxy redshift survey in the south galactic pole region, in which I collaborate, and on MEFOS, a multi-fibre instrument with automatic positioning. Published papers related to the work described in this thesis are reported in the last appendix. (author)

  18. Network Expansion and Pathway Enrichment Analysis towards Biologically Significant Findings from Microarrays

    Directory of Open Access Journals (Sweden)

    Wu Xiaogang


    Full Text Available In many cases, crucial genes show relatively slight changes between groups of samples (e.g. normal vs. disease), and many genes selected from microarray differential analysis by measuring the expression level statistically are also poorly annotated and lack biological significance. In this paper, we present an innovative approach, network expansion and pathway enrichment analysis (NEPEA), for integrative microarray analysis. We assume that organized knowledge will help microarray data analysis in significant ways, and that the organized knowledge could be represented as molecular interaction networks or biological pathways. Based on this hypothesis, we develop the NEPEA framework based on network expansion from the human annotated and predicted protein interaction (HAPPI) database, and pathway enrichment from the human pathway database (HPD). We use a recently-published microarray dataset (GSE24215) related to insulin resistance and type 2 diabetes (T2D) as a case study, since this study provided a thorough experimental validation for both genes and pathways identified computationally from classical microarray analysis and pathway analysis. We perform our NEPEA analysis for this dataset based on the results from the classical microarray analysis to identify biologically significant genes and pathways. Our findings are not only mostly consistent with the original findings, but also obtain more support from other literature.

  19. Sensory over responsivity and obsessive compulsive symptoms: A cluster analysis. (United States)

    Ben-Sasson, Ayelet; Podoly, Tamar Yonit


    Several studies have examined the sensory component in Obsessive Compulsive Disorder (OCD) and described an OCD subtype which has a unique profile, and that Sensory Phenomena (SP) is a significant component of this subtype. SP has some commonalities with Sensory Over Responsivity (SOR) and might be in part a characteristic of this subtype. Although there are some studies that have examined SOR and its relation to Obsessive Compulsive Symptoms (OCS), literature lacks sufficient data on this interplay. First to further examine the correlations between OCS and SOR, and to explore the correlations between SOR modalities (i.e. smell, touch, etc.) and OCS subscales (i.e. washing, ordering, etc.). Second, to investigate the cluster analysis of SOR and OCS dimensions in adults, that is, to classify the sample using the sensory scores to find whether a sensory OCD subtype can be specified. Our third goal was to explore the psychometric features of a new sensory questionnaire: the Sensory Perception Quotient (SPQ). A sample of non-clinical adults (n=350) was recruited via e-mail, social media and social networks. Participants completed questionnaires for measuring SOR, OCS, and anxiety. SOR and OCI-F scores were moderately significantly correlated (n=274); significant correlations between all SOR modalities and OCS subscales were found, with no specific higher correlation between one modality and one OCS subscale. Cluster analysis revealed four distinct clusters: (1) No OC and SOR symptoms (NONE; n=100), (2) High OC and SOR symptoms (BOTH; n=28), (3) Moderate OC symptoms (OCS; n=63), (4) Moderate SOR symptoms (SOR; n=83). The BOTH cluster had significantly higher anxiety levels than the other clusters, and shared OC subscales scores with the OCS cluster. The BOTH cluster also reported higher SOR scores across tactile, vision, taste and olfactory modalities. The SPQ was found reliable and suitable to detect SOR; the sample SPQ scores were normally distributed (n=350).
SOR is a

  20. Analysis of plasmaspheric plumes: CLUSTER and IMAGE observations

    Directory of Open Access Journals (Sweden)

    F. Darrouzet


    Full Text Available Plasmaspheric plumes have been routinely observed by CLUSTER and IMAGE. The CLUSTER mission provides high time resolution four-point measurements of the plasmasphere near perigee. Total electron density profiles have been derived from the electron plasma frequency identified by the WHISPER sounder supplemented, in-between soundings, by relative variations of the spacecraft potential measured by the electric field instrument EFW; ion velocity is also measured onboard these satellites. The EUV imager onboard the IMAGE spacecraft provides global images of the plasmasphere with a spatial resolution of 0.1 RE every 10 min; such images acquired near apogee from high above the pole show the geometry of plasmaspheric plumes, their evolution and motion. We present coordinated observations of three plume events and compare CLUSTER in-situ data with global images of the plasmasphere obtained by IMAGE. In particular, we study the geometry and the orientation of plasmaspheric plumes by using four-point analysis methods. We compare several aspects of plume motion as determined by different methods: (i) inner and outer plume boundary velocity calculated from time delays of this boundary as observed by the wave experiment WHISPER on the four spacecraft, (ii) drift velocity measured by the electron drift instrument EDI onboard CLUSTER and (iii) global velocity determined from successive EUV images. These different techniques consistently indicate that plasmaspheric plumes rotate around the Earth, with their foot fully co-rotating, but with their tip rotating slower and moving farther out.


    International Nuclear Information System (INIS)

    Jogesh Babu, G.; Chattopadhyay, Tanuka; Chattopadhyay, Asis Kumar; Mondal, Saptarshi


    The proper interpretation of horizontal branch (HB) morphology is crucial to the understanding of the formation history of stellar populations. In the present study a multivariate analysis (principal component analysis) is used for the selection of an appropriate HB morphology parameter, which, in our case, is the logarithm of the effective temperature extent of the HB (log T_effHB). Then this parameter is expressed in terms of the most significant observed independent parameters of Galactic globular clusters (GGCs) separately for coherent groups, obtained in a previous work, through a stepwise multiple regression technique. It is found that metallicity ([Fe/H]), central surface brightness (μ_v), and core radius (r_c) are the significant parameters to explain most of the variations in HB morphology (multiple R^2 ∼ 0.86) for GGC belonging to the bulge/disk, while metallicity ([Fe/H]) and absolute magnitude (M_v) are responsible for GGC belonging to the inner halo (multiple R^2 ∼ 0.52). The robustness is tested by taking 1000 bootstrap samples. A cluster analysis is performed for the red giant branch (RGB) stars of the GGC belonging to Galactic inner halo (Cluster 2). A multi-episodic star formation is preferred for RGB stars of GGC belonging to this group. It supports the asymptotic giant branch (AGB) model in three episodes instead of two as suggested by Carretta et al. for halo GGC while AGB model is suggested to be revisited for bulge/disk GGC.

  2. Radiological analysis of plutonium glass batches with natural/enriched boron

    International Nuclear Information System (INIS)

    Rainisch, R.


    The disposition of surplus plutonium inventories by the US Department of Energy (DOE) includes the immobilization of certain plutonium materials in a borosilicate glass matrix, also referred to as vitrification. This paper addresses source terms of plutonium masses immobilized in a borosilicate glass matrix where the glass components include both natural boron and enriched boron. The calculated source terms pertain to neutron and gamma source strength (particles per second), and source spectrum changes. The calculated source terms corresponding to natural boron and enriched boron are compared to determine the benefits (decrease in radiation source terms) from the use of enriched boron. The analysis of plutonium glass source terms shows that a large component of the neutron source terms is due to (α, n) reactions. The Americium-241 and plutonium present in the glass emit alpha particles (α). These alpha particles interact with low-Z nuclides like B-11, B-10, and O-17 in the glass to produce neutrons. The low-Z nuclides are referred to as target particles. The reference glass contains 9.4 wt percent B2O3. Boron-11 was found to strongly support the (α, n) reactions in the glass matrix. B-11 has a natural abundance of over 80 percent. The (α, n) reaction rates for B-10 are lower than for B-11 and the analysis shows that the plutonium glass neutron source terms can be reduced by artificially enriching natural boron with B-10. The natural abundance of B-10 is 19.9 percent. Boron enriched to 96-wt percent B-10 or above can be obtained commercially. Since lower source terms imply lower dose rates to radiation workers handling the plutonium glass materials, it is important to know the achievable decrease in source terms as a result of boron enrichment. Plutonium materials are normally handled in glove boxes with shielded glass windows and the work entails both extremity and whole-body exposures.
Lowering the source terms of the plutonium batches will make the handling

  3. Poisson cluster analysis of cardiac arrest incidence in Columbus, Ohio. (United States)

    Warden, Craig; Cudnik, Michael T; Sasson, Comilla; Schwartz, Greg; Semple, Hugh


    Scarce resources in disease prevention and emergency medical services (EMS) need to be focused on high-risk areas of out-of-hospital cardiac arrest (OHCA). Cluster analysis using geographic information systems (GISs) was used to find these high-risk areas and test potential predictive variables. This was a retrospective cohort analysis of EMS-treated adults with OHCAs occurring in Columbus, Ohio, from April 1, 2004, through March 31, 2009. The OHCAs were aggregated to census tracts and incidence rates were calculated based on their adult populations. Poisson cluster analysis determined significant clusters of high-risk census tracts. Both census tract-level and case-level characteristics were tested for association with high-risk areas by multivariate logistic regression. A total of 2,037 eligible OHCAs occurred within the city limits during the study period. The mean incidence rate was 0.85 OHCAs/1,000 population/year. There were five significant geographic clusters with 76 high-risk census tracts out of the total of 245 census tracts. In the case-level analysis, being in a high-risk cluster was associated with a slightly younger age (-3 years, adjusted odds ratio [OR] 0.99, 95% confidence interval [CI] 0.99-1.00), not being white, non-Hispanic (OR 0.54, 95% CI 0.45-0.64), cardiac arrest occurring at home (OR 1.53, 95% CI 1.23-1.71), and not receiving bystander cardiopulmonary resuscitation (CPR) (OR 0.77, 95% CI 0.62-0.96), but with higher survival to hospital discharge (OR 1.78, 95% CI 1.30-2.46). In the census tract-level analysis, high-risk census tracts were also associated with a slightly lower average age (-0.1 years, OR 1.14, 95% CI 1.06-1.22) and a lower proportion of white, non-Hispanic patients (-0.298, OR 0.04, 95% CI 0.01-0.19), but also a lower proportion of high-school graduates (-0.184, OR 0.00, 95% CI 0.00-0.00). This analysis identified high-risk census tracts and associated census tract-level and case-level characteristics that can be used to

  4. Problems of accounting, cost concerns and economic analysis in the mining enrichment industry

    Energy Technology Data Exchange (ETDEWEB)

    Slabinskiy, V T


    Mining enrichment enterprises of the ferrous and nonferrous metallurgy, coal and chemical industry have much in common in the area of technology of production, technical base, organization of labor and production. This in turn presupposes the possible development of a common procedure of accounting of expenditures for production, calculation of net cost of output and analysis of production-economic activity of enterprises. Based on scientific research and generalization of advanced experience of practical workers, means of improvement of economic operation in mining enrichment enterprises are outlined according to increasing demands of production control. An outline of analytic accounting of expenditures which provides for multitarget use of information has been developed: for organization of operational control of the formulation of net cost of output, determination of the results of self support activities of structural subdivisions of an enterprise, computation of the efficiency of scientific and technical progress. Experience of use of economic and mathematical methods in computers for this purpose is discussed.

  5. Performance Based Clustering for Benchmarking of Container Ports: an Application of Dea and Cluster Analysis Technique

    Directory of Open Access Journals (Sweden)

    Jie Wu


    Full Text Available The operational performance of container ports has received more and more attention in both academic and practitioner circles, and the performance evaluation and process improvement of container ports have also been the focus of several studies. In this paper, Data Envelopment Analysis (DEA), an effective tool for relative efficiency assessment, is utilized for measuring the performance and benchmarking of 77 world container ports in 2007. The approaches used in the current study consider four inputs (Capacity of Cargo Handling Machines, Number of Berths, Terminal Area and Storage Capacity) and a single output (Container Throughput). The results for the efficiency scores are analyzed, and a unique ordering of the ports based on average cross efficiency is provided. A cluster analysis technique is also used to select the more appropriate targets for poorly performing ports to use as benchmarks.

  6. Functional Principal Component Analysis and Randomized Sparse Clustering Algorithm for Medical Image Analysis (United States)

    Lin, Nan; Jiang, Junhai; Guo, Shicheng; Xiong, Momiao


    Due to the advancement in sensor technology, the growing large medical image data have the ability to visualize the anatomical changes in biological tissues. As a consequence, the medical images have the potential to enhance the diagnosis of disease, the prediction of clinical outcomes and the characterization of disease progression. But in the meantime, the growing data dimensions pose great methodological and computational challenges for the representation and selection of features in image cluster analysis. To address these challenges, we first extend the functional principal component analysis (FPCA) from one dimension to two dimensions to fully capture the space variation of the image signals. The image signals contain a large number of redundant features which provide no additional information for clustering analysis. The widely used methods for removing the irrelevant features are sparse clustering algorithms using a lasso-type penalty to select the features. However, the accuracy of clustering using a lasso-type penalty depends on the selection of the penalty parameters and the threshold value. In practice, they are difficult to determine. Recently, randomized algorithms have received a great deal of attention in big data analysis. This paper presents a randomized algorithm for accurate feature selection in image clustering analysis. The proposed method is applied to both the liver and kidney cancer histology image data from the TCGA database. The results demonstrate that the randomized feature selection method coupled with functional principal component analysis substantially outperforms the current sparse clustering algorithms in image cluster analysis. PMID:26196383

  7. Diagnostics of subtropical plants functional state by cluster analysis

    Directory of Open Access Journals (Sweden)

    Oksana Belous


    Full Text Available The article presents an example of applying statistical methods of data analysis to diagnose the adaptive capacity of subtropical plant varieties. We describe the selection indicators and basic physiological parameters that were defined as diagnostic. We used an evaluation based on a set of water regime parameters: determination of the water deficit of the leaves, determination of the fractional composition of water, and detection of the concentration of cell sap (CCS) (for tea culture flushes). These parameters are characterized by high lability and high responsiveness to the effects of many abiotic factors, which required particular care in the selection of plant material for analysis and consideration of the impact on sustainability. On the basis of the experimental data, the coefficients of pair correlation between climatic factors and the physiological indicators used were calculated. The result was a selection of physiological and biochemical indicators proposed to assess adaptability and included in the basis of methodical recommendations on diagnostics of the functional state of the studied cultures. Analysis of complex studies involving a large number of indicators is quite difficult; in particular, it does not allow quick identification of the similarity of new varieties in their adaptive responses to adverse factors and, therefore, the setting of general requirements for cultivation conditions. Use of cluster analysis requires analyzing only quantitative data, defining a set of variables used to assess varieties (the larger the sample, the more accurate the clustering), and ascertaining the measure of similarity (or difference) between objects. It is shown that the identification of diagnostic features, which are subjected to statistical processing, impacts the accuracy of the variety classification. Selection in result of the mono-clusters analysis (variety tea Kolhida; hazelnut Lombardsky red; variety kiwi Monty

  8. Cluster analysis for DNA methylation profiles having a detection threshold

    Directory of Open Access Journals (Sweden)

    Siegmund Kimberly D


    Full Text Available Background: DNA methylation, a molecular feature used to investigate tumor heterogeneity, can be measured on many genomic regions using the MethyLight technology. Due to the combination of the underlying biology of DNA methylation and the MethyLight technology, the measurements, while being generated on a continuous scale, have a large number of 0 values. This suggests that conventional clustering methodology may not perform well on this data. Results: We compare performance of existing methodology (such as k-means) with two novel methods that explicitly allow for the preponderance of values at 0. We also consider how the ability to successfully cluster such data depends upon the number of informative genes for which methylation is measured and the correlation structure of the methylation values for those genes. We show that when data is collected for a sufficient number of genes, our models do improve clustering performance compared to methods, such as k-means, that do not explicitly respect the supposed biological realities of the situation. Conclusion: The performance of analysis methods depends upon how well the assumptions of those methods reflect the properties of the data being analyzed. Differing technologies will lead to data with differing properties, and should therefore be analyzed differently. Consequently, it is prudent to give thought to what the properties of the data are likely to be, and which analysis method might therefore be likely to best capture those properties.

  9. Cluster Analysis of the International Stellarator Confinement Database

    International Nuclear Information System (INIS)

    Kus, A.; Dinklage, A.; Preuss, R.; Ascasibar, E.; Harris, J. H.; Okamura, S.; Yamada, H.; Sano, F.; Stroth, U.; Talmadge, J.


    The heterogeneous structure of collected data is one of the problems that occur during the derivation of scalings for energy confinement time, and its analysis turns out to be a wide and complicated matter. The International Stellarator Confinement Database [1], ISCDB for short, comprises in its latest version 21 a total of 3647 observations from 8 experimental devices, of which 2067 have so far been completed for upcoming analyses. For confinement scaling studies 1933 observations were chosen as the standard dataset. Here we describe a statistical method of cluster analysis for identification of possible cohesive substructures in ISCDB and present some preliminary results

  10. Accommodating error analysis in comparison and clustering of molecular fingerprints. (United States)

    Salamon, H; Segal, M R; Ponce de Leon, A; Small, P M


    Molecular epidemiologic studies of infectious diseases rely on pathogen genotype comparisons, which usually yield patterns comprising sets of DNA fragments (DNA fingerprints). We use a highly developed genotyping system, IS6110-based restriction fragment length polymorphism analysis of Mycobacterium tuberculosis, to develop a computational method that automates comparison of large numbers of fingerprints. Because error in fragment length measurements is proportional to fragment length and is positively correlated for fragments within a lane, an align-and-count method that compensates for relative scaling of lanes reliably counts matching fragments between lanes. Results of a two-step method we developed to cluster identical fingerprints agree closely with 5 years of computer-assisted visual matching among 1,335 M. tuberculosis fingerprints. Fully documented and validated methods of automated comparison and clustering will greatly expand the scope of molecular epidemiology.
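    The align-and-count idea in the abstract above can be sketched roughly as follows: because measurement error grows with fragment length, fragments are matched within a relative tolerance, and because errors are correlated within a lane, one lane is rescaled before counting. The tolerance `rel_tol`, the grid of trial scale factors and the greedy two-pointer matching are all assumptions made for this illustration, not the published procedure.

```python
def count_matches(lane_a, lane_b, rel_tol=0.02):
    """Greedy count of fragments agreeing within a length-proportional tolerance."""
    a, b = sorted(lane_a), sorted(lane_b)
    i = j = matches = 0
    while i < len(a) and j < len(b):
        if abs(a[i] - b[j]) <= rel_tol * max(a[i], b[j]):
            matches += 1
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return matches


def align_and_count(lane_a, lane_b, scales=None, rel_tol=0.02):
    """Try several relative lane scalings; return (best match count, scale used)."""
    if scales is None:
        # Hypothetical search grid: lane_b stretched or shrunk by up to about 6%.
        scales = [0.94 + 0.01 * k for k in range(13)]

    def score(s):
        return count_matches(lane_a, [s * x for x in lane_b], rel_tol)

    best = max(scales, key=score)
    return score(best), best
```

    For two lanes that differ only by a uniform 5% stretch, for example `[1000, 2000, 4000]` and `[1050, 2100, 4200]`, the unscaled comparison finds no matching fragments, while the scaled comparison matches all three.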

  11. Accident patterns for construction-related workers: a cluster analysis (United States)

    Liao, Chia-Wen; Tyan, Yaw-Yauan


    The construction industry has been identified as one of the most hazardous industries. The risk of construction-related workers is far greater than that in a manufacturing based industry. However, some steps can be taken to reduce worker risk through effective injury prevention strategies. In this article, k-means clustering methodology is employed in specifying the factors related to different worker types and in identifying the patterns of industrial occupational accidents. Accident reports during the period 1998 to 2008 are extracted from case reports of the Northern Region Inspection Office of the Council of Labor Affairs of Taiwan. The results show that the cluster analysis can indicate some patterns of occupational injuries in the construction industry. Inspection plans should be proposed according to the type of construction-related workers. The findings provide a direction for more effective inspection strategies and injury prevention programs.

  12. Cluster analysis in systems of magnetic spheres and cubes

    Energy Technology Data Exchange (ETDEWEB)

    Pyanzina, E.S. [Ural Federal University, Lenin Av. 51, Ekaterinburg (Russian Federation)]; Gudkova, A.V. [Ural Federal University, Lenin Av. 51, Ekaterinburg (Russian Federation)]; Donaldson, J.G. [University of Vienna, Sensengasse 8, Vienna (Austria)]; Kantorovich, S.S. [Ural Federal University, Lenin Av. 51, Ekaterinburg (Russian Federation); University of Vienna, Sensengasse 8, Vienna (Austria)]


    In the present work we use molecular dynamics simulations and graph-theory based cluster analysis to compare self-assembly in systems of magnetic spheres, and cubes where the dipole moment is oriented along the side of the cube in the [001] crystallographic direction. We show that under the same conditions cubes aggregate far less than their spherical counterparts. This difference can be explained in terms of the volume of phase space in which the formation of the bond is thermodynamically advantageous. It follows that this volume is much larger for a dipolar sphere than for a dipolar cube. - Highlights: • A comparison of the degree of self-assembly in systems of magnetic spheres and cubes. • Spheres are more likely to form larger clusters than cubes. • Differences in microstructure will manifest in the magnetic response of each system.

  13. Image Registration Algorithm Based on Parallax Constraint and Clustering Analysis (United States)

    Wang, Zhe; Dong, Min; Mu, Xiaomin; Wang, Song


    To resolve the problem of slow computation speed and low matching accuracy in image registration, a new image registration algorithm based on parallax constraint and clustering analysis is proposed. Firstly, the Harris corner detection algorithm is used to extract the feature points of two images. Secondly, the Normalized Cross Correlation (NCC) function is used to perform approximate matching of the feature points, yielding the initial feature pairs. Then, according to the parallax constraint condition, the initial feature pairs are preprocessed by the K-means clustering algorithm, which removes the feature point pairs with obvious errors from the approximate matching process. Finally, the Random Sample Consensus (RANSAC) algorithm is adopted to refine the feature points and obtain the final feature point matching result, realizing fast and accurate image registration. The experimental results show that the image registration algorithm proposed in this paper can improve the accuracy of the image matching while ensuring the real-time performance of the algorithm.
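    The approximate matching step above scores candidate feature pairs with NCC. A minimal sketch of that score, assuming the zero-mean variant computed over two equally sized grayscale patches given as nested lists (the function name `ncc` and the patch representation are illustrative, not taken from the paper):

```python
from math import sqrt


def ncc(patch_a, patch_b):
    """Zero-mean normalized cross-correlation of two equally sized patches.

    Patches are nested lists of pixel intensities (an assumed representation).
    Returns a value in [-1, 1]; 0.0 is returned for a constant patch.
    """
    a = [float(x) for row in patch_a for x in row]
    b = [float(x) for row in patch_b for x in row]
    if len(a) != len(b) or not a:
        raise ValueError("patches must be non-empty and the same size")

    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)

    # Covariance-like numerator and the product of the two spreads.
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = sqrt(sum((x - mean_a) ** 2 for x in a) *
               sum((y - mean_b) ** 2 for y in b))
    return num / den if den else 0.0
```

    A pair of corners would be kept as an initial match when the score of their surrounding patches exceeds some threshold; the threshold value is a tuning choice that the abstract does not state.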

  14. Network clustering coefficient approach to DNA sequence analysis

    Energy Technology Data Exchange (ETDEWEB)

    Gerhardt, Guenther J.L. [Universidade Federal do Rio Grande do Sul-Hospital de Clinicas de Porto Alegre, Rua Ramiro Barcelos 2350/sala 2040/90035-003 Porto Alegre (Brazil); Departamento de Fisica e Quimica da Universidade de Caxias do Sul, Rua Francisco Getulio Vargas 1130, 95001-970 Caxias do Sul (Brazil); Lemke, Ney [Programa Interdisciplinar em Computacao Aplicada, Unisinos, Av. Unisinos, 950, 93022-000 Sao Leopoldo, RS (Brazil); Corso, Gilberto [Departamento de Biofisica e Farmacologia, Centro de Biociencias, Universidade Federal do Rio Grande do Norte, Campus Universitario, 59072 970 Natal, RN (Brazil)]


    In this work we propose an alternative DNA sequence analysis tool based on graph theoretical concepts. The methodology investigates the path topology of an organism's genome through a triplet network. In this network, triplets in the DNA sequence are vertices and two vertices are connected if they occur juxtaposed on the genome. We characterize this network topology by measuring the clustering coefficient. We test our methodology against two main biases: the guanine-cytosine (GC) content and the 3-bp (base pairs) periodicity of the DNA sequence. We perform the test by constructing random networks with variable GC content and imposed 3-bp periodicity. A test group of some organisms is constructed and we investigate the methodology in the light of the constructed random networks. We conclude that the clustering coefficient is a valuable tool since it gives information that is not trivially contained either in the 3-bp periodicity or in the variable GC content.
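    The triplet network described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the sequence is read in a single frame as non-overlapping triplets, edges are undirected, self-loops are ignored, and the reported quantity is the average local clustering coefficient. The function names are hypothetical.

```python
from itertools import combinations


def triplet_network(seq):
    """Build an undirected graph whose vertices are triplets of `seq`.

    Two triplets are joined when they are adjacent in the sequence
    (assumption: one reading frame, non-overlapping triplets).
    """
    usable = len(seq) - len(seq) % 3
    triplets = [seq[i:i + 3] for i in range(0, usable, 3)]
    graph = {t: set() for t in triplets}
    for a, b in zip(triplets, triplets[1:]):
        if a != b:  # ignore self-loops
            graph[a].add(b)
            graph[b].add(a)
    return graph


def average_clustering(graph):
    """Mean over vertices of (links among neighbours) / (possible links)."""
    total = 0.0
    for neighbours in graph.values():
        k = len(neighbours)
        if k < 2:
            continue  # vertices with fewer than two neighbours contribute 0
        links = sum(1 for u, v in combinations(neighbours, 2) if v in graph[u])
        total += 2.0 * links / (k * (k - 1))
    return total / len(graph) if graph else 0.0
```

    For the toy sequence "AAACCCGGGAAAGGG" the three triplets form a triangle, so every vertex has clustering coefficient 1; for "AAACCCGGG" they form a simple path and the average is 0.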

  15. Multiscale visual quality assessment for cluster analysis with self-organizing maps (United States)

    Bernard, Jürgen; von Landesberger, Tatiana; Bremm, Sebastian; Schreck, Tobias


    Cluster analysis is an important data mining technique for analyzing large amounts of data, reducing many objects to a limited number of clusters. Cluster visualization techniques aim at supporting the user in better understanding the characteristics and relationships among the found clusters. While promising approaches to visual cluster analysis already exist, these usually fall short of incorporating the quality of the obtained clustering results. However, due to the nature of the clustering process, quality plays an important role, since for most practical data sets many different clusterings are typically possible. Being aware of clustering quality is important to judge the expressiveness of a given cluster visualization, or to adjust the clustering process with refined parameters, among others. In this work, we present an encompassing suite of visual tools for quality assessment of an important visual cluster algorithm, namely, the Self-Organizing Map (SOM) technique. We define, measure, and visualize the notion of SOM cluster quality along a hierarchy of cluster abstractions. The quality abstractions range from simple scalar-valued quality scores up to the structural comparison of a given SOM clustering with output of additional supportive clustering methods. The suite of methods allows the user to assess the SOM quality on the appropriate abstraction level, and arrive at improved clustering results. We implement our tools in an integrated system, apply it to experimental data sets, and show its applicability.

  16. Analysis Of The Effect Of Fuel Enrichment Error On Neutronic Properties Of The RSG-GAS Core

    International Nuclear Information System (INIS)

    Saragih, Tukiran; Pinem, Surian


    The effect of fuel enrichment error on neutronic properties has been analyzed. Fuel enrichment can deviate from specification because of fabrication errors. It is therefore necessary to analyze the effect of such errors to determine the maximum fuel enrichment that can be accepted in the core. The analysis was done by simulation. The RSG-GAS core was simulated with 5 standard fuel elements and 1 control element having wrong enrichment when inserted into the core. Fuel enrichment errors of 20%, 25% and 30% were then simulated using the WIMSD/4 and Batan-2DIFF codes. The cross sections of the RSG-GAS core materials were generated by the WIMSD/4 code in 1-D, X-Y geometry with 10 neutron energy groups. The two-dimensional diffusion calculation, based on the finite element method, was done using the Batan-2DIFF code. The five fuel elements and one control element with changed enrichment were finally arranged as a new core of the RSG-GAS reactor. The neutronic properties can be seen from the eigenvalue (k_eff) as well as from the kinetic properties based on the moderator void reactivity coefficient. The calculated results showed that the error is still acceptable, with k_eff of 1.097, up to 25% fuel enrichment but not above 25.5%.

  17. Steady state subchannel analysis of AHWR fuel cluster

    International Nuclear Information System (INIS)

    Dasgupta, A.; Chandraker, D.K.; Vijayan, P.K.; Saha, D.


    Subchannel analysis is a technique used to predict the thermal hydraulic behavior of reactor fuel assemblies. The rod cluster is subdivided into a number of parallel interacting flow subchannels. The conservation equations are solved for each of these subchannels, taking into account subchannel interactions. Subchannel analysis of AHWR D-5 fuel cluster has been carried out to determine the variations in thermal hydraulic conditions of coolant and fuel temperatures along the length of the fuel bundle. The hottest regions within the AHWR fuel bundle have been identified. The effect of creep on the fuel performance has also been studied. MCHFR has been calculated using Jansen-Levy correlation. The calculations have been backed by sensitivity analysis for parameters whose values are not known accurately. The sensitivity analysis showed the calculations to have a very low sensitivity to these parameters. Apart from the analysis, the report also includes a brief introduction of a few subchannel codes. A brief description of the equations and solution methodology used in COBRA-IIIC and COBRA-IV-I is also given. (author)

  18. Analysis of microbial community and nitrogen transition with enriched nitrifying soil microbes for organic hydroponics. (United States)

    Saijai, Sakuntala; Ando, Akinori; Inukai, Ryuya; Shinohara, Makoto; Ogawa, Jun


    Nitrifying microbial consortia were enriched from bark compost in a water system by regulating the amounts of organic nitrogen compounds and by controlling the aeration conditions with addition of CaCO₃ for maintaining suitable pH. Repeated enrichment showed reproducible mineralization of organic nitrogen via the conversion of ammonium ions (NH₄⁺) and nitrite ions (NO₂⁻) into nitrate ions (NO₃⁻). The change in microbial composition during the enrichment was investigated by PCR-DGGE analysis with a focus on prokaryote, ammonia-oxidizing bacteria, nitrite-oxidizing bacteria, and eukaryote cell types. The microbial transition had a simple profile and showed a clear relation to the nitrogen ion transition. Nitrosomonas and Nitrobacter were mainly detected during NH₄⁺ and NO₂⁻ oxidation, respectively. These results, revealing representative microorganisms acting in each of the ammonification and nitrification stages, will be valuable for the development of artificial simple microbial consortia for organic hydroponics consisting of identified heterotrophs and autotrophic nitrifying bacteria.

  19. Thermogravimetric analysis of rice and wheat straw catalytic combustion in air- and oxygen-enriched atmospheres

    International Nuclear Information System (INIS)

    Yu Zhaosheng; Ma Xiaoqian; Liu Ao


    Using thermogravimetric analysis (TGA), this paper investigates the influences of different catalysts on the ignition and combustion of rice and wheat straw in air- and oxygen-enriched atmospheres. Straw combustion is divided into two stages. The first is the emission and combustion of volatiles and the second is the combustion of fixed carbon. The presence of catalysts in the first stage enhances the emission of volatiles from the straw. In the second stage, the catalysts may act as a carrier of oxygen to the fixed carbon. Two parameters have been used to compare the characteristics of ignition and combustion of straw under different catalysts and in various oxygen concentrations. One is the temperature at which the conversion degree combustible (CDC) of straw is 5%; the other is the CDC when the temperature is 900 °C. By comparing the values of the two parameters, the influences of the catalysts and oxygen concentration on the ignition and combustion of straw have been studied. The catalysts are effective for straw ignition and combustion in air and oxygen-enriched atmospheres, except for the oxygen-enriched catalytic combustion of wheat straw fixed carbon.

  20. BiNChE: a web tool and library for chemical enrichment analysis based on the ChEBI ontology. (United States)

    Moreno, Pablo; Beisken, Stephan; Harsha, Bhavana; Muthukrishnan, Venkatesh; Tudose, Ilinca; Dekker, Adriano; Dornfeldt, Stefanie; Taruttis, Franziska; Grosse, Ivo; Hastings, Janna; Neumann, Steffen; Steinbeck, Christoph


    Ontology-based enrichment analysis aids in the interpretation and understanding of large-scale biological data. Ontologies are hierarchies of biologically relevant groupings. Using ontology annotations, which link ontology classes to biological entities, enrichment analysis methods assess whether there is a significant over- or under-representation of entities for ontology classes. While many tools exist that run enrichment analysis for protein sets annotated with the Gene Ontology, there are only a few that can be used for small-molecule enrichment analysis. We describe BiNChE, an enrichment analysis tool for small molecules based on the ChEBI Ontology. BiNChE displays an interactive graph that can be exported as a high-resolution image or in network formats. The tool provides plain, weighted and fragment analysis based on either the ChEBI Role Ontology or the ChEBI Structural Ontology. BiNChE aids in the exploration of large sets of small molecules produced within Metabolomics or other Systems Biology research contexts. The open-source tool provides easy and highly interactive web access to enrichment analysis with the ChEBI ontology tool and is additionally available as a standalone library.
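    Testing for over-representation of an ontology class, as described above, is often done with a one-sided hypergeometric test. The sketch below shows that common choice; it is an assumption for illustration and is not claimed to be the statistic BiNChE itself uses. The function name and argument names are hypothetical.

```python
from math import comb


def enrichment_p_value(population, annotated, sample, hits):
    """One-sided hypergeometric p-value, P(X >= hits).

    population: number of entities in the background set
    annotated:  background entities annotated with the ontology class
    sample:     size of the submitted set
    hits:       submitted entities annotated with the class
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the probabilities of seeing `hits` or more annotated entities.
    tail = sum(comb(annotated, k) * comb(population - annotated, sample - k)
               for k in range(hits, upper + 1))
    return tail / total
```

    With a background of 10 molecules of which 3 carry a given class, drawing exactly those 3 in a sample of 3 has probability 1/120, which is what `enrichment_p_value(10, 3, 3, 3)` returns. In practice the p-values for all tested classes would also be corrected for multiple testing.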

  1. Galactic Pal-eontology: abundance analysis of the disrupting globular cluster Palomar 5 (United States)

    Koch, Andreas; Côté, Patrick


    We present a chemical abundance analysis of the tidally disrupted globular cluster (GC) Palomar 5. By co-adding high-resolution spectra of 15 member stars from the cluster's main body, taken at low signal-to-noise with the Keck/HIRES spectrograph, we were able to measure integrated abundance ratios of 24 species of 20 elements including all major nucleosynthetic channels (namely the light element Na; α-elements Mg, Si, Ca, Ti; Fe-peak and heavy elements Sc, V, Cr, Mn, Co, Ni, Cu, Zn; and the neutron-capture elements Y, Zr, Ba, La, Nd, Sm, Eu). The mean metallicity of -1.56 ± 0.02 ± 0.06 dex (statistical and systematic errors) agrees well with the values from low-resolution measurements of individual stars, but it is lower than previous high-resolution results of a small number of stars in the literature. Comparison with Galactic halo stars and other disrupted and unperturbed GCs renders Pal 5 a typical representative of the Milky Way halo population, as has been noted before, emphasizing that the early chemical evolution of such clusters is decoupled from their later dynamical history. We also performed a test as to the detectability of light element variations in this co-added abundance analysis technique and found that this approach is not sensitive even in the presence of a broad range in sodium of 0.6 dex, a value typically found in the old halo GCs. Thus, while methods of determining the global abundance patterns of such objects are well suited to study their overall enrichment histories, chemical distinctions of their multiple stellar populations are still best obtained from measurements of individual stars. Full Table 3 is only available at the CDS.

  2. Analysis of Learning Development With Sugeno Fuzzy Logic And Clustering

    Directory of Open Access Journals (Sweden)

    Maulana Erwin Saputra


    This study analyzes the factors that affect student achievement, which vary from school to school, since student achievement is one of the goals of a successful educational organization. Students' mental state, emotions and behavior influence their learning performance. Fuzzy logic can be used in various fields, as can clustering for grouping, including in the analysis of learning development. Students are processed according to the symptoms they exhibit. This research uses fuzzy logic and clustering. Fuzzy logic is a logic of uncertainty whose advantage is its capacity for linguistic reasoning, so that its design does not require complicated mathematical equations. The clustering method is K-means, in which the data are partitioned into k groups (k = 1, 2, 3, ..., k) to find the optimal number of performance groups. In this research, questionnaire results are entered into MATLAB to produce values used in generating graphs, which makes it easier for the school to assess student performance in the learning process against certain criteria. The system thus provides results for the decision-making required by the school.
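    The K-means partitioning mentioned above can be sketched with Lloyd's algorithm. This is a minimal illustration, assuming numeric feature tuples and caller-supplied starting centroids; the function name `kmeans` and its parameters are hypothetical and do not come from the study, which reports using MATLAB.

```python
def kmeans(points, centroids, iterations=100):
    """Lloyd's algorithm: alternate assignment and centroid update.

    points:    list of equal-length numeric tuples
    centroids: initial centroids, one per desired group
    Returns (final centroids, list of groups of points).
    """
    centroids = [tuple(c) for c in centroids]
    groups = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        groups = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])),
            )
            groups[nearest].append(p)
        # Update step: move each centroid to the mean of its group.
        updated = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
        if updated == centroids:
            break  # converged
        centroids = updated
    return centroids, groups
```

    Running the routine for several values of k and comparing a quality measure such as the within-group sum of squares is one common way to look for a suitable number of groups.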

  3. Segmentation of Residential Gas Consumers Using Clustering Analysis

    Directory of Open Access Journals (Sweden)

    Marta P. Fernandes


    The growing environmental concerns and liberalization of energy markets have resulted in an increased competition between utilities and a strong focus on efficiency. To develop new energy efficiency measures and optimize operations, utilities seek new market-related insights and customer engagement strategies. This paper proposes a clustering-based methodology to define the segmentation of residential gas consumers. The segments of gas consumers are obtained through a detailed clustering analysis using smart metering data. Insights are derived from the segmentation, where the segments result from the clustering process and are characterized based on the consumption profiles, as well as according to information regarding consumers’ socio-economic and household key features. The study is based on a sample of approximately one thousand households over one year. The representative load profiles of consumers are essentially characterized by two evident consumption peaks, one in the morning and the other in the evening, and an off-peak consumption. Significant insights can be derived from this methodology regarding typical consumption curves of the different segments of consumers in the population. This knowledge can assist energy utilities and policy makers in the development of consumer engagement strategies, demand forecasting tools and in the design of more sophisticated tariff systems.

  4. Enrichment and proteomic analysis of plasma membrane from rat dorsal root ganglions

    Directory of Open Access Journals (Sweden)

    Lin Yong


    Background: Dorsal root ganglion (DRG) neurons are primary sensory neurons that conduct neuronal impulses related to pain, touch and temperature senses. The plasma membrane (PM) of DRG cells plays important roles in their functions, and PM proteins are the main performers of these functions. However, mainly because the very low amount of DRG makes PM sample collection difficult, few proteomic analyses of the PM have been reported, and the subject demands further investigation. Results: By using aqueous polymer two-phase partition in combination with high salt and high pH washing, PMs were efficiently enriched, as demonstrated by western blot analysis. A total of 954 non-redundant proteins were identified from the plasma membrane-enriched preparation with CapLC-MS/MS analysis subsequent to protein separation by sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) or shotgun digestion. 205 (21.5%) of the identified proteins were unambiguously assigned as PM proteins, including a large number of signal proteins, receptors, ion channels and transporters. Conclusion: The aqueous polymer two-phase partition is a simple, rapid and relatively inexpensive method. It is well suited for the purification of PMs from small amounts of tissue. Therefore, it is reasonable to enrich the DRG PM using aqueous two-phase partition as a preferred method. Proteomic analysis showed that the DRG PM was rich in proteins involved in fundamental biological processes including material exchange, energy transformation and information transmission. These data will help further our understanding of fundamental DRG functions.

  5. A view of the H-band light-element chemical patterns in globular clusters under the AGB self-enrichment scenario (United States)

    Dell'Agli, F.; García-Hernández, D. A.; Ventura, P.; Mészáros, Sz; Masseron, T.; Fernández-Trincado, J. G.; Tang, B.; Shetrone, M.; Zamora, O.; Lucatello, S.


    We discuss the self-enrichment scenario by asymptotic giant branch (AGB) stars for the formation of multiple populations in globular clusters (GCs) by analysing a data set of giant stars observed in nine Galactic GCs, covering a wide range of metallicities and for which simultaneous measurements of C, N, O, Mg, Al, and Si are available. To this aim, we calculated six sets of AGB models, with the same chemical composition as the stars belonging to the first generation of each GC. We find that the AGB yields can reproduce the set of observations available, not only in terms of the degree of contamination shown by stars in each GC but, more importantly, also the observed trend with metallicity, which agrees well with the predictions from AGB evolution modelling. While further observational evidence is required to definitively fix the main actors in the pollution of the interstellar medium from which new generations of stars formed in GCs, the present results confirm that the gas ejected by stars of mass in the range 4 M_{⊙} ≤ M ≤ 8 M_{⊙} during the AGB phase shares the same chemical patterns traced by stars in GCs.

  6. Feasibility Study of Parallel Finite Element Analysis on Cluster-of-Clusters (United States)

    Muraoka, Masae; Okuda, Hiroshi

    With the rapid growth of WAN infrastructure and development of Grid middleware, it has become a realistic and attractive methodology to connect cluster machines on a wide-area network for the execution of computation-demanding applications. Many existing parallel finite element (FE) applications have, however, been designed and developed with a single computing resource in mind, since such applications require frequent synchronization and communication among processes. There have been few FE applications that can exploit the distributed environment so far. In this study, we explore the feasibility of FE applications on the cluster-of-clusters. First, we classify FE applications into two types, tightly coupled applications (TCA) and loosely coupled applications (LCA), based on their communication pattern. A prototype of each application is implemented on the cluster-of-clusters. We perform numerical experiments executing TCA and LCA on both the cluster-of-clusters and a single cluster. Through these experiments, by comparing the performance and communication cost in each case, we evaluate the feasibility of FEA on the cluster-of-clusters.

  7. Enrichment Strategies in Pediatric Drug Development: An Analysis of Trials Submitted to the US Food and Drug Administration. (United States)

    Green, Dionna J; Liu, Xiaomei I; Hua, Tianyi; Burnham, Janelle M; Schuck, Robert; Pacanowski, Michael; Yao, Lynne; McCune, Susan K; Burckart, Gilbert J; Zineh, Issam


    Clinical trial enrichment involves prospectively incorporating trial design elements that increase the probability of detecting a treatment effect. The use of enrichment strategies in pediatric drug development has not been systematically assessed. We analyzed the use of enrichment strategies in pediatric trials submitted to the US Food and Drug Administration from 2012-2016. In all, 112 efficacy studies associated with 76 drug development programs were assessed and their overall success rates were 78% and 75%, respectively. Eighty-eight trials (76.8%) employed at least one enrichment strategy; of these, 66.3% employed multiple enrichment strategies. The highest trial success rates were achieved when all three enrichment strategies (practical, predictive, and prognostic) were used together within a single trial (87.5%), while the lowest success rate was observed when no enrichment strategy was used (65.4%). The use of enrichment strategies in pediatric trials was found to be associated with trial and program success in our analysis. © 2017 American Society for Clinical Pharmacology and Therapeutics.

  8. Nonproliferation analysis of the reduction of excess separated plutonium and high-enriched uranium

    International Nuclear Information System (INIS)

    Persiani, P.J.


    The purpose of this preliminary investigation is to explore alternatives and strategies aimed at the gradual reduction of the excess inventories of separated plutonium and high-enriched uranium (HEU) in the civilian nuclear power industry. The study attempts to establish a technical and economic basis to assist in the formation of alternative approaches consistent with nonproliferation and safeguards concerns. The analysis addresses several options in reducing the excess separated plutonium and HEU, and the consequences on nonproliferation and safeguards policy assessments resulting from the interacting synergistic effects between fuel cycle processes and isotopic signatures of nuclear materials.

  9. Analysis of civilian processing programs in reduction of excess separated plutonium and high-enriched uranium

    International Nuclear Information System (INIS)

    Persiani, P.J.


    The purpose of this preliminary investigation is to explore alternatives and strategies aimed at the gradual reduction of the excess inventories of separated plutonium and high-enriched uranium (HEU) in the civilian nuclear power industry. The study attempts to establish a technical and economic basis to assist in the formation of alternative approaches consistent with nonproliferation and safeguards concerns. The analysis addresses several options in reducing the excess separated plutonium and HEU, and the consequences on nonproliferation and safeguards policy assessments resulting from the interacting synergistic effects between fuel cycle processes and isotopic signatures of nuclear materials.

  10. Cluster analysis in systems of magnetic spheres and cubes (United States)

    Pyanzina, E. S.; Gudkova, A. V.; Donaldson, J. G.; Kantorovich, S. S.


    In the present work we use molecular dynamics simulations and graph-theory based cluster analysis to compare self-assembly in systems of magnetic spheres, and cubes where the dipole moment is oriented along the side of the cube in the [001] crystallographic direction. We show that under the same conditions cubes aggregate far less than their spherical counterparts. This difference can be explained in terms of the volume of phase space in which the formation of the bond is thermodynamically advantageous. It follows that this volume is much larger for a dipolar sphere than for a dipolar cube.
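    Graph-theory based cluster analysis of a simulation snapshot can be sketched as finding connected components of a bond graph. The sketch below assumes, purely for illustration, that two particles count as bonded when their centres lie within a distance cutoff; the bonding criterion actually used in the paper is not given in the abstract, and the function name is hypothetical.

```python
def find_clusters(positions, cutoff):
    """Group particle indices into clusters of mutually connected particles.

    positions: list of coordinate tuples, one per particle
    cutoff:    assumed bonding distance (illustrative criterion)
    Returns clusters as lists of indices, largest first.
    """
    n = len(positions)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            d2 = sum((a - b) ** 2 for a, b in zip(positions[i], positions[j]))
            if d2 <= cutoff ** 2:
                parent[find(i)] = find(j)  # merge the two components

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values(), key=len, reverse=True)
```

    Comparing the resulting cluster-size distributions for sphere and cube systems would be one way to quantify the degree of aggregation; the pairwise loop is quadratic in the number of particles, which is acceptable only for small snapshots.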

  11. Cluster analysis of activity-time series in motor learning

    DEFF Research Database (Denmark)

    Balslev, Daniela; Nielsen, Finn Årup; Frutiger, Sally A.


    Neuroimaging studies of learning focus on brain areas where the activity changes as a function of time. To circumvent the difficult problem of model selection, we used a data-driven analytic tool, cluster analysis, which extracts representative temporal and spatial patterns from the voxel...... practice-related activity in a fronto-parieto-cerebellar network, in agreement with previous studies of motor learning. These voxels were separated from a group of voxels showing an unspecific time-effect and another group of voxels, whose activation was an artifact from smoothing. Hum. Brain Mapping 15...

  12. Enrichment and Molecular Analysis of Breast Cancer Disseminated Tumor Cells from Bone Marrow Using Microfiltration.

    Directory of Open Access Journals (Sweden)

    Sreeraj G Pillai

    Molecular characterization of disseminated tumor cells (DTCs) in the bone marrow (BM) of breast cancer (BC) patients has been hindered by their rarity. To enrich for these cells using an antigen-independent methodology, we have evaluated a size-based microfiltration device in combination with several downstream biomarker assays. BM aspirates were collected from healthy volunteers or BC patients. Healthy BM was mixed with a specified number of BC cells to calculate recovery and fold enrichment by microfiltration. Specimens were pre-filtered using a 70 μm mesh sieve and the effluent filtered through CellSieve microfilters. Captured cells were analyzed by immunocytochemistry (ICC), FISH for HER-2/neu gene amplification status, and RNA in situ hybridization (RISH). Cells eluted from the filter were used for RNA isolation and subsequent qRT-PCR analysis for DTC biomarker gene expression. Filtering an average of 14×10^6 nucleated BM cells yielded approximately 17-21×10^3 residual BM cells. In the BC cell spiking experiments, an average of 87% (range 84-92%) of tumor cells were recovered with approximately 170- to 400-fold enrichment. Captured BC cells from patients co-stained for cytokeratin and EpCAM, but not CD45 by ICC. RNA yields from 4 ml of patient BM after filtration averaged 135 ng per 10 million BM cells filtered with an average RNA Integrity Number (RIN) of 5.3. DTC-associated gene expression was detected by both qRT-PCR and RISH in filtered spiked or BC patient specimens, but not in control filtered normal BM. We have tested a microfiltration technique for enrichment of BM DTCs. DTC capture efficiency was shown to range from 84.3% to 92.1% with up to 400-fold enrichment using model BC cell lines. In patients, recovered DTCs can be identified and distinguished from normal BM cells using multiple antibody-, DNA-, and RNA-based biomarker assays.

  13. A cluster analysis on road traffic accidents using genetic algorithms (United States)

    Saharan, Sabariah; Baragona, Roberto


    The analysis of road traffic accidents is increasingly important because of the cost of accidents and concerns for public road safety. The availability of large data sets makes the study of factors that affect the frequency and severity of accidents viable. However, the data are often highly unbalanced and overlapped. We deal with the data set of road traffic accidents recorded in Christchurch, New Zealand, from 2000-2009, with a total of 26440 accidents. The data are binary, and there are 50 road traffic accident factors with four levels of severity. We used a genetic algorithm for the analysis because the data set is large and unbalanced, and standard clustering such as the k-means algorithm may not be suitable for the task. The genetic algorithm based on clustering for unknown K (GCUK) has been used to identify the factors associated with accidents of different levels of severity. The results provided us with an interesting insight into the relationship between factors and accident severity level and suggest that the two main factors that contribute to fatal accidents are "Speed greater than 60 km/h" and "Did not see other people until it was too late". A comparison with the k-means algorithm and independent component analysis is performed to validate the results.

  14. Expressed sequence enrichment for candidate gene analysis of citrus tristeza virus resistance. (United States)

    Bernet, G P; Bretó, M P; Asins, M J


    Several studies have reported markers linked to a putative resistance gene from Poncirus trifoliata (Ctv-R) located at linkage group 4 that confers resistance against one of the most important citrus pathogens, citrus tristeza virus (CTV). To be successful in both marker-assisted selection and transformation experiments, its accurate mapping is needed. Several factors may affect its localization, among them two are considered here: the definition of resistance and the genetic background of progeny. Two progenies derived from P. trifoliata, by self-pollination and by crossing with sour orange (Citrus aurantium), a citrus rootstock well-adapted to arid and semi-arid areas, were used for linkage group-4 marker enrichment. Two new methodologies were used to enrich this region with expressed sequences. The enrichment of group 4 resulted in the fusion of several C. aurantium linkage groups. The new one A(7+3+4) is now saturated with 48 markers including expressed sequences. Surprisingly, sour orange was as resistant to the CTV isolate tested as was P. trifoliata, and three hybrids that carry Ctv-R, as deduced from its flanking markers, are susceptible to CTV. The new linkage maps were used to map Ctv-R under the hypothesis of monogenic inheritance. Its position on linkage group 4 of P. trifoliata differs from the location previously reported in other progenies. The genetic analysis of virus-plant interaction in the family derived from C. aurantium after a CTV chronic infection showed the segregation of five types of interaction, which is not compatible with the hypothesis of a single gene controlling resistance. Two major issues are discussed: another type of genetic analysis of CTV resistance is needed to avoid the assumption of monogenic inheritance, and transferring Ctv-R from P. trifoliata to sour orange might not avoid the CTV decline of sweet orange trees.

  15. The integration of expert-defined importance factors to enrich Bayesian Fault Tree Analysis

    International Nuclear Information System (INIS)

    Darwish, Molham; Almouahed, Shaban; Lamotte, Florent de


    This paper proposes an analysis of a hybrid Bayesian-Importance model for system designers to improve the quality of services related to Active Assisted Living Systems. The proposed model is based on two factors: a failure probability measure for the different service components, and an expert-defined degree of importance that each component holds for the success of the corresponding service. The proposed approach advocates the integration of expert-defined importance factors to enrich the Bayesian Fault Tree Analysis (FTA) approach. The proposed approach is evaluated using the Fault Tree Analysis formalism, in which the undesired state of a system is analyzed using Boolean logic to combine a series of lower-level events.

  16. Development of three-dimensional ENRICHED FREE MESH METHOD and its application to crack analysis

    International Nuclear Information System (INIS)

    Suzuki, Hayato; Matsubara, Hitoshi; Ezawa, Yoshitaka; Yagawa, Genki


    In this paper, we describe a method for highly accurate three-dimensional analysis of a crack in a large-scale structure. The Enriched Free Mesh Method (EFMM) improves the accuracy of the Free Mesh Method (FMM), which is a kind of meshless method. First, we developed an algorithm for the three-dimensional EFMM. An elastic problem was analyzed using the EFMM, and we found that its accuracy compares favorably with that of the FMM and that the number of CG iterations is smaller. Next, we developed a method for calculating the stress intensity factor with the EFMM. A structure with a crack was analyzed using the EFMM, and the stress intensity factor was calculated by the developed method. The results agreed very well with the reference solution. The proposed method was shown to be very effective for analyzing a crack in a large-scale structure. (author)

  17. Physicochemical properties of different corn varieties by principal components analysis and cluster analysis

    International Nuclear Information System (INIS)

    Zeng, J.; Li, G.; Sun, J.


    Principal components analysis and cluster analysis were used to investigate the properties of different corn varieties. The chemical composition and some properties of corn flour processed by dry milling were determined. The results showed that the chemical compositions and physicochemical properties differed significantly among the twenty-six corn varieties. Principal component analysis related the quality of corn flour to five principal components; starch pasting properties made an important contribution, accounting for 48.90%. The twenty-six corn varieties could be classified into four groups by cluster analysis. The consistency between principal components analysis and cluster analysis indicated that multivariate analyses are feasible for studying corn variety properties. (author)

  18. Cluster analysis of autoantibodies in 852 patients with systemic lupus erythematosus from a single center. (United States)

    Artim-Esen, Bahar; Çene, Erhan; Şahinkaya, Yasemin; Ertan, Semra; Pehlivan, Özlem; Kamali, Sevil; Gül, Ahmet; Öcal, Lale; Aral, Orhan; Inanç, Murat


    Associations between autoantibodies and clinical features have been described in systemic lupus erythematosus (SLE). Herein, we aimed to define autoantibody clusters and their clinical correlations in a large cohort of patients with SLE. We analyzed 852 patients with SLE who attended our clinic. Seven autoantibodies were selected for cluster analysis: anti-DNA, anti-Sm, anti-RNP, anticardiolipin (aCL) immunoglobulin (Ig)G or IgM, lupus anticoagulant (LAC), anti-Ro, and anti-La. Two-step clustering and Kaplan-Meier survival analyses were used. Five clusters were identified. A cluster consisted of patients with only anti-dsDNA antibodies, a cluster of anti-Sm and anti-RNP, a cluster of aCL IgG/M and LAC, and a cluster of anti-Ro and anti-La antibodies. Analysis revealed 1 more cluster that consisted of patients who did not belong to any of the clusters formed by antibodies chosen for cluster analysis. Sm/RNP cluster had significantly higher incidence of pulmonary hypertension and Raynaud phenomenon. DsDNA cluster had the highest incidence of renal involvement. In the aCL/LAC cluster, there were significantly more patients with neuropsychiatric involvement, antiphospholipid syndrome, autoimmune hemolytic anemia, and thrombocytopenia. According to the Systemic Lupus International Collaborating Clinics damage index, the highest frequency of damage was in the aCL/LAC cluster. Comparison of 10 and 20 years survival showed reduced survival in the aCL/LAC cluster. This study supports the existence of autoantibody clusters with distinct clinical features in SLE and shows that forming clinical subsets according to autoantibody clusters may be useful in predicting the outcome of the disease. Autoantibody clusters in SLE may exhibit differences according to the clinical setting or population.

  19. [Typologies of Madrid's citizens (Spain) at the end-of-life: cluster analysis]. (United States)

    Ortiz-Gonçalves, Belén; Perea-Pérez, Bernardo; Labajo González, Elena; Albarrán Juan, Elena; Santiago-Sáez, Andrés


    To establish typologies within Madrid's citizens (Spain) with regard to end-of-life by cluster analysis. The SPAD 8 programme was implemented in a sample from a health care centre in the autonomous region of Madrid (Spain). A multiple correspondence analysis technique was used, followed by a cluster analysis to create a dendrogram. A cross-sectional study was made beforehand with the results of the questionnaire. Five clusters stand out. Cluster 1: a group who preferred not to answer numerous questions (5%). Cluster 2: in favour of receiving palliative care and euthanasia (40%). Cluster 3: would oppose assisted suicide and would not ask for spiritual assistance (15%). Cluster 4: would like to receive palliative care and assisted suicide (16%). Cluster 5: would oppose assisted suicide and would ask for spiritual assistance (24%). The following groupings stood out. Clusters 2 and 4 would like to receive palliative care, euthanasia (2) and assisted suicide (4). Clusters 4 and 5 regularly practiced their faith and their family members did not receive palliative care. Clusters 3 and 5 would be opposed to euthanasia and assisted suicide in particular. Clusters 2, 4 and 5 had not completed an advance directive document. Clusters 2 and 3 seldom practiced their faith. This study could be taken into consideration to improve the quality of end-of-life care choices. Copyright © 2017 SESPAS. Publicado por Elsevier España, S.L.U. All rights reserved.

  20. Length bias correction in gene ontology enrichment analysis using logistic regression. (United States)

    Mi, Gu; Di, Yanming; Emerson, Sarah; Cumbie, Jason S; Chang, Jeff H


    When assessing differential gene expression from RNA sequencing data, commonly used statistical tests tend to have greater power to detect differential expression of genes encoding longer transcripts. This phenomenon, called "length bias", will influence subsequent analyses such as Gene Ontology enrichment analysis. In the presence of length bias, Gene Ontology categories that include longer genes are more likely to be identified as enriched. These categories, however, are not necessarily biologically more relevant. We show that one can effectively adjust for length bias in Gene Ontology analysis by including transcript length as a covariate in a logistic regression model. The logistic regression model makes the statistical issue underlying length bias more transparent: transcript length becomes a confounding factor when it correlates with both the Gene Ontology membership and the significance of the differential expression test. The inclusion of the transcript length as a covariate allows one to investigate the direct correlation between the Gene Ontology membership and the significance of testing differential expression, conditional on the transcript length. We present both real and simulated data examples to show that the logistic regression approach is simple, effective, and flexible.
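
    The confounding argument above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the authors' implementation: the synthetic data, the choice of the differential-expression call as the response, and the helper names (`fit_logistic`, `solve`) are all hypothetical. Transcript length drives both category membership and the chance of being called significant, so a model without length shows a spurious membership effect that largely vanishes once log length is added as a covariate.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / m[i][i]
    return x


def fit_logistic(rows, y, iters=25):
    """Newton-Raphson logistic fit; every row starts with 1.0 for the intercept."""
    p = len(rows[0])
    beta = [0.0] * p
    for _ in range(iters):
        grad = [0.0] * p
        hess = [[0.0] * p for _ in range(p)]
        for x, yi in zip(rows, y):
            mu = sigmoid(sum(b * xi for b, xi in zip(beta, x)))
            w = mu * (1.0 - mu)
            for j in range(p):
                grad[j] += (yi - mu) * x[j]
                for k in range(p):
                    hess[j][k] += w * x[j] * x[k]
        beta = [b + s for b, s in zip(beta, solve(hess, grad))]
    return beta


# Synthetic genes (assumption): length affects membership and significance,
# but membership has no direct effect on significance.
rng = random.Random(0)
log_len, in_go, is_de = [], [], []
for _ in range(2000):
    ll = rng.uniform(-2.0, 2.0)  # centred log transcript length
    log_len.append(ll)
    in_go.append(1 if rng.random() < sigmoid(2.0 * ll) else 0)
    is_de.append(1 if rng.random() < sigmoid(1.5 * ll - 1.0) else 0)

naive = fit_logistic([[1.0, g] for g in in_go], is_de)
adjusted = fit_logistic([[1.0, g, ll] for g, ll in zip(in_go, log_len)], is_de)
print("membership coefficient without length:", round(naive[1], 2))
print("membership coefficient with length:   ", round(adjusted[1], 2))
```

    In practice a statistics package would replace the hand-written fit; the point of the sketch is only the comparison between the two membership coefficients.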

  1. Reliability analysis of cluster-based ad-hoc networks

    International Nuclear Information System (INIS)

    Cook, Jason L.; Ramirez-Marquez, Jose Emmanuel


    The mobile ad-hoc wireless network (MAWN) is a new and emerging network scheme that is being employed in a variety of applications. The MAWN varies from traditional networks because it is a self-forming and dynamic network. The MAWN is free of infrastructure and, as such, only the mobile nodes comprise the network. Pairs of nodes communicate either directly or through other nodes. To do so, each node acts, in turn, as a source, destination, and relay of messages. The virtue of a MAWN is the flexibility this provides; however, the challenge for reliability analyses is also brought about by this unique feature. The variability and volatility of the MAWN configuration makes typical reliability methods (e.g. reliability block diagram) inappropriate because no single structure or configuration represents all manifestations of a MAWN. For this reason, new methods are being developed to analyze the reliability of this new networking technology. New published methods adapt to this feature by treating the configuration probabilistically or by inclusion of embedded mobility models. This paper joins both methods together and expands upon these works by modifying the problem formulation to address the reliability analysis of a cluster-based MAWN. The cluster-based MAWN is deployed in applications with constraints on networking resources such as bandwidth and energy. This paper presents the problem's formulation, a discussion of applicable reliability metrics for the MAWN, and illustration of a Monte Carlo simulation method through the analysis of several example networks
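
    As a rough illustration of the Monte Carlo idea mentioned above, the sketch below treats the network configuration probabilistically: each link is assumed to exist independently with a given probability, and two-terminal reliability is estimated as the fraction of sampled configurations in which source and destination are connected. The topology, the link probabilities and the function names are hypothetical and are not taken from the paper, which also considers mobility models and other metrics.

```python
import random
from collections import deque


def connected(n_nodes, links, src, dst):
    """Breadth-first search over the links that are up in one sampled configuration."""
    adj = {i: [] for i in range(n_nodes)}
    for a, b in links:
        adj[a].append(b)
        adj[b].append(a)
    seen = {src}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return True
        for nxt in adj[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


def estimate_reliability(n_nodes, link_probs, src, dst, trials=20000, seed=1):
    """Estimate P(src can reach dst) when each link is up independently."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        up = [link for link, p in link_probs.items() if rng.random() < p]
        if connected(n_nodes, up, src, dst):
            hits += 1
    return hits / trials


# Hypothetical cluster-based layout: member 0 -> head 1 -> head 2 -> member 3.
cluster_links = {(0, 1): 0.9, (1, 2): 0.9, (2, 3): 0.9}
reliability = estimate_reliability(4, cluster_links, src=0, dst=3)

# A two-link chain has the closed-form value 0.9 * 0.9 = 0.81 for comparison.
chain = estimate_reliability(3, {(0, 1): 0.9, (1, 2): 0.9}, src=0, dst=2)
print(reliability, chain)
```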

  2. Shape Analysis of HII Regions - I. Statistical Clustering (United States)

    Campbell-White, Justyn; Froebrich, Dirk; Kume, Alfred


    We present here our shape analysis method for a sample of 76 Galactic HII regions from MAGPIS 1.4 GHz data. The main goal is to determine whether the physical properties and initial conditions of massive star cluster formation are linked to the shape of the regions. We outline a systematic procedure for extracting region shapes and perform hierarchical clustering on the shape data. We identified six groups that categorise HII regions by common morphologies. We confirmed the validity of these groupings by bootstrap re-sampling and the ordination technique multidimensional scaling. We then investigated associations between physical parameters and the assigned groups. Location is mostly independent of group, with a small preference for regions of similar longitudes to share common morphologies. The shapes are homogeneously distributed across Galactocentric distance and latitude. One group contains regions that are all younger than 0.5 Myr and ionised by low- to intermediate-mass sources. Those in another group are all driven by intermediate- to high-mass sources. One group was distinctly separated from the other five and contained regions at the surface brightness detection limit for the survey. We find that our hierarchical procedure is most sensitive to the spatial sampling resolution used, which is determined for each region from its distance. We discuss how these errors can be further quantified and reduced in future work by utilising synthetic observations from numerical simulations of HII regions. We also outline how this shape analysis has further applications to other diffuse astronomical objects.

  3. Time series clustering analysis of health-promoting behavior (United States)

    Yang, Chi-Ta; Hung, Yu-Shiang; Deng, Guang-Feng


    Health promotion must be emphasized to achieve the World Health Organization goal of health for all. Since the global population is aging rapidly, the ComCare elder health-promoting service was developed by the Taiwan Institute for Information Industry in 2011. Based on the Pender health promotion model, the ComCare service offers five categories of health-promoting functions to address the everyday needs of seniors: nutrition management, social support, exercise management, health responsibility, and stress management. To assess the overall ComCare service and to improve understanding of the health-promoting behavior of elders, this study analyzed health-promoting behavioral data automatically collected by the ComCare monitoring system. In the 30,638 session records collected for 249 elders from January 2012 to March 2013, behavior patterns were identified by a fuzzy c-means time series clustering algorithm combined with autocorrelation-based representation schemes. The analysis showed that time series data for elder health-promoting behavior can be classified into four different clusters. Each type reveals different health-promoting needs, frequencies, function numbers and behaviors. The results can assist policymakers, health-care providers, and experts in medicine, public health, nursing and psychology, and have been provided to the Taiwan National Health Insurance Administration to assess elder health-promoting behavior.

  4. Cluster, adaptation and extroversion : a cognitive and entrepreneurial analysis of the Marche music cluster

    NARCIS (Netherlands)

    Tappi, D.


    Over recent decades, clusters like industrial districts have increasingly attracted attention in economic debate. The study of clusters, particularly in the Italian literature, highlights the inadequacy of the mainstream body of explanation to provide a theory of the emergence and transformation

  5. Phenotypes Determined by Cluster Analysis in Moderate to Severe Bronchial Asthma. (United States)

    Youroukova, Vania M; Dimitrova, Denitsa G; Valerieva, Anna D; Lesichkova, Spaska S; Velikova, Tsvetelina V; Ivanova-Todorova, Ekaterina I; Tumangelova-Yuzeir, Kalina D


    Bronchial asthma is a heterogeneous disease that includes various subtypes. They may share similar clinical characteristics, but probably have different pathological mechanisms. The aim was to identify phenotypes using cluster analysis in moderate to severe bronchial asthma and to compare differences in clinical, physiological, immunological and inflammatory data between the clusters. Forty adult patients with moderate to severe bronchial asthma out of exacerbation were included. All underwent clinical assessment, anthropometric measurements, skin prick testing, standard spirometry and measurement of the fraction of exhaled nitric oxide. Blood eosinophil count, serum total IgE and periostin levels were determined. A two-step cluster approach, a hierarchical clustering method and k-means analysis were used to identify the clusters. We identified four clusters: Cluster 1 (n=14), late-onset, non-atopic asthma with impaired lung function; Cluster 2 (n=13), late-onset, atopic asthma; Cluster 3 (n=6), late-onset, aspirin-sensitive, eosinophilic asthma; and Cluster 4 (n=7), early-onset, atopic asthma. Our study is the first in Bulgaria in which cluster analysis is applied to asthmatic patients. The variables with the greatest power for differentiation in our study were: age of asthma onset, duration of disease, atopy, smoking, blood eosinophils, nonsteroidal anti-inflammatory drug hypersensitivity, baseline FEV1/FVC and symptom severity. Our results support the concept of heterogeneity of bronchial asthma and demonstrate that cluster analysis can be a useful tool for phenotyping the disease and for a personalized approach to the treatment of patients.

  6. Assessment of genetic divergence in tomato through agglomerative hierarchical clustering and principal component analysis

    International Nuclear Information System (INIS)

    Iqbal, Q.; Saleem, M.Y.; Hameed, A.; Asghar, M.


    For the improvement of qualitative and quantitative traits, the existence of variability is of prime importance in plant breeding. Data on different morphological and reproductive traits of 47 tomato genotypes were analyzed by correlation, agglomerative hierarchical clustering and principal component analysis (PCA) to select genotypes and traits for a future breeding program. Correlation analysis revealed a significant positive association between yield and yield components such as fruit diameter, single fruit weight and number of fruits per plant. Principal component (PC) analysis yielded three PCs with eigenvalues higher than 1, contributing 81.72% of the total variability for different traits. PC-I showed positive factor loadings for all traits except number of fruits per plant. The contribution of single fruit weight and fruit diameter was highest in PC-I. Cluster analysis grouped all genotypes into five divergent clusters. The genotypes in cluster-II and cluster-V exhibited uniform maturity and higher yield. The D2 statistic confirmed the highest distance between cluster-III and cluster-V, while maximum similarity was observed between cluster-II and cluster-III. It is therefore suggested that crosses between genotypes of cluster-II and cluster-V with those of cluster-I and cluster-III may exhibit heterosis in F1 for hybrid breeding and for selection of superior genotypes in succeeding generations of a cross-breeding programme. (author)

  7. Transcriptomic analysis of the effects of a fish oil enriched diet on murine brains.

    Directory of Open Access Journals (Sweden)

    Rasha Hammamieh

    The health benefits of fish oil enriched with high omega-3 polyunsaturated fatty acids (n-3 PUFA) are widely documented. Fish oil dietary supplements, however, show moderate clinical efficacy, highlighting an immediate scope of systematic in vitro feedback. Our transcriptomic study was designed to investigate the genomic shift of murine brains fed on fish oil enriched diets. A customized fish oil enriched diet (FD) and standard lab diet (SD) were separately administered to two randomly chosen populations of C57BL/6J mice from their weaning age until late adolescence. Statistical analysis mined 1,142 genes of interest (GOI) differentially altered in the hemibrains collected from the FD- and SD-fed mice at the age of five months. The majority of identified GOI (∼40%) encode proteins located in the plasma membrane, suggesting that fish oil primarily facilitated the membrane-oriented biofunctions. FD potentially augmented the nervous system's development and functions by selectively stimulating the Src-mediated calcium-induced growth cascade and the downstream PI3K-AKT-PKC pathways. FD reduced the amyloidal burden, attenuated oxidative stress, and assisted in somatostatin activation, the signatures of attenuation of Alzheimer's disease, Parkinson's disease, and affective disorder. FD induced elevation of FKBP5 and suppression of BDNF, which are often linked with the improvement of anxiety disorder, depression, and post-traumatic stress disorder. Hence we anticipate efficacy of FD in treating illnesses such as depression that are typically triggered by the hypoactivities of dopaminergic, adrenergic, cholinergic, and GABAergic networks. In contrast, FD's efficacy could be compromised in treating illnesses such as bipolar disorder and schizophrenia, which are triggered by hyperactivities of the same set of neuromodulators. A more comprehensive investigation is recommended to elucidate the implications of fish oil on disease pathomechanisms, and the

  8. Sensitization trajectories in childhood revealed by using a cluster analysis

    DEFF Research Database (Denmark)

    Schoos, Ann-Marie M.; Chawes, Bo L.; Melen, Erik


    BACKGROUND: Assessment of sensitization at a single time point during childhood provides limited clinical information. We hypothesized that sensitization develops as specific patterns with respect to age at debut, development over time, and involved allergens and that such patterns might be more biologically and clinically relevant. OBJECTIVE: We sought to explore latent patterns of sensitization during the first 6 years of life and investigate whether such patterns associate with the development of asthma, rhinitis, and eczema. METHODS: We investigated 398 children from the at-risk Copenhagen Prospective Studies on Asthma in Childhood 2000 (COPSAC2000) birth cohort with specific IgE against 13 common food and inhalant allergens at the ages of ½, 1½, 4, and 6 years. An unsupervised cluster analysis for 3-dimensional data (nonnegative sparse parallel factor analysis) was used to extract latent...

  9. Integrating PROOF Analysis in Cloud and Batch Clusters

    International Nuclear Information System (INIS)

    Rodríguez-Marrero, Ana Y; Fernández-del-Castillo, Enol; López García, Álvaro; Marco de Lucas, Jesús; Matorras Weinig, Francisco; González Caballero, Isidro; Cuesta Noriega, Alberto


    High Energy Physics (HEP) analyses are becoming more complex and demanding due to the large amount of data collected by the current experiments. The Parallel ROOT Facility (PROOF) provides researchers with an interactive tool to speed up the analysis of huge volumes of data by exploiting parallel processing on both multicore machines and computing clusters. The typical PROOF deployment scenario is a permanent set of cores configured to run the PROOF daemons. However, this approach is incapable of adapting to the dynamic nature of interactive usage. Several initiatives seek to improve the use of computing resources by integrating PROOF with a batch system, such as Proof on Demand (PoD) or PROOF Cluster. These solutions are currently in production at Universidad de Oviedo and IFCA and are positively evaluated by users. Although they are able to adapt to the computing needs of users, they must comply with the specific configuration, OS and software installed at the batch nodes. Furthermore, they share the machines with other workloads, which may cause disruptions in the interactive service for users. These limitations make PROOF a typical use case for cloud computing. In this work we take advantage of the Cloud Infrastructure at IFCA to provide a dynamic PROOF environment where users can control the software configuration of the machines. The Proof Analysis Framework (PAF) facilitates the development of new analyses and offers transparent access to PROOF resources. Several performance measurements are presented for the different scenarios (PoD, SGE and Cloud), showing a speed improvement closely correlated with the number of cores used.

  10. Determining wood chip size: image analysis and clustering methods

    Directory of Open Access Journals (Sweden)

    Paolo Febbi


    One of the standard methods for the determination of the size distribution of wood chips is the oscillating screen method (EN 15149-1:2010). Recent literature demonstrated how image analysis could return highly accurate measurements of the dimensions defined for each individual particle, and could promote a new method based on geometrical shape to determine chip size more accurately. A sample of wood chips (8 litres) was sieved through horizontally oscillating sieves, using five different screen hole diameters (3.15, 8, 16, 45, 63 mm); the wood chips were sorted into decreasing size classes and the mass of all fractions was used to determine the size distribution of the particles. Since chip shape and size influence the sieving results, Wang's theory, which concerns geometric forms, was considered. A cluster analysis on the shape descriptors (Fourier descriptors) and size descriptors (area, perimeter, Feret diameters, eccentricity) was applied to observe the chip distribution. The UPGMA algorithm was applied to the Euclidean distances. The resulting dendrogram shows a group separation in accordance with the original three sieving fractions. A comparison was made between the traditional sieving and clustering results. This preliminary result shows that the image analysis-based method has high potential for the characterization of wood chip size distribution and could be further investigated. Moreover, this method could be implemented in an online detection machine for chip size characterization. An improvement of the results is expected from supervised multivariate methods that utilize known class memberships. The main objective of future activities will be to shift the analysis from a 2-dimensional method to a 3-dimensional acquisition process.
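
    For readers unfamiliar with UPGMA on Euclidean distance, a minimal sketch follows. It is an illustration only: the two-component descriptor vectors are made up, and the function name `upgma` is hypothetical. UPGMA (average linkage) repeatedly merges the two clusters whose mean pairwise distance is smallest; here merging stops once a requested number of clusters remains, rather than building the full dendrogram.

```python
import math
from itertools import combinations


def upgma(points, k):
    """Average-linkage agglomeration on Euclidean distance, stopped at k clusters."""
    clusters = [[i] for i in range(len(points))]

    def avg_dist(a, b):
        # Unweighted mean of all pairwise distances between members of a and b.
        total = sum(math.dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > k:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: avg_dist(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return sorted(sorted(c) for c in clusters)


# Hypothetical (size, shape) descriptors for six chips in three size classes.
chips = [(1.0, 1.0), (1.2, 0.9), (5.0, 5.0), (5.1, 5.2), (9.0, 1.0), (9.2, 1.1)]
groups = upgma(chips, k=3)
print(groups)
```

    In real use the descriptors would normally be standardized first, since area, perimeter and eccentricity are on very different scales.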

  11. Spectral analysis of the He-enriched sdO-star HD 127493 (United States)

    Dorsch, Matti; Latour, Marilyn; Heber, Ulrich


    The bright sdO star HD127493 is known to be of mixed H/He composition and excellent archival spectra covering both optical and ultraviolet ranges are available. UV spectra play a key role as they give access to many chemical species that do not show spectral lines in the optical, such as iron and nickel. This encouraged the quantitative spectral analysis of this prototypical mixed H/He composition sdO star. We determined atmospheric parameters for HD127493 in addition to the abundance of C, N, O, Si, S, Fe, and Ni in the atmosphere using non-LTE model atmospheres calculated with TLUSTY/SYNSPEC. A comparison between the parallax distance measured by Hipparcos and the derived spectroscopic distance indicates that the derived atmospheric parameters are realistic. From our metal abundance analysis, we find a strong CNO signature and enrichment in iron and nickel.

  12. Cluster analysis of received constellations for optical performance monitoring

    NARCIS (Netherlands)

    van Weerdenburg, J.J.A.; van Uden, R.; Sillekens, E.; de Waardt, H.; Koonen, A.M.J.; Okonkwo, C.


    Performance monitoring based on centroid clustering to investigate constellation generation offsets. The tool allows flexibility in constellation generation tolerances by forwarding centroids to the demapper. The relation of fibre nonlinearities and singular value decomposition of intra-cluster

  13. The composite sequential clustering technique for analysis of multispectral scanner data (United States)

    Su, M. Y.


    The clustering technique consists of two parts: (1) a sequential statistical clustering which is essentially a sequential variance analysis, and (2) a generalized K-means clustering. In this composite clustering technique, the output of (1) is a set of initial clusters which are input to (2) for further improvement by an iterative scheme. This unsupervised composite technique was employed for automatic classification of two sets of remote multispectral earth resource observations. The classification accuracy by the unsupervised technique is found to be comparable to that by traditional supervised maximum likelihood classification techniques. The mathematical algorithms for the composite sequential clustering program and a detailed computer program description with job setup are given.
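
    The two-stage structure described above can be sketched as follows. The abstract does not specify the sequential variance analysis, so stage 1 here is a plain distance-threshold pass swapped in as a stand-in; only the hand-off of initial clusters to an iterative K-means refinement reflects the description. The function names, the threshold and the sample values are hypothetical.

```python
import math


def sequential_seed(samples, threshold):
    """Stage 1 stand-in: one pass that opens a new cluster when a sample
    lies farther than `threshold` from every running cluster mean."""
    means, counts = [], []
    for s in samples:
        k = None
        if means:
            k = min(range(len(means)), key=lambda i: math.dist(s, means[i]))
        if k is None or math.dist(s, means[k]) > threshold:
            means.append(tuple(s))
            counts.append(1)
        else:
            counts[k] += 1
            # Incremental update of the running mean of cluster k.
            means[k] = tuple(m + (v - m) / counts[k] for m, v in zip(means[k], s))
    return means


def kmeans(samples, centers, iters=20):
    """Stage 2: K-means refinement starting from the supplied centers."""
    centers = list(centers)
    for _ in range(iters):
        groups = [[] for _ in centers]
        for s in samples:
            k = min(range(len(centers)), key=lambda i: math.dist(s, centers[i]))
            groups[k].append(s)
        centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else c
            for g, c in zip(groups, centers)
        ]
    labels = [
        min(range(len(centers)), key=lambda i: math.dist(s, centers[i]))
        for s in samples
    ]
    return centers, labels


# Made-up two-band pixel values standing in for multispectral samples.
pixels = [(0.1, 0.2), (0.0, 0.0), (5.0, 5.1), (0.2, 0.1),
          (5.2, 4.9), (9.0, 0.1), (9.1, 0.0)]
seeds = sequential_seed(pixels, threshold=2.0)
centers, labels = kmeans(pixels, seeds)
print(len(seeds), labels)
```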

  14. Genetic Diversity and Relationships of Neolamarckia cadamba (Roxb.) Bosser progenies through cluster analysis

    Directory of Open Access Journals (Sweden)

    M. Preethi Shree


    Genetic diversity analysis was conducted for biometric attributes in 20 progenies of Neolamarckia cadamba. Applying the D2 clustering technique to Neolamarckia cadamba genetic resources resolved the 20 progenies into five clusters. The maximum intra-cluster distance was shown by cluster II. The maximum inter-cluster distance was recorded between clusters III and V, which indicated the presence of wider genetic distance between Neolamarckia cadamba progenies. Among the growth attributes, volume (36.84%) contributed the most to genetic divergence, followed by bole height, basal diameter, tree height and number of branches.

  15. Multivariate analysis of the heterogeneous geochemical processes controlling arsenic enrichment in a shallow groundwater system. (United States)

    Huang, Shuangbing; Liu, Changrong; Wang, Yanxin; Zhan, Hongbin


    The effects of various geochemical processes on arsenic enrichment in a high-arsenic aquifer at Jianghan Plain in Central China were investigated using multivariate models developed from a combined adaptive neuro-fuzzy inference system (ANFIS) and multiple linear regression (MLR). The results indicated that the optimum variable group for the ANFIS model consisted of bicarbonate, ammonium, phosphorus, iron, manganese, fluorescence index, pH, and siderite saturation. These data suggest that reductive dissolution of iron/manganese oxides, phosphate-competitive adsorption, pH-dependent desorption, and siderite precipitation could integrally affect arsenic concentration. Analysis of the MLR models indicated that reductive dissolution of iron(III) was primarily responsible for arsenic mobilization in groundwaters with low arsenic concentration. By contrast, for groundwaters with high arsenic concentration (i.e., > 170 μg/L), reductive dissolution of iron oxides approached a dynamic equilibrium. As the groundwater chemistry evolved, desorption driven by phosphate-competitive adsorption and by the increase in pH contributed more to arsenic enrichment than iron(III) reductive dissolution did. The inhibition effect of siderite precipitation on arsenic mobilization was expected to exist in groundwater that was highly saturated with siderite. The results suggest an evolutionary dominance of specific geochemical processes over other factors controlling arsenic concentration, which presented a heterogeneous distribution in aquifers. Supplemental materials are available for this article. Go to the publisher's online edition of the Journal of Environmental Science and Health, Part A, to view the supplemental file.

  16. QTL global meta-analysis: are trait determining genes clustered?

    Directory of Open Access Journals (Sweden)

    Adelson David L


    Background: A key open question in biology is whether genes are physically clustered with respect to their known functions or phenotypic effects. This is of particular interest for Quantitative Trait Loci (QTL), where a QTL region could contain a number of genes that contribute to the trait being measured. Results: We observed a significant increase in gene density within QTL regions compared to non-QTL regions and/or the entire bovine genome. By grouping QTL from the Bovine QTL Viewer database into 8 categories of non-redundant regions, we have been able to analyze gene density and gene function distribution, based on Gene Ontology (GO), in relation to their location within QTL regions, outside of QTL regions and across the entire bovine genome. We identified a number of GO terms that were significantly over-represented within particular QTL categories. Furthermore, select GO terms expected to be associated with the QTL category based on common biological knowledge also proved to be significantly over-represented in QTL regions. Conclusion: Our analysis provides evidence of over-represented GO terms in QTL regions. This increased GO term density indicates possible clustering of gene functions within QTL regions of the bovine genome. Genes with similar functions may be grouped in specific locales and could be contributing to QTL traits. Moreover, we have identified over-represented GO terminology that, from a biological standpoint, makes sense with respect to QTL category type.

  17. Cluster decay analysis and related structure effects of fissionable ...

    Indian Academy of Sciences (India)


    Aug 1, 2015 ... Collective clusterization approach of dynamical cluster decay model (DCM) has been ... fusion–fission process resulting in the emission of symmetric and/or ... represents the relative separation distance between two fragments or clusters ... decay constant λ or decay half-life T1/2 is defined as λ = (ln 2/T1/2) ...
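
    The relation quoted in this snippet, λ = ln 2 / T1/2, can be checked numerically. The sketch below pairs it with the standard exponential decay law N(t)/N0 = exp(-λt), which is textbook physics rather than anything specific to the dynamical cluster decay model; the function names are illustrative.

```python
import math


def decay_constant(half_life):
    """Decay constant lambda = ln(2) / T_half, in inverse units of half_life."""
    return math.log(2) / half_life


def surviving_fraction(half_life, elapsed):
    """Fraction of nuclei left after `elapsed` time: exp(-lambda * t)."""
    return math.exp(-decay_constant(half_life) * elapsed)
```

    After exactly one half-life the surviving fraction is one half, which is a quick sanity check on the definition.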

  18. Maximum-entropy clustering algorithm and its global convergence analysis

    Institute of Scientific and Technical Information of China (English)


    Constructing a batch of differentiable entropy functions to uniformly approximate an objective function by means of the maximum-entropy principle, a new clustering algorithm, called maximum-entropy clustering algorithm, is proposed based on optimization theory. This algorithm is a soft generalization of the hard C-means algorithm and possesses global convergence. Its relations with other clustering algorithms are discussed.
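
    The abstract gives no formulas, so the following is only a sketch of the textbook form of maximum-entropy (soft) clustering, not necessarily the authors' algorithm. It assumes Gibbs memberships proportional to exp(-beta * squared distance) and membership-weighted center updates; as beta grows the memberships approach the hard C-means assignment. All names and the parameter beta are illustrative.

```python
import math


def soft_memberships(points, centers, beta):
    """Row i holds the memberships of point i; each row sums to 1."""
    rows = []
    for p in points:
        d2 = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
        shift = min(d2)  # subtract the smallest distance for numerical stability
        w = [math.exp(-beta * (d - shift)) for d in d2]
        total = sum(w)
        rows.append([x / total for x in w])
    return rows


def update_centers(points, memberships, n_clusters):
    """Each center becomes the membership-weighted mean of all points."""
    dim = len(points[0])
    centers = []
    for k in range(n_clusters):
        # Assumes every cluster keeps non-zero total weight (true for moderate beta).
        weight = sum(u[k] for u in memberships)
        centers.append(tuple(
            sum(u[k] * p[d] for u, p in zip(memberships, points)) / weight
            for d in range(dim)
        ))
    return centers


def max_entropy_clustering(points, centers, beta=1.0, n_iter=50):
    """Alternate soft assignment and center update for a fixed number of steps."""
    for _ in range(n_iter):
        u = soft_memberships(points, centers, beta)
        centers = update_centers(points, u, len(centers))
    return centers, soft_memberships(points, centers, beta)
```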

  19. Hybrid Tracking Algorithm Improvements and Cluster Analysis Methods. (United States)


    UPGMA ), and Ward’s method. Ling’s papers describe a (k,r) clustering method. Each of these methods have individual characteristics which make them...Reference 7), UPGMA is probably the most frequently used clustering strategy. UPGMA tries to group new points into an existing cluster by using an


    Directory of Open Access Journals (Sweden)



    Multiple correspondence analysis is a method that makes categorical variables given in contingency tables easier to interpret, showing the similarities, associations and divergences among these variables graphically in a lower-dimensional space. Clustering methods help classify grouped data according to their similarities and yield useful summaries of the data. In this study, interpretations of multiple correspondence analysis are supported by cluster analysis; factors affecting the referred health institution, such as age, disease group and health insurance, are examined, and the results of the two methods are compared.

  1. MMPI profiles of males accused of severe crimes: a cluster analysis

    NARCIS (Netherlands)

    Spaans, M.; Barendregt, M.; Muller, E.; Beurs, E. de; Nijman, H.L.I.; Rinne, T.


    In studies attempting to classify criminal offenders by cluster analysis of Minnesota Multiphasic Personality Inventory-2 (MMPI-2) data, the number of clusters found varied between 10 (the Megargee System) and two (one cluster indicating no psychopathology and one exhibiting serious

  2. Cluster analysis of rural, urban, and curbside atmospheric particle size data. (United States)

    Beddows, David C S; Dall'Osto, Manuel; Harrison, Roy M


    Particle size is a key determinant of the hazard posed by airborne particles. Continuous multivariate particle size data have been collected using aerosol particle size spectrometers sited at four locations within the UK: Harwell (Oxfordshire); Regents Park (London); British Telecom Tower (London); and Marylebone Road (London). These data have been analyzed using k-means cluster analysis, deduced to be the preferred cluster analysis technique, selected from four partitional cluster packages, namely Fuzzy, k-means, k-median, and Model-Based clustering. Using cluster validation indices, k-means clustering was shown to produce clusters with the smallest size, furthest separation, and importantly the highest degree of similarity between the elements within each partition. Using k-means clustering, the complexity of the data set is reduced, allowing characterization of the data according to the temporal and spatial trends of the clusters. At Harwell, the rural background measurement site, the cluster analysis showed that the spectra may be differentiated by their modal diameters and average temporal trends showing either high counts during the day-time or night-time hours. Likewise for the urban sites, the cluster analysis differentiated the spectra into a small number of size distributions according to their modal diameter, the location of the measurement site, and time of day. The responsible aerosol emission, formation, and dynamic processes can be inferred according to the cluster characteristics and correlation to concurrently measured meteorological, gas phase, and particle phase measurements.
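
    For readers unfamiliar with k-means, the following is a minimal sketch of the standard Lloyd iteration, not the software package used in the study. It assumes each particle size spectrum is represented as a tuple of numbers and that the initial centers are supplied by the caller; the function names are illustrative.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centers):
    """Index of the nearest center for every point."""
    return [
        min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
        for p in points
    ]


def kmeans(points, init_centers, n_iter=100):
    """Alternate assignment and mean update until the centers stop moving."""
    centers = [tuple(c) for c in init_centers]
    for _ in range(n_iter):
        labels = assign(points, centers)
        new_centers = []
        for j, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centers.append(old)  # keep an empty cluster's center as is
        if new_centers == centers:
            break
        centers = new_centers
    return centers, assign(points, centers)
```

    The result depends on the initial centers, which is one reason studies such as this one rely on cluster validation indices when comparing solutions.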

  3. Cluster analysis of spontaneous preterm birth phenotypes identifies potential associations among preterm birth mechanisms. (United States)

    Esplin, M Sean; Manuck, Tracy A; Varner, Michael W; Christensen, Bryce; Biggio, Joseph; Bukowski, Radek; Parry, Samuel; Zhang, Heping; Huang, Hao; Andrews, William; Saade, George; Sadovsky, Yoel; Reddy, Uma M; Ilekis, John


    We sought to use an innovative tool that is based on common biologic pathways to identify specific phenotypes among women with spontaneous preterm birth (SPTB) to enhance investigators' ability to identify and to highlight common mechanisms and underlying genetic factors that are responsible for SPTB. We performed a secondary analysis of a prospective case-control multicenter study of SPTB. All cases delivered a preterm singleton at SPTB ≤34.0 weeks' gestation. Each woman was assessed for the presence of underlying SPTB causes. A hierarchic cluster analysis was used to identify groups of women with homogeneous phenotypic profiles. One of the phenotypic clusters was selected for candidate gene association analysis with the use of VEGAS software. One thousand twenty-eight women with SPTB were assigned phenotypes. Hierarchic clustering of the phenotypes revealed 5 major clusters. Cluster 1 (n = 445) was characterized by maternal stress; cluster 2 (n = 294) was characterized by premature membrane rupture; cluster 3 (n = 120) was characterized by familial factors, and cluster 4 (n = 63) was characterized by maternal comorbidities. Cluster 5 (n = 106) was multifactorial and characterized by infection (INF), decidual hemorrhage (DH), and placental dysfunction (PD). These 3 phenotypes were correlated highly by χ² analysis (PD and DH, P cluster 3 of SPTB. We identified 5 major clusters of SPTB based on a phenotype tool and hierarchic clustering. There was significant correlation between several of the phenotypes. The INS gene was associated with familial factors that were underlying SPTB. Copyright © 2015 Elsevier Inc. All rights reserved.

  4. Uranium enrichment. Enrichment processes

    International Nuclear Information System (INIS)

    Alexandre, M.; Quaegebeur, J.P.


    Despite the remarkable progress made in the diversity and efficiency of the different uranium enrichment processes, only two industrial processes remain today which satisfy all enriched uranium needs: gaseous diffusion and centrifugation. This article describes both processes and some others still at the demonstration or laboratory stage of development: 1 - general considerations; 2 - gaseous diffusion: physical principles, implementation, utilisation in the world; 3 - centrifugation: principles, elementary separation factor, flows inside a centrifuge, modeling of separation efficiencies, mechanical design, types of industrial centrifuges, realisation of cascades, main characteristics of the centrifugation process; 4 - aerodynamic processes: vortex process, nozzle process; 5 - chemical exchange separation processes: Japanese ASAHI process, French CHEMEX process; 6 - laser-based processes: SILVA process, SILMO process; 7 - electromagnetic and ionic processes: mass spectrometer and calutron, ion cyclotron resonance, rotating plasmas; 8 - thermal diffusion; 9 - conclusion. (J.S.)

  5. The relationship between supplier networks and industrial clusters: an analysis based on the cluster mapping method

    Directory of Open Access Journals (Sweden)

    Ichiro IWASAKI


    Michael Porter’s concept of competitive advantages emphasizes the importance of regional cooperation of various actors in order to gain competitiveness on globalized markets. Foreign investors may play an important role in forming such cooperation networks. Their local suppliers tend to concentrate regionally. They can form, together with local institutions of education, research, financial and other services, development agencies, the nucleus of cooperative clusters. This paper deals with the relationship between supplier networks and clusters. Two main issues are discussed in more detail: the interest of multinational companies in entering regional clusters and the spillover effects that may stem from their participation. After the discussion on the theoretical background, the paper introduces a relatively new analytical method: “cluster mapping” - a method that can spot regional hot spots of specific economic activities with cluster building potential. Experience with the method was gathered in the US and in the European Union. After the discussion on the existing empirical evidence, the authors introduce their own cluster mapping results, which they obtained by using a refined version of the original methodology.

  6. Higgs Pair Production: Choosing Benchmarks With Cluster Analysis

    CERN Document Server

    Carvalho, Alexandra; Dorigo, Tommaso; Goertz, Florian; Gottardo, Carlo A.; Tosi, Mia


    New physics theories often depend on a large number of free parameters. The precise values of those parameters in some cases drastically affect the resulting phenomenology of fundamental physics processes, while in others finite variations can leave it basically invariant at the level of detail experimentally accessible. When designing a strategy for the analysis of experimental data in the search for a signal predicted by a new physics model, it appears advantageous to categorize the parameter space describing the model according to the corresponding kinematical features of the final state. A multi-dimensional test statistic can be used to gauge the degree of similarity in the kinematics of different models; a clustering algorithm using that metric may then allow the division of the space into homogeneous regions, each of which can be successfully represented by a benchmark point. Searches targeting those benchmark points are then guaranteed to be sensitive to a large area of the parameter space. In this doc...

  7. Early repositioning through compound set enrichment analysis: a knowledge-recycling strategy. (United States)

    Temesi, Gergely; Bolgár, Bence; Arany, Adám; Szalai, Csaba; Antal, Péter; Mátyus, Péter


    Despite famous serendipitous drug repositioning success stories, systematic projects have not yet delivered the expected results. However, repositioning technologies are gaining ground in different phases of routine drug development, together with new adaptive strategies. We demonstrate the power of the compound information pool, the ever-growing heterogeneous information repertoire of approved drugs and candidates as an invaluable catalyzer in this transition. Systematic, computational utilization of this information pool for candidates in early phases is an open research problem; we propose a novel application of the enrichment analysis statistical framework for fusion of this information pool, specifically for the prediction of indications. Pharmaceutical consequences are formulated for a systematic and continuous knowledge recycling strategy, utilizing this information pool throughout the drug-discovery pipeline.

  8. Vibration signature analysis of compressors in the gaseous diffusion process for uranium enrichment

    International Nuclear Information System (INIS)

    Harbarger, W.B.


    Continuous operation of several thousand axial-flow and centrifugal compressors is vital to the gaseous diffusion process for uranium enrichment. Vibration signature analysis using a minicomputer-based Fast Fourier Transform Analyzer is being applied to the evaluation and surveillance of compressor performance at the Portsmouth Gaseous Diffusion Plant. Three areas of application include: (1) new blade design and prototype compressor evaluation; (2) corrective and preventive maintenance of machinery components; and (3) evaluation of machinery health. The present system is being used to monitor signals from accelerometers mounted on the load-bearing housings of 16 on-line compressors. These signals are transmitted by hard-wire to the analyzer for daily monitoring. A program for expansion of this system to monitor more than a thousand compressors and automation of the signature comparison process is planned for all three gaseous diffusion plants operated for the United States Energy Research and Development Administration. (auth)
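
    To illustrate what a vibration signature is, the sketch below computes an amplitude spectrum from accelerometer samples and picks the dominant frequency. It uses a naive discrete Fourier transform for clarity, not the Fast Fourier Transform hardware described in the abstract, and the function names and sampling parameters are assumptions made for the example.

```python
import cmath
import math


def amplitude_spectrum(samples, sample_rate):
    """Return (frequency, amplitude) pairs for bins 0 .. n/2 using a plain DFT."""
    n = len(samples)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(samples)
        )
        spectrum.append((k * sample_rate / n, abs(coeff) / n))
    return spectrum


def dominant_frequency(samples, sample_rate):
    """Frequency of the largest non-DC spectral peak."""
    spectrum = amplitude_spectrum(samples, sample_rate)
    return max(spectrum[1:], key=lambda pair: pair[1])[0]
```

    Signature comparison, as planned in the abstract, would then amount to comparing such spectra for the same compressor over time; the comparison rule itself is not described in the source.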

  9. Microarray Cluster Analysis of Irradiated Growth Plate Zones Following Laser Microdissection

    International Nuclear Information System (INIS)

    Damron, Timothy A.; Zhang Mingliang; Pritchard, Meredith R.; Middleton, Frank A.; Horton, Jason A.; Margulies, Bryan M.; Strauss, Judith A.; Farnum, Cornelia E.; Spadaro, Joseph A.


    Purpose: Genes and pathways involved in early growth plate chondrocyte recovery after fractionated irradiation were sought as potential targets for selective radiorecovery modulation. Materials and Methods: Three groups of six 5-week male Sprague-Dawley rats underwent fractionated irradiation to the right tibiae over 5 days, totaling 17.5 Gy, and then were killed at 7, 11, and 16 days after the first radiotherapy fraction. The growth plates were collected from the proximal tibiae bilaterally and subsequently underwent laser microdissection to separate reserve, perichondral, proliferative, and hypertrophic zones. Differential gene expression was analyzed between irradiated right and nonirradiated left tibia using RAE230 2.0 GeneChip microarray, compared between zones and time points and subjected to functional pathway cluster analysis with real-time polymerase chain reaction to confirm selected results. Results: Each zone had a number of pathways showing enrichment after the pattern of hypothesized importance to growth plate recovery, yet few met the strictest criteria. The proliferative and hypertrophic zones showed both the greatest number of genes with a 10-fold right/left change at 7 days after initiation of irradiation and enrichment of the most functional pathways involved in bone, cartilage, matrix, or skeletal development. Six genes confirmed by real-time polymerase chain reaction to have early upregulation included insulin-like growth factor 2, procollagen type I alpha 2, matrix metallopeptidase 9, parathyroid hormone receptor 1, fibromodulin, and aggrecan 1. Conclusions: Nine overlapping pathways in the proliferative and hypertrophic zones (skeletal development, ossification, bone remodeling, cartilage development, extracellular matrix structural constituent, proteinaceous extracellular matrix, collagen, extracellular matrix, and extracellular matrix part) may play key roles in early growth plate radiorecovery.

  10. Performance analysis of clustering techniques over microarray data: A case study (United States)

    Dash, Rasmita; Misra, Bijan Bihari


    Handling big data is one of the major issues in statistical data analysis, and cluster analysis plays a vital role in dealing with such large-scale data. Many clustering techniques exist, each with a different approach, but it is difficult to predict which one suits a particular dataset. To address this problem, a grading approach is applied across many clustering techniques to identify a stable one. Because the grading depends on the characteristics of the dataset as well as on the validity indices, a two-stage grading approach is implemented. In this study the grading approach is applied to five clustering techniques: hybrid swarm based clustering (HSC), k-means, partitioning around medoids (PAM), vector quantization (VQ) and agglomerative nesting (AGNES). The experiments are conducted on five microarray datasets with seven validity indices. The significance of the technique selected by the grading approach is also confirmed by the Nemenyi post-hoc hypothesis test.

  11. Uranium enrichment

    International Nuclear Information System (INIS)


    This paper reports that in 1990 the Department of Energy began a two-year project to illustrate the technical and economic feasibility of a new uranium enrichment technology, the atomic vapor laser isotope separation (AVLIS) process. GAO believes that completing the AVLIS demonstration project will provide valuable information about the technical viability and cost of building an AVLIS plant and will keep future plant construction options open. However, Congress should be aware that DOE still needs to adequately demonstrate AVLIS with full-scale equipment and develop convincing cost projections. Program activities, such as the plant-licensing process, that must be completed before a plant is built, could take many years. Further, an updated and expanded uranium enrichment analysis will be needed before any decision is made about building an AVLIS plant. GAO, which has long supported legislation that would restructure DOE's uranium enrichment program as a government corporation, encourages DOE's goal of transferring AVLIS to the corporation. This could reduce the government's financial risk and help ensure that the decision to build an AVLIS plant is based on commercial concerns. DOE, however, has no alternative plans should the government corporation not be formed. Further, by curtailing a planned public access program, which would have given private firms an opportunity to learn about the technology during the demonstration project, DOE may limit its ability to transfer AVLIS to the private sector.

  12. SNP-based pathway enrichment analysis for genome-wide association studies

    Directory of Open Access Journals (Sweden)

    Potkin Steven G


    Background: Recently we have witnessed a surge of interest in using genome-wide association studies (GWAS) to discover the genetic basis of complex diseases. Many genetic variations, mostly in the form of single nucleotide polymorphisms (SNPs), have been identified in a wide spectrum of diseases, including diabetes, cancer, and psychiatric diseases. A common theme arising from these studies is that the genetic variations discovered by GWAS can only explain a small fraction of the genetic risks associated with the complex diseases. New strategies and statistical approaches are needed to address this lack of explanation. One such approach is pathway analysis, which considers the genetic variations underlying a biological pathway jointly, rather than separately as in traditional GWAS. A critical challenge in pathway analysis is how to combine evidence of association over multiple SNPs within a gene and multiple genes within a pathway. Most current methods choose the most significant SNP from each gene as a representative, ignoring the joint action of multiple SNPs within a gene. This approach leads to preferential identification of genes with a greater number of SNPs. Results: We describe a SNP-based pathway enrichment method for GWAS. The method consists of the following two main steps: (1) for a given pathway, using an adaptive truncated product statistic to identify all representative (potentially more than one) SNPs of each gene, calculating the average number of representative SNPs for the genes, then re-selecting the representative SNPs of genes in the pathway based on this number; and (2) ranking all selected SNPs by the significance of their statistical association with a trait of interest, and testing if the set of SNPs from a particular pathway is significantly enriched with high ranks using a weighted Kolmogorov-Smirnov test. We applied our method to two large genetically distinct GWAS data sets of schizophrenia, one
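
    Step (2) of the method can be illustrated with a GSEA-style weighted Kolmogorov-Smirnov running sum. This is a sketch under the assumption that the statistic follows that familiar form; the abstract does not give the exact weighting, and the function name, the exponent p and the input layout are all hypothetical.

```python
def weighted_ks_score(ranked_ids, scores, pathway_ids, p=1.0):
    """Signed maximum deviation of the weighted running sum.

    ranked_ids  -- SNP identifiers sorted from strongest to weakest association
    scores      -- dict mapping each identifier to its association statistic
    pathway_ids -- set of identifiers that belong to the pathway under test
    p           -- weight exponent; p = 0 gives the classical unweighted KS form
    """
    hits = [snp in pathway_ids for snp in ranked_ids]
    n_miss = len(ranked_ids) - sum(hits)
    hit_weight = sum(
        abs(scores[snp]) ** p for snp, hit in zip(ranked_ids, hits) if hit
    )
    if hit_weight == 0 or n_miss == 0:
        raise ValueError("need at least one weighted hit and one miss")
    running = 0.0
    best = 0.0
    for snp, hit in zip(ranked_ids, hits):
        if hit:
            running += abs(scores[snp]) ** p / hit_weight  # step up on a pathway SNP
        else:
            running -= 1.0 / n_miss  # step down on any other SNP
        if abs(running) > abs(best):
            best = running
    return best
```

    A score near +1 means the pathway SNPs sit at the top of the ranking. Significance would normally be assessed by recomputing the score on permuted data, which is omitted here.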

  13. Depth data research of GIS based on clustering analysis algorithm (United States)

    Xiong, Yan; Xu, Wenli


    The data of GIS have spatial distribution. Geographic data has both spatial characteristics and attribute characteristics, and also changes with time. Therefore, the amount of data is very large. Nowadays, many industries and departments in the society are using GIS. However, without proper data analysis and mining scheme, GIS will not exert its maximum effectiveness and will waste a lot of data. In this paper, we use the geographic information demand of a national security department as the experimental object, combining the characteristics of GIS data, taking into account the characteristics of time, space, attributes and so on, and using cluster analysis algorithm. We further study the mining scheme for depth data, and get the algorithm model. This algorithm can automatically classify sample data, and then carry out exploratory analysis. The research shows that the algorithm model and the information mining scheme can quickly find hidden depth information from the surface data of GIS, thus improving the efficiency of the security department. This algorithm can also be extended to other fields.

  14. Regularized rare variant enrichment analysis for case-control exome sequencing data. (United States)

    Larson, Nicholas B; Schaid, Daniel J


    Rare variants have recently garnered an immense amount of attention in genetic association analysis. However, unlike methods traditionally used for single marker analysis in GWAS, rare variant analysis often requires some method of aggregation, since single marker approaches are poorly powered for typical sequencing study sample sizes. Advancements in sequencing technologies have rendered next-generation sequencing platforms a realistic alternative to traditional genotyping arrays. Exome sequencing in particular not only provides base-level resolution of genetic coding regions, but also a natural paradigm for aggregation via genes and exons. Here, we propose the use of penalized regression in combination with variant aggregation measures to identify rare variant enrichment in exome sequencing data. In contrast to marginal gene-level testing, we simultaneously evaluate the effects of rare variants in multiple genes, focusing on gene-based least absolute shrinkage and selection operator (LASSO) and exon-based sparse group LASSO models. By using gene membership as a grouping variable, the sparse group LASSO can be used as a gene-centric analysis of rare variants while also providing a penalized approach toward identifying specific regions of interest. We apply extensive simulations to evaluate the performance of these approaches with respect to specificity and sensitivity, comparing these results to multiple competing marginal testing methods. Finally, we discuss our findings and outline future research. © 2013 WILEY PERIODICALS, INC.

  15. Capillary electrophoresis - Mass spectrometry metabolomics analysis revealed enrichment of hypotaurine in rat glioma tissues. (United States)

    Gao, Peng; Ji, Min; Fang, Xueyan; Liu, Yingyang; Yu, Zhigang; Cao, Yunfeng; Sun, Aijun; Zhao, Liang; Zhang, Yong


    Glioma is one of the most lethal brain malignancies, and its etiology is unknown. Many metabolomics analyses of diverse sample types have been performed, but because the analytical platforms varied, the reported disease-related metabolites were not consistent across studies. Comparable metabolomics results are more likely to be obtained by analyzing the same sample types on an identical analytical platform. For tumor research, metabolomics analysis of tissue samples has the unique advantage of giving more direct insight into disease-specific pathological molecules. In this light, a previously reported capillary electrophoresis - mass spectrometry method for human tissue metabolomics was employed to profile the metabolome of rat C6 cell implantation gliomas and the corresponding precancerous tissues. Nine metabolites were found to be increased in the glioma tissues. Of these, hypotaurine was the only metabolite whose enrichment in the malignant tissues matched what had been reported in the relevant human tissue metabolomics analysis. Furthermore, hypotaurine was shown to inhibit α-ketoglutarate-dependent dioxygenases (2-KDDs) through immunocytochemistry staining and in vitro enzymatic activity assays using C6 cell cultures. This study reinforced the previous conclusion that hypotaurine acts as a competitive inhibitor of 2-KDDs and demonstrated the value of metabolomics in oncology studies. Copyright © 2017. Published by Elsevier Inc.

  16. Characterizing Heterogeneity within Head and Neck Lesions Using Cluster Analysis of Multi-Parametric MRI Data.

    Directory of Open Access Journals (Sweden)

    Marco Borri

    To describe a methodology, based on cluster analysis, to partition multi-parametric functional imaging data into groups (or clusters) of similar functional characteristics, with the aim of characterizing functional heterogeneity within head and neck tumour volumes. To evaluate the performance of the proposed approach on a set of longitudinal MRI data, analysing the evolution of the obtained sub-sets with treatment. The cluster analysis workflow was applied to a combination of dynamic contrast-enhanced and diffusion-weighted imaging MRI data from a cohort of squamous cell carcinoma of the head and neck patients. Cumulative distributions of voxels, containing pre and post-treatment data and including both primary tumours and lymph nodes, were partitioned into k clusters (k = 2, 3 or 4). Principal component analysis and cluster validation were employed to investigate data composition and to independently determine the optimal number of clusters. The evolution of the resulting sub-regions with induction chemotherapy treatment was assessed relative to the number of clusters. The clustering algorithm was able to separate clusters which significantly reduced in voxel number following induction chemotherapy from clusters with a non-significant reduction. Partitioning with the optimal number of clusters (k = 4), determined with cluster validation, produced the best separation between reducing and non-reducing clusters. The proposed methodology was able to identify tumour sub-regions with distinct functional properties, independently separating clusters which were affected differently by treatment. This work demonstrates that unsupervised cluster analysis, with no prior knowledge of the data, can be employed to provide a multi-parametric characterization of functional heterogeneity within tumour volumes.

  17. Analysis of the dynamical cluster approximation for the Hubbard model


    Aryanpour, K.; Hettler, M. H.; Jarrell, M.


    We examine a central approximation of the recently introduced Dynamical Cluster Approximation (DCA) by example of the Hubbard model. By both analytical and numerical means we study non-compact and compact contributions to the thermodynamic potential. We show that approximating non-compact diagrams by their cluster analogs results in a larger systematic error as compared to the compact diagrams. Consequently, only the compact contributions should be taken from the cluster, whereas non-compact ...

  18. X-Ray Morphological Analysis of the Planck ESZ Clusters

    Energy Technology Data Exchange (ETDEWEB)

    Lovisari, Lorenzo; Forman, William R.; Jones, Christine; Andrade-Santos, Felipe; Randall, Scott; Kraft, Ralph [Harvard-Smithsonian Center for Astrophysics, 60 Garden Street, Cambridge, MA 02138 (United States); Ettori, Stefano [INAF, Osservatorio Astronomico di Bologna, via Ranzani 1, I-40127 Bologna (Italy); Arnaud, Monique; Démoclès, Jessica; Pratt, Gabriel W. [Laboratoire AIM, IRFU/Service d’Astrophysique—CEA/DRF—CNRS—Université Paris Diderot, Bât. 709, CEA-Saclay, F-91191 Gif-sur-Yvette Cedex (France)


    X-ray observations show that galaxy clusters have a very large range of morphologies. The most disturbed systems, which are good to study how clusters form and grow and to test physical models, may potentially complicate cosmological studies because the cluster mass determination becomes more challenging. Thus, we need to understand the cluster properties of our samples to reduce possible biases. This is complicated by the fact that different experiments may detect different cluster populations. For example, Sunyaev–Zeldovich (SZ) selected cluster samples have been found to include a greater fraction of disturbed systems than X-ray selected samples. In this paper we determine eight morphological parameters for the Planck Early Sunyaev–Zeldovich (ESZ) objects observed with XMM-Newton. We found that two parameters, concentration and centroid shift, are the best to distinguish between relaxed and disturbed systems. For each parameter we provide the values that allow selecting the most relaxed or most disturbed objects from a sample. We found that there is no mass dependence on the cluster dynamical state. By comparing our results with what was obtained with REXCESS clusters, we also confirm that the ESZ clusters indeed tend to be more disturbed, as found by previous studies.

  19. CAGEd-oPOSSUM: motif enrichment analysis from CAGE-derived TSSs. (United States)

    Arenillas, David J; Forrest, Alistair R R; Kawaji, Hideya; Lassmann, Timo; Wasserman, Wyeth W; Mathelier, Anthony


    With the emergence of large-scale Cap Analysis of Gene Expression (CAGE) datasets from individual labs and the FANTOM consortium, one can now analyze the cis-regulatory regions associated with gene transcription at an unprecedented level of refinement. By coupling transcription factor binding site (TFBS) enrichment analysis with CAGE-derived genomic regions, CAGEd-oPOSSUM can identify TFs that act as key regulators of genes involved in specific mammalian cell and tissue types. The webtool allows for the analysis of CAGE-derived transcription start sites (TSSs) either provided by the user or selected from ∼1300 mammalian samples from the FANTOM5 project with pre-computed TFBS predicted with JASPAR TF binding profiles. The tool helps power insights into the regulation of genes through the study of the specific usage of TSSs within specific cell types and/or under specific conditions. The CAGEd-oPOSSUM web tool is implemented in Perl, MySQL and Apache and is available at CONTACTS: or Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.

  20. Integrative Analysis of Gene Expression Data Including an Assessment of Pathway Enrichment for Predicting Prostate Cancer

    Directory of Open Access Journals (Sweden)

    Pingzhao Hu


    Background: Microarray technology has been previously used to identify genes that are differentially expressed between tumour and normal samples in a single study, as well as in syntheses involving multiple studies. When integrating results from several Affymetrix microarray datasets, previous studies summarized probeset-level data, which may potentially lead to a loss of information available at the probe-level. In this paper, we present an approach for integrating results across studies while taking probe-level data into account. Additionally, we follow a new direction in the analysis of microarray expression data, namely to focus on the variation of expression phenotypes in predefined gene sets, such as pathways. This targeted approach can be helpful for revealing information that is not easily visible from the changes in the individual genes. Results: We used a recently developed method to integrate Affymetrix expression data across studies. The idea is based on a probe-level based test statistic developed for testing for differentially expressed genes in individual studies. We incorporated this test statistic into a classic random-effects model for integrating data across studies. Subsequently, we used a gene set enrichment test to evaluate the significance of enriched biological pathways in the differentially expressed genes identified from the integrative analysis. We compared statistical and biological significance of the prognostic gene expression signatures and pathways identified in the probe-level model (PLM) with those in the probeset-level model (PSLM). Our integrative analysis of Affymetrix microarray data from 110 prostate cancer samples obtained from three studies reveals thousands of genes significantly correlated with tumour cell differentiation. The bioinformatics analysis, mapping these genes to the publicly available KEGG database, reveals evidence that tumour cell differentiation is significantly associated with many

  1. Identification and validation of asthma phenotypes in Chinese population using cluster analysis. (United States)

    Wang, Lei; Liang, Rui; Zhou, Ting; Zheng, Jing; Liang, Bing Miao; Zhang, Hong Ping; Luo, Feng Ming; Gibson, Peter G; Wang, Gang


    Asthma is a heterogeneous airway disease, so it is crucial to clearly identify clinical phenotypes to achieve better asthma management. To identify and prospectively validate asthma clusters in a Chinese population. Two hundred eighty-four patients were consecutively recruited and 18 sociodemographic and clinical variables were collected. Hierarchical cluster analysis was performed by the Ward method followed by k-means cluster analysis. Then, a prospective 12-month cohort study was used to validate the identified clusters. Five clusters were successfully identified. Clusters 1 (n = 71) and 3 (n = 81) were mild asthma phenotypes with slight airway obstruction and low exacerbation risk, but with a sex differential. Cluster 2 (n = 65) described an "allergic" phenotype, cluster 4 (n = 33) featured a "fixed airflow limitation" phenotype with smoking, and cluster 5 (n = 34) was a "low socioeconomic status" phenotype. Patients in clusters 2, 4, and 5 had distinctly lower socioeconomic status and more psychological symptoms. Cluster 2 had a significantly increased risk of exacerbations (risk ratio [RR] 1.13, 95% confidence interval [CI] 1.03-1.25), unplanned visits for asthma (RR 1.98, 95% CI 1.07-3.66), and emergency visits for asthma (RR 7.17, 95% CI 1.26-40.80). Cluster 4 had an increased risk of unplanned visits (RR 2.22, 95% CI 1.02-4.81), and cluster 5 had increased emergency visits (RR 12.72, 95% CI 1.95-69.78). Kaplan-Meier analysis confirmed that cluster grouping was predictive of time to the first asthma exacerbation, unplanned visit, emergency visit, and hospital admission (P clusters as "allergic asthma," "fixed airflow limitation," and "low socioeconomic status" phenotypes that are at high risk of severe asthma exacerbations and that have management implications for clinical practice in developing countries. Copyright © 2017 American College of Allergy, Asthma & Immunology. Published by Elsevier Inc. All rights reserved.

  2. Genome-scale analysis of positional clustering of mouse testis-specific genes

    Directory of Open Access Journals (Sweden)

    Lee Bernett TK


    Background: Contrary to earlier assumptions, genes are not randomly distributed on a chromosome, even after removal of tandem repeats. The positional clustering of co-expressed genes is known in prokaryotes and was recently reported in several eukaryotic organisms such as Caenorhabditis elegans, Drosophila melanogaster, and Homo sapiens. In order to further investigate the mode of tissue-specific gene clustering in higher eukaryotes, we have performed a genome-scale analysis of positional clustering of the mouse testis-specific genes. Results: Our computational analysis shows that a large proportion of testis-specific genes are clustered in groups of 2 to 5 genes in the mouse genome. The number of clusters is much higher than expected by chance even after removal of tandem repeats. Conclusion: Our result suggests that testis-specific genes tend to cluster on the mouse chromosomes. This provides another piece of evidence for the hypothesis that clusters of tissue-specific genes do exist.
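The comparison "more clusters than expected by chance" in the record above can be illustrated with a permutation test. The following is a minimal sketch, not the authors' procedure: it assumes genes on one chromosome are listed in positional order, that a "cluster" is a run of at least two consecutive tissue-specific genes, and that the null model is a random shuffle of the labels. The names `count_clusters`, `permutation_p_value`, and the toy data are hypothetical.

```python
import random

def count_clusters(flags, min_size=2):
    """Count runs of at least `min_size` consecutive True flags."""
    clusters = 0
    run = 0
    for flag in flags:
        if flag:
            run += 1
        else:
            if run >= min_size:
                clusters += 1
            run = 0
    if run >= min_size:  # close a run that reaches the end of the list
        clusters += 1
    return clusters

def permutation_p_value(flags, n_perm=1000, seed=0):
    """Share of label shuffles with at least as many clusters as observed."""
    rng = random.Random(seed)
    observed = count_clusters(flags)
    shuffled = list(flags)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if count_clusters(shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return observed, (hits + 1) / (n_perm + 1)

# Toy chromosome (assumed data): True marks a tissue-specific gene.
toy = [True, True, False, False, True, True, True, False, False, False,
       True, True, False, False, False, False, True, True, False, False]
observed, p_value = permutation_p_value(toy)
```

A small `p_value` would indicate that the observed number of runs is rarely matched when the same number of tissue-specific labels is scattered at random along the chromosome.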


    Directory of Open Access Journals (Sweden)

    Hermanto Hermanto


    SMEs grow in clusters within particular geographical areas, and entrepreneurs grow and thrive through these business clusters. Central Java Province has many business clusters that support the regional economy, one of which is the batik industry cluster. Pati Regency is one of the regencies/cities in Central Java with the lowest turnover. The batik industry cluster in Pati is developing quite well, as can be seen from the increasing number of batik businesses incorporated in the cluster. This research examines the strategy for developing the batik industry cluster in Pati Regency, with the aim of determining the proper strategy. The research method is quantitative, and the analysis tool is Strengths, Weaknesses, Opportunities, Threats (SWOT) analysis. The SWOT analysis shows that the proper strategy for developing the batik industry cluster in Pati is: optimizing the management of the batik business cluster in Bakaran Village; having the local government provide information on business capital loan facilities; employing labor from Bakaran Village while improving its quality through training; and marketing Bakaran batik to broader markets while maintaining its quality. The advice arising from this research is directed at the parties who have a role in batik industry cluster development in Bakaran Village, Pati Regency, such as the Local Government.

  4. Analysis of genetic association using hierarchical clustering and cluster validation indices. (United States)

    Pagnuco, Inti A; Pastore, Juan I; Abras, Guillermo; Brun, Marcel; Ballarin, Virginia L


    It is usually assumed that co-expressed genes suggest co-regulation in the underlying regulatory network. Determining sets of co-expressed genes is an important task, based on some criteria of similarity. This task is usually performed by clustering algorithms, where the genes are clustered into meaningful groups based on their expression values in a set of experiments. In this work, we propose a method to find sets of co-expressed genes, based on cluster validation indices as a measure of similarity for individual gene groups, and a combination of variants of hierarchical clustering to generate the candidate groups. We evaluated its ability to retrieve significant sets on simulated correlated and real genomics data, where the performance is measured based on its detection ability of co-regulated sets against a full search. Additionally, we analyzed the quality of the best ranked groups using an online bioinformatics tool that provides network information for the selected genes. Copyright © 2017 Elsevier Inc. All rights reserved.

  5. WebGimm: An integrated web-based platform for cluster analysis, functional analysis, and interactive visualization of results. (United States)

    Joshi, Vineet K; Freudenberg, Johannes M; Hu, Zhen; Medvedovic, Mario


    Cluster analysis methods have been extensively researched, but the adoption of new methods is often hindered by technical barriers in their implementation and use. WebGimm is a free cluster analysis web-service, and an open source general purpose clustering web-server infrastructure designed to facilitate easy deployment of integrated cluster analysis servers based on clustering and functional annotation algorithms implemented in R. Integrated functional analyses and interactive browsing of both clustering structure and functional annotations provide a complete analytical environment for cluster analysis and interpretation of results. The Java Web Start client-based interface is modeled after the familiar cluster/treeview packages, making its use intuitive to a wide array of biomedical researchers. For biomedical researchers, WebGimm provides an avenue to access state-of-the-art clustering procedures. For bioinformatics methods developers, WebGimm offers a convenient avenue to deploy their newly developed clustering methods. WebGimm server, software and manuals can be freely accessed at

  6. Cluster analysis of HZE particle tracks as applied to space radiobiology problems

    International Nuclear Information System (INIS)

    Batmunkh, M.; Bayarchimeg, L.; Lkhagva, O.; Belov, O.


    A cluster analysis is performed of ionizations in tracks produced by the most abundant nuclei in the charge and energy spectra of the galactic cosmic rays. The frequency distribution of clusters is estimated for cluster sizes comparable to the DNA molecule at different packaging levels. For this purpose, an improved K-mean-based algorithm is suggested. This technique allows processing particle tracks containing a large number of ionization events without setting the number of clusters as an input parameter. Using this method, the ionization distribution pattern is analyzed depending on the cluster size and particle's linear energy transfer

  7. Application of cluster analysis and unsupervised learning to multivariate tissue characterization

    International Nuclear Information System (INIS)

    Momenan, R.; Insana, M.F.; Wagner, R.F.; Garra, B.S.; Loew, M.H.


    This paper describes a procedure for classifying tissue types from unlabeled acoustic measurements (data type unknown) using unsupervised cluster analysis. These techniques are being applied to unsupervised ultrasonic image segmentation and tissue characterization. The performance of a new clustering technique is measured and compared with supervised methods, such as a linear Bayes classifier. In these comparisons two objectives are sought: a) How well does the clustering method group the data?; b) Do the clusters correspond to known tissue classes? The first question is investigated by a measure of cluster similarity and dispersion. The second question involves a comparison with a supervised technique using labeled data

  8. Participant intimacy: A cluster analysis of the intranuclear cascade

    International Nuclear Information System (INIS)

    Cugnon, J.; Knoll, J.; Randrup, J.


    The intranuclear cascade for relativistic nuclear collisions is analyzed in terms of clusters consisting of groups of nucleons which are dynamically linked to each other by violent interactions. The formation cross sections for the different cluster types as well as their intrinsic dynamics are studied and compared with the predictions of the linear cascade model (rows-on-rows). (orig.)

  9. An evaluation of centrality measures used in cluster analysis (United States)

    Engström, Christopher; Silvestrov, Sergei


    Clustering of data into groups of similar objects plays an important part when analysing many types of data, especially when the datasets are large, as they often are in, for example, bioinformatics, social networks and computational linguistics. Many clustering algorithms such as K-means and some types of hierarchical clustering need a number of centroids representing the 'center' of the clusters. The choice of centroids for the initial clusters often plays an important role in the quality of the clusters. Since a data point with a high centrality supposedly lies close to the 'center' of some cluster, this can be used to assign centroids rather than through some other method such as picking them at random. Some work has been done to evaluate the use of centrality measures such as degree, betweenness and eigenvector centrality in clustering algorithms. The aim of this article is to compare and evaluate the usefulness of a number of common centrality measures, such as those mentioned above and others such as PageRank and related measures.
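The idea of using centrality to choose initial centroids can be sketched as follows. This is an illustrative assumption, not the article's method: points are linked when they lie within a chosen `radius`, degree centrality is the number of such neighbours, and seeds are taken greedily from the highest-degree points while skipping points too close to an already chosen seed. The function names, the `radius` parameter, and the toy data are hypothetical.

```python
import math

def degree_centrality(points, radius):
    """Number of other points within `radius` of each point."""
    n = len(points)
    degree = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(points[i], points[j]) <= radius:
                degree[i] += 1
                degree[j] += 1
    return degree

def centrality_seeds(points, k, radius):
    """Pick up to k high-degree points that are farther than `radius` apart."""
    degree = degree_centrality(points, radius)
    # Highest degree first; ties are broken by the original index.
    order = sorted(range(len(points)), key=lambda i: (-degree[i], i))
    seeds = []
    for i in order:
        if all(math.dist(points[i], points[s]) > radius for s in seeds):
            seeds.append(i)
        if len(seeds) == k:
            break
    return [points[i] for i in seeds]

# Two small star-shaped groups (assumed data); each centre has three neighbours.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (-0.1, 0.0),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (4.9, 5.0)]
seeds = centrality_seeds(points, k=2, radius=0.12)
```

The returned seeds could then be passed as starting centroids to any K-means implementation in place of randomly drawn points.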

  10. NetGen: a novel network-based probabilistic generative model for gene set functional enrichment analysis. (United States)

    Sun, Duanchen; Liu, Yinliang; Zhang, Xiang-Sun; Wu, Ling-Yun


    High-throughput experimental techniques have been dramatically improved and widely applied in the past decades. However, biological interpretation of the high-throughput experimental results, such as differential expression gene sets derived from microarray or RNA-seq experiments, is still a challenging task. Gene Ontology (GO) is commonly used in functional enrichment studies. The GO terms identified via current functional enrichment analysis tools often contain direct parent or descendant terms in the GO hierarchical structure. Highly redundant terms make it difficult for users to analyze the underlying biological processes. In this paper, a novel network-based probabilistic generative model, NetGen, was proposed to perform the functional enrichment analysis. An additional protein-protein interaction (PPI) network was explicitly used to assist the identification of significantly enriched GO terms. NetGen outperformed the existing methods in the simulation studies. The effectiveness of NetGen was explored further on four real datasets. Notably, several GO terms which were not directly linked with the active gene list for each disease were identified. These terms were closely related to the corresponding diseases when checked against the curated literature. NetGen has been implemented in the R package CopTea publicly available at GitHub ( ). Our procedure leads to a more reasonable and interpretable result of the functional enrichment analysis. As a novel term combination-based functional enrichment analysis method, NetGen is complementary to current individual term-based methods, and can help to explore the underlying pathogenesis of complex diseases.

  11. A comparison of heuristic and model-based clustering methods for dietary pattern analysis. (United States)

    Greve, Benjamin; Pigeot, Iris; Huybrechts, Inge; Pala, Valeria; Börnhorst, Claudia


    Cluster analysis is widely applied to identify dietary patterns. A new method based on Gaussian mixture models (GMM) seems to be more flexible compared with the commonly applied k-means and Ward's method. In the present paper, these clustering approaches are compared to find the most appropriate one for clustering dietary data. The clustering methods were applied to simulated data sets with different cluster structures to compare their performance knowing the true cluster membership of observations. Furthermore, the three methods were applied to FFQ data assessed in 1791 children participating in the IDEFICS (Identification and Prevention of Dietary- and Lifestyle-Induced Health Effects in Children and Infants) Study to explore their performance in practice. The GMM outperformed the other methods in the simulation study in 72 % up to 100 % of cases, depending on the simulated cluster structure. Comparing the computationally less complex k-means and Ward's methods, the performance of k-means was better in 64-100 % of cases. Applied to real data, all methods identified three similar dietary patterns which may be roughly characterized as a 'non-processed' cluster with a high consumption of fruits, vegetables and wholemeal bread, a 'balanced' cluster with only slight preferences of single foods and a 'junk food' cluster. The simulation study suggests that clustering via GMM should be preferred due to its higher flexibility regarding cluster volume, shape and orientation. The k-means seems to be a good alternative, being easier to use while giving similar results when applied to real data.

  12. Thermodynamic Analysis of Oxygen-Enriched Direct Smelting of Jamesonite Concentrate (United States)

    Zhang, Zhong-Tang; Dai, Xi; Zhang, Wen-Hai


    Thermodynamic analysis of oxygen-enriched direct smelting of jamesonite concentrate is reported in this article. First, the occurrence state of lead, antimony and other metallic elements in the smelting process was investigated theoretically. Then, the verification test was carried out. The results indicate that lead and antimony mainly exist in the alloy in the form of metallic lead and metallic antimony. Simultaneously, lead and antimony were also oxidized into the slag in the form of lead-antimony oxide. Iron and copper could be oxidized into the slag in the form of oxides in addition to combining with antimony in the alloy, while zinc was mainly oxidized into the slag in the form of zinc oxide. The verification test indicates that the main phases in the alloy contain metallic lead, metallic antimony and a small amount of Cu2Sb, FeSb2 intermetallic compounds, and the slag is mainly composed of kirschsteinite, fayalite and zinc oxide, in agreement with the thermodynamic analysis.

  13. Common Factor Analysis Versus Principal Component Analysis: Choice for Symptom Cluster Research

    Directory of Open Access Journals (Sweden)

    Hee-Ju Kim, PhD, RN


    Conclusion: If the study purpose is to explain correlations among variables and to examine the structure of the data (this is usual for most cases in symptom cluster research), CFA provides a more accurate result. If the purpose of a study is to summarize data with a smaller number of variables, PCA is the choice. PCA can also be used as an initial step in CFA because it provides information regarding the maximum number and nature of factors. In using factor analysis for symptom cluster research, several issues need to be considered, including subjectivity of solution, sample size, symptom selection, and level of measure.

  14. Identifying novel phenotypes of acute heart failure using cluster analysis of clinical variables. (United States)

    Horiuchi, Yu; Tanimoto, Shuzou; Latif, A H M Mahbub; Urayama, Kevin Y; Aoki, Jiro; Yahagi, Kazuyuki; Okuno, Taishi; Sato, Yu; Tanaka, Tetsu; Koseki, Keita; Komiyama, Kota; Nakajima, Hiroyoshi; Hara, Kazuhiro; Tanabe, Kengo


    Acute heart failure (AHF) is a heterogeneous disease caused by various cardiovascular (CV) pathophysiology and multiple non-CV comorbidities. We aimed to identify clinically important subgroups to improve our understanding of the pathophysiology of AHF and inform clinical decision-making. We evaluated detailed clinical data of 345 consecutive AHF patients using non-hierarchical cluster analysis of 77 variables, including age, sex, HF etiology, comorbidities, physical findings, laboratory data, electrocardiogram, echocardiogram and treatment during hospitalization. Cox proportional hazards regression analysis was performed to estimate the association between the clusters and clinical outcomes. Three clusters were identified. Cluster 1 (n=108) represented "vascular failure". This cluster had the highest average systolic blood pressure at admission and lung congestion with type 2 respiratory failure. Cluster 2 (n=89) represented "cardiac and renal failure". They had the lowest ejection fraction (EF) and worst renal function. Cluster 3 (n=148) comprised mostly older patients and had the highest prevalence of atrial fibrillation and preserved EF. Death or HF hospitalization within 12 months occurred in 23% of Cluster 1, 36% of Cluster 2 and 36% of Cluster 3 (p=0.034). Compared with Cluster 1, risk of death or HF hospitalization was 1.74 (95% CI, 1.03-2.95, p=0.037) for Cluster 2 and 1.82 (95% CI, 1.13-2.93, p=0.014) for Cluster 3. Cluster analysis may be effective in producing clinically relevant categories of AHF, and may suggest underlying pathophysiology and potential utility in predicting clinical outcomes. Copyright © 2018 Elsevier B.V. All rights reserved.

  15. The Flemish frozen-vegetable industry as an example of cluster analysis : Flanders Vegetable Valley

    NARCIS (Netherlands)

    Vanhaverbeke, W.P.M.; Larosse, J.; Winnen, W.; Hulsink, W.; Dons, J.J.M.


    In this contribution we present a strategic analysis of the cluster dynamics in the frozen-vegetable industry in Flanders (Belgium). The main purpose of this case is twofold. First, we determine the added value of using data about customer and supplier relationships in cluster analysis. Second, we

  16. Tracking Undergraduate Student Achievement in a First-Year Physiology Course Using a Cluster Analysis Approach (United States)

    Brown, S. J.; White, S.; Power, N.


    A cluster analysis data classification technique was used on assessment scores from 157 undergraduate nursing students who passed 2 successive compulsory courses in human anatomy and physiology. Student scores in five summative assessment tasks, taken in each of the courses, were used as inputs for a cluster analysis procedure. We aimed to group…

  17. Performance Analysis of Cluster Formation in Wireless Sensor Networks

    Directory of Open Access Journals (Sweden)

    Edgar Romo Montiel


    Clustered-based wireless sensor networks have been extensively used in the literature in order to achieve considerable energy consumption reductions. However, two aspects of such systems have been largely overlooked. Namely, the transmission probability used during the cluster formation phase and the way in which cluster heads are selected. Both of these issues have an important impact on the performance of the system. For the former, it is common to consider that sensor nodes in a clustered-based Wireless Sensor Network (WSN) use a fixed transmission probability to send control data in order to build the clusters. However, due to the highly variable conditions experienced by these networks, a fixed transmission probability may lead to extra energy consumption. In view of this, three different transmission probability strategies are studied: optimal, fixed and adaptive. In this context, we also investigate cluster head selection schemes, specifically, we consider two intelligent schemes based on the fuzzy C-means and k-medoids algorithms and a random selection with no intelligence. We show that the use of intelligent schemes greatly improves the performance of the system, but their use entails higher complexity and selection delay. The main performance metrics considered in this work are energy consumption, successful transmission probability and cluster formation latency. As an additional feature of this work, we study the effect of errors in the wireless channel and the impact on the performance of the system under the different transmission probability schemes.

  18. Performance Analysis of Cluster Formation in Wireless Sensor Networks. (United States)

    Montiel, Edgar Romo; Rivero-Angeles, Mario E; Rubino, Gerardo; Molina-Lozano, Heron; Menchaca-Mendez, Rolando; Menchaca-Mendez, Ricardo


    Clustered-based wireless sensor networks have been extensively used in the literature in order to achieve considerable energy consumption reductions. However, two aspects of such systems have been largely overlooked. Namely, the transmission probability used during the cluster formation phase and the way in which cluster heads are selected. Both of these issues have an important impact on the performance of the system. For the former, it is common to consider that sensor nodes in a clustered-based Wireless Sensor Network (WSN) use a fixed transmission probability to send control data in order to build the clusters. However, due to the highly variable conditions experienced by these networks, a fixed transmission probability may lead to extra energy consumption. In view of this, three different transmission probability strategies are studied: optimal, fixed and adaptive. In this context, we also investigate cluster head selection schemes, specifically, we consider two intelligent schemes based on the fuzzy C-means and k-medoids algorithms and a random selection with no intelligence. We show that the use of intelligent schemes greatly improves the performance of the system, but their use entails higher complexity and selection delay. The main performance metrics considered in this work are energy consumption, successful transmission probability and cluster formation latency. As an additional feature of this work, we study the effect of errors in the wireless channel and the impact on the performance of the system under the different transmission probability schemes.
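One of the cluster head selection schemes named above relies on k-medoids, which suits the task because every medoid is an actual node and can therefore act as a head. The following is a minimal sketch under stated assumptions, not the paper's scheme: nodes are described only by planar coordinates, distance is Euclidean, and a simple alternating (assign, then re-pick medoid) iteration is used. The function names and the toy node layout are hypothetical.

```python
import math

def assign(nodes, medoids):
    """Index (into `medoids`) of the nearest medoid for every node."""
    return [min(range(len(medoids)),
                key=lambda m: math.dist(p, nodes[medoids[m]]))
            for p in nodes]

def k_medoids(nodes, initial_medoids, max_iter=50):
    """Alternate between assigning nodes and re-picking each cluster's medoid."""
    medoids = list(initial_medoids)
    for _ in range(max_iter):
        labels = assign(nodes, medoids)
        new_medoids = []
        for m in range(len(medoids)):
            members = [i for i, lab in enumerate(labels) if lab == m]
            if not members:  # keep the old medoid if a cluster emptied out
                new_medoids.append(medoids[m])
                continue
            # The medoid is the member with the smallest total distance
            # to the other members of its cluster.
            best = min(members,
                       key=lambda i: sum(math.dist(nodes[i], nodes[j])
                                         for j in members))
            new_medoids.append(best)
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(nodes, medoids)

# Six sensor nodes on a line (assumed layout), starting from a poor guess.
nodes = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0),
         (10.0, 0.0), (11.0, 0.0), (12.0, 0.0)]
heads, labels = k_medoids(nodes, initial_medoids=[0, 1])
```

In this toy layout the iteration settles on the middle node of each group as cluster head. A fuller model would presumably also weigh residual energy and link quality, which this sketch leaves out.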

  19. Higgs pair production: choosing benchmarks with cluster analysis

    Energy Technology Data Exchange (ETDEWEB)

    Carvalho, Alexandra; Dall’Osso, Martino; Dorigo, Tommaso [Dipartimento di Fisica e Astronomia and INFN, Sezione di Padova,Via Marzolo 8, I-35131 Padova (Italy); Goertz, Florian [CERN,1211 Geneva 23 (Switzerland); Gottardo, Carlo A. [Physikalisches Institut, Universität Bonn,Nussallee 12, 53115 Bonn (Germany); Tosi, Mia [CERN,1211 Geneva 23 (Switzerland)


    New physics theories often depend on a large number of free parameters. The phenomenology they predict for fundamental physics processes is in some cases drastically affected by the precise value of those free parameters, while in other cases is left basically invariant at the level of detail experimentally accessible. When designing a strategy for the analysis of experimental data in the search for a signal predicted by a new physics model, it appears advantageous to categorize the parameter space describing the model according to the corresponding kinematical features of the final state. A multi-dimensional test statistic can be used to gauge the degree of similarity in the kinematics predicted by different models; a clustering algorithm using that metric may allow the division of the space into homogeneous regions, each of which can be successfully represented by a benchmark point. Searches targeting those benchmarks are then guaranteed to be sensitive to a large area of the parameter space. In this document we show a practical implementation of the above strategy for the study of non-resonant production of Higgs boson pairs in the context of extensions of the standard model with anomalous couplings of the Higgs bosons. A non-standard value of those couplings may significantly enhance the Higgs boson pair-production cross section, such that the process could be detectable with the data that the LHC will collect in Run 2.

  20. Clusters of galaxies as tools in observational cosmology : results from x-ray analysis

    International Nuclear Information System (INIS)

    Weratschnig, J.M.


    Clusters of galaxies are the largest gravitationally bound structures in the universe. They can be used as ideal tools to study large scale structure formation (e.g. when studying merger clusters) and provide highly interesting environments to analyse several characteristic interaction processes (like ram pressure stripping of galaxies, magnetic fields). In this dissertation, we have studied several clusters of galaxies using X-ray observations. To obtain scientific results, we have applied different data reduction and analysis methods. With a combination of morphological and spectral analysis, the merger cluster Abell 514 was studied in much detail. It has a highly interesting morphology and shows signs of an ongoing merger as well as a shock. Using a new method to detect substructure, we have analysed several clusters to determine whether any substructure is present in the X-ray image. This hints towards a real structure in the distribution of the intra-cluster medium (ICM) and is evidence for ongoing mergers. The results from this analysis are extensively used with the cluster of galaxies Abell S1136. Here, we study the ICM distribution and compare its structure with the spatial distribution of star forming galaxies. Cluster magnetic fields are another important topic of my thesis. They can be studied in radio observations, which can be put into relation with results from X-ray observations. Using observational data from several clusters, we could support the theory that cluster magnetic fields are frozen into the ICM. (author)

  1. Interactive K-Means Clustering Method Based on User Behavior for Different Analysis Target in Medicine. (United States)

    Lei, Yang; Yu, Dai; Bin, Zhang; Yang, Yang


    Clustering algorithms are widely used in analysis systems as a basis of data analysis. However, with high-dimensional data, a clustering algorithm may overlook the business relations between the dimensions, especially in the medical field. As a result, the clustering result often fails to meet the users' business goals. If the clustering process can incorporate the users' knowledge, that is, the doctor's knowledge or the analysis intent, the clustering result can be more satisfactory. In this paper, we propose an interactive K-means clustering method to improve the user's satisfaction with the result. The core of this method is to obtain the user's feedback on the clustering result and use it to optimize that result. A particle swarm optimization algorithm is then used to optimize the parameters, especially the weight settings in the clustering algorithm, so that they reflect the user's business preference as far as possible. After this parameter optimization and adjustment, the clustering result can be closer to the user's requirement. Finally, we use a breast cancer example to test our method. The experiments show the better performance of our algorithm.
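The role of the weight settings can be illustrated with a feature-weighted K-means. This is a minimal sketch under stated assumptions, not the paper's algorithm: the weights enter through a weighted squared Euclidean distance, they are fixed by hand here instead of being tuned by particle swarm optimization from user feedback, and the function names and toy data are hypothetical.

```python
def weighted_sq_dist(a, b, weights):
    """Squared Euclidean distance with one weight per feature."""
    return sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b))

def weighted_kmeans(points, centroids, weights, n_iter=20):
    """Lloyd-style K-means whose assignment step uses the weighted distance."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(n_iter):
        labels = [min(range(len(centroids)),
                      key=lambda c: weighted_sq_dist(p, centroids[c], weights))
                  for p in points]
        updated = []
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append(tuple(sum(col) / len(members)
                                     for col in zip(*members)))
            else:
                updated.append(centroids[c])  # keep an empty cluster's centroid
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids

# Four points on a unit square (assumed data): the weights decide
# whether the square is split along the first or the second feature.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
start = [(0.0, 0.0), (1.0, 1.0)]
by_first, _ = weighted_kmeans(points, start, weights=(1.0, 0.01))
by_second, _ = weighted_kmeans(points, start, weights=(0.01, 1.0))
```

The same data and starting centroids give two different partitions, which is the lever an interactive method could adjust when the user indicates which features matter for the analysis target.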

  2. Techno-Economic Analysis of a 600 MW Oxy-Enrich Pulverized Coal-Fired Boiler

    Directory of Open Access Journals (Sweden)

    Ming Lei


    Oxy-fuel combustion is one of the most promising methods for CO2 capture and storage (CCS), but the operating costs, mainly due to the need for oxygen production, usually lead to an obvious decrease in power generation efficiency. An “oxy-enrich combustion” process was proposed in this study to improve the efficiency of the oxy-fuel combustion process. The oxidizer for oxy-enrich combustion was composed of pure oxygen, air and recycled flue gas. Thus, the CO2 concentration in the flue gas decreased to 30–40%. The PSA (pressure swing adsorption), which has been widely used for CO2 removal from the shifting gases of ammonia synthesis in China, was applied to capture CO2 during oxy-enrich combustion. The technological economics of oxy-enrich combustion with PSA was calculated and compared to that of oxy-fuel combustion. The results indicated that, compared with oxy-fuel combustion: (1) the oxy-enrich combustion has lower capital and operating costs for the ASU (air separation unit) and the recycle fan; (2) there were fewer changes in the components of the flue gas in a furnace for oxy-enrich combustion between dry and wet flue gas circulation; and (3) when the volume ratio of air to oxygen was 2 or 3, the economics of oxy-enrich combustion with PSA were more advantageous.

  3. Phenotypic clustering: a novel method for microglial morphology analysis. (United States)

    Verdonk, Franck; Roux, Pascal; Flamant, Patricia; Fiette, Laurence; Bozza, Fernando A; Simard, Sébastien; Lemaire, Marc; Plaud, Benoit; Shorte, Spencer L; Sharshar, Tarek; Chrétien, Fabrice; Danckaert, Anne


    Microglial cells are tissue-resident macrophages of the central nervous system. They are extremely dynamic, sensitive to their microenvironment and present a characteristic complex and heterogeneous morphology and distribution within the brain tissue. Many experimental clues highlight a strong link between their morphology and their function in response to aggression. However, due to their complex "dendritic-like" aspect that constitutes the major pool of murine microglial cells and their dense network, precise and powerful morphological studies are not easy to realize and complicate correlation with molecular or clinical parameters. Using the knock-in mouse model CX3CR1(GFP/+), we developed a 3D automated confocal tissue imaging system coupled with morphological modelling of many thousands of microglial cells revealing precise and quantitative assessment of major cell features: cell density, cell body area, cytoplasm area and number of primary, secondary and tertiary processes. We determined two morphological criteria that are the complexity index (CI) and the covered environment area (CEA) allowing an innovative approach lying in (i) an accurate and objective study of morphological changes in healthy or pathological condition, (ii) an in situ mapping of the microglial distribution in different neuroanatomical regions and (iii) a study of the clustering of numerous cells, allowing us to discriminate different sub-populations. Our results on more than 20,000 cells by condition confirm at baseline a regional heterogeneity of the microglial distribution and phenotype that persists after induction of neuroinflammation by systemic injection of lipopolysaccharide (LPS). Using clustering analysis, we highlight that, at resting state, microglial cells are distributed in four microglial sub-populations defined by their CI and CEA with a regional pattern and a specific behaviour after challenge. Our results counteract the classical view of a homogenous regional resting

  4. Cluster Computing For Real Time Seismic Array Analysis. (United States)

    Martini, M.; Giudicepietro, F.

    A seismic array is an instrument composed of a dense distribution of seismic sensors that allows measurement of the directional properties of the wavefield (slowness or wavenumber vector) radiated by a seismic source. Over the last years arrays have been widely used in different fields of seismological research. In particular, they are applied in the investigation of seismic sources on volcanoes, where they can be successfully used for studying the volcanic microtremor and long-period events which are critical for getting information on the evolution of volcanic systems. For this reason arrays could be usefully employed for volcano monitoring; however, the huge amount of data produced by this type of instrument and the quite time-consuming processing techniques have limited their potential for this application. In order to favor a direct application of array techniques to continuous volcano monitoring, we designed and built a small PC cluster able to compute in near real time the kinematic properties of the wavefield (slowness or wavenumber vector) produced by local seismic sources. The cluster is composed of 8 dual-processor Intel Pentium III PCs working at 550 MHz, and has 4 Gigabytes of RAM. It runs under the Linux operating system. The developed analysis software package is based on the Multiple SIgnal Classification (MUSIC) algorithm and is written in Fortran. The message-passing part is based upon the LAM programming environment package, an open-source implementation of the Message Passing Interface (MPI). The developed software system includes modules devoted to receiving data via the internet and graphical applications for the continuous display of the processing results. The system has been tested with a data set collected during a seismic experiment conducted on Etna in 1999, when two dense seismic arrays were deployed on the northeast and the southeast flanks of the volcano. A real time continuous acquisition system has been simulated by

  5. Global classification of human facial healthy skin using PLS discriminant analysis and clustering analysis. (United States)

    Guinot, C; Latreille, J; Tenenhaus, M; Malvy, D J


    Today's classifications of healthy skin are predominantly based on a very limited number of skin characteristics, such as skin oiliness or susceptibility to sun exposure. The aim of the present analysis was to set up a global classification of healthy facial skin, using mathematical models. This classification is based on clinical and biophysical skin characteristics and self-reported information related to the skin, as well as the results of a theoretical skin classification assessed separately for the frontal and the malar zones of the face. In order to maximize the predictive power of the models with a minimum of variables, the Partial Least Squares (PLS) discriminant analysis method was used. The resulting PLS components were subjected to clustering analyses to identify the plausible number of clusters and to group the individuals according to their proximities. Using this approach, four PLS components could be constructed and six clusters were found relevant. Thus, from the 36 hypothetical combinations of the theoretical skin type classification, we arrived at a more robust proposal of six classes. Our data suggest that the association of the PLS discriminant analysis and the clustering methods leads to a valid and simple way to classify healthy human skin and represents a potentially useful tool for cosmetic and dermatological research.

  6. Comparative analysis of clustering methods for gene expression time course data

    Directory of Open Access Journals (Sweden)

    Ivan G. Costa


    Full Text Available This work performs a data-driven comparative study of clustering methods used in the analysis of gene expression time courses (or time series). Five clustering methods found in the literature of gene expression analysis are compared: agglomerative hierarchical clustering, CLICK, dynamical clustering, k-means and self-organizing maps. In order to evaluate the methods, a k-fold cross-validation procedure adapted to unsupervised methods is applied. The accuracy of the results is assessed by the comparison of the partitions obtained in these experiments with gene annotation, such as protein function and series classification.
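
    As an illustration of comparing a clustering partition with an annotation, the sketch below computes the plain Rand index, i.e. the fraction of item pairs on which two partitions agree. The abstract does not say which agreement measure the study used, so the choice of the Rand index is an assumption, and the cluster labels and annotation strings are made-up toy data.

```python
from itertools import combinations

def rand_index(labels_a, labels_b):
    """Fraction of item pairs on which two partitions agree.

    A pair counts as an agreement when both partitions put the two items
    together, or both put them apart. Needs at least two items.
    """
    agree = 0
    total = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        if same_a == same_b:
            agree += 1
        total += 1
    return agree / total

# Hypothetical data: cluster labels for six genes and a functional annotation.
clusters = [0, 0, 1, 1, 2, 2]
annotation = ["kinase", "kinase", "kinase",
              "transport", "transport", "transport"]

score = rand_index(clusters, annotation)
print(f"Rand index: {score:.3f}")
```

    A value of 1.0 means the two partitions are identical up to relabelling. Chance-corrected variants exist and may be preferable when cluster sizes are very unequal.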

  7. Hierarchical cluster analysis of progression patterns in open-angle glaucoma patients with medical treatment. (United States)

    Bae, Hyoung Won; Rho, Seungsoo; Lee, Hye Sun; Lee, Naeun; Hong, Samin; Seong, Gong Je; Sung, Kyung Rim; Kim, Chan Yun


    To classify medically treated open-angle glaucoma (OAG) by the pattern of progression using hierarchical cluster analysis, and to determine OAG progression characteristics by comparing clusters. The study included ninety-five eyes of 95 OAG patients who received medical treatment and who had undergone visual field (VF) testing at least once per year for 5 or more years. OAG was classified into subgroups using hierarchical cluster analysis based on the following five variables: baseline mean deviation (MD), baseline visual field index (VFI), MD slope, VFI slope, and Glaucoma Progression Analysis (GPA) printout. After that, other parameters were compared between clusters. Two clusters were made after a hierarchical cluster analysis. Cluster 1 showed -4.06 ± 2.43 dB baseline MD, 92.58% ± 6.27% baseline VFI, -0.28 ± 0.38 dB per year MD slope, -0.52% ± 0.81% per year VFI slope, and all "no progression" cases in GPA printout, whereas cluster 2 showed -8.68 ± 3.81 baseline MD, 77.54 ± 12.98 baseline VFI, -0.72 ± 0.55 MD slope, -2.22 ± 1.89 VFI slope, and seven "possible" and four "likely" progression cases in GPA printout. There were no significant differences in age, sex, mean IOP, central corneal thickness, and axial length between clusters. However, cluster 2 included more high-tension glaucoma patients and used a significantly greater number of antiglaucoma eye drops than cluster 1. Hierarchical cluster analysis of progression patterns divided OAG into slow and fast progression groups, evidenced by assessing the parameters of glaucomatous progression in VF testing. In the fast progression group, the prevalence of high-tension glaucoma was greater and the number of antiglaucoma medications administered was higher than in the slow progression group. Copyright 2014 The Association for Research in Vision and Ophthalmology, Inc.

  8. Tracking difference in gene expression in a time-course experiment using gene set enrichment analysis.

    Directory of Open Access Journals (Sweden)

    Pui Shan Wong

    Full Text Available Fistulifera sp. strain JPCC DA0580 is a newly sequenced pennate diatom that is capable of simultaneously growing and accumulating lipids. This is a unique trait, not found in other related microalgae so far. It is able to accumulate between 40 and 60% of its cell weight in lipids, making it a strong candidate for the production of biofuel. To investigate this characteristic, we used RNA-Seq data gathered at four different times while Fistulifera sp. strain JPCC DA0580 was grown in oil accumulating and non-oil accumulating conditions. We then adapted gene set enrichment analysis (GSEA) to investigate the relationship between the difference in gene expression of 7,822 genes and metabolic functions in our data. We utilized information in the KEGG pathway database to create the gene sets and changed GSEA to use re-sampling so that data from the different time points could be included in the analysis. Our GSEA method identified photosynthesis, lipid synthesis and amino acid synthesis related pathways as processes that play a significant role in oil production and growth in Fistulifera sp. strain JPCC DA0580. In addition to GSEA, we visualized the results by creating a network of compounds and reactions, and plotted the expression data on top of the network. This made existing graph algorithms available to us, which we then used to calculate a path that metabolizes glucose into triacylglycerol (TAG) in the smallest number of steps. By visualizing the data this way, we observed a separate up-regulation of genes at different times instead of a concerted response. We also identified two metabolic paths that used fewer reactions than the one shown in KEGG and showed that the reactions were up-regulated during the experiment. The combination of analysis and visualization methods successfully analyzed time-course data, identified important metabolic pathways and provided new hypotheses for further research.
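
    The idea of finding the path from glucose to TAG with the smallest number of reaction steps can be sketched with a breadth-first search over a compound network. The abstract does not name the graph algorithm the authors used, so breadth-first search is an assumption here, and the toy network below (compound names and edges) is purely illustrative and is not taken from KEGG or from the study.

```python
from collections import deque

# Hypothetical toy network: each compound maps to the compounds that can be
# reached from it in one reaction step. Names and edges are illustrative only.
REACTIONS = {
    "glucose": ["G6P"],
    "G6P": ["pyruvate", "glycerol-3P"],
    "pyruvate": ["acetyl-CoA"],
    "acetyl-CoA": ["fatty acyl-CoA"],
    "fatty acyl-CoA": ["DAG"],
    "glycerol-3P": ["DAG"],
    "DAG": ["TAG"],
}

def shortest_path(graph, start, goal):
    """Return a path with the fewest steps from start to goal, or None."""
    queue = deque([[start]])   # queue of partial paths, shortest first
    seen = {start}             # compounds already reached
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

path = shortest_path(REACTIONS, "glucose", "TAG")
print(" -> ".join(path))
print("reaction steps:", len(path) - 1)
```

    Because breadth-first search explores paths in order of length, the first path that reaches the goal uses the minimum number of steps. It ignores reaction direction beyond what the edge list encodes, and it does not weight steps by expression level.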

  9. Epigenomic annotation-based interpretation of genomic data: from enrichment analysis to machine learning. (United States)

    Dozmorov, Mikhail G


    One of the goals of functional genomics is to understand the regulatory implications of experimentally obtained genomic regions of interest (ROIs). Most sequencing technologies now generate ROIs distributed across the whole genome. The interpretation of these genome-wide ROIs represents a challenge as the majority of them lie outside of functionally well-defined protein coding regions. Recent efforts by the members of the International Human Epigenome Consortium have generated volumes of functional/regulatory data (reference epigenomic datasets), effectively annotating the genome with epigenomic properties. Consequently, a wide variety of computational tools has been developed utilizing these epigenomic datasets for the interpretation of genomic data. The purpose of this review is to provide a structured overview of practical solutions for the interpretation of ROIs with the help of epigenomic data. Starting with epigenomic enrichment analysis, we discuss leading tools and machine learning methods utilizing epigenomic and 3D genome structure data. The hierarchy of tools and methods reviewed here presents a practical guide for the interpretation of genome-wide ROIs within an epigenomic context. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press. All rights reserved.

  10. OMERACT-based fibromyalgia symptom subgroups: an exploratory cluster analysis. (United States)

    Vincent, Ann; Hoskin, Tanya L; Whipple, Mary O; Clauw, Daniel J; Barton, Debra L; Benzo, Roberto P; Williams, David A


    The aim of this study was to identify subsets of patients with fibromyalgia with similar symptom profiles using the Outcome Measures in Rheumatology (OMERACT) core symptom domains. Female patients with a diagnosis of fibromyalgia and currently meeting fibromyalgia research survey criteria completed the Brief Pain Inventory, the 30-item Profile of Mood States, the Medical Outcomes Sleep Scale, the Multidimensional Fatigue Inventory, the Multiple Ability Self-Report Questionnaire, the Fibromyalgia Impact Questionnaire-Revised (FIQ-R) and the Short Form-36 between 1 June 2011 and 31 October 2011. Hierarchical agglomerative clustering was used to identify subgroups of patients with similar symptom profiles. To validate the results from this sample, hierarchical agglomerative clustering was repeated in an external sample of female patients with fibromyalgia with similar inclusion criteria. A total of 581 females with a mean age of 55.1 (range, 20.1 to 90.2) years were included. A four-cluster solution best fit the data, and each clustering variable differed significantly (P FIQ-R total scores (P = 0.0004)). In our study, we incorporated core OMERACT symptom domains, which allowed for clustering based on a comprehensive symptom profile. Although our exploratory cluster solution needs confirmation in a longitudinal study, this approach could provide a rationale to support the study of individualized clinical evaluation and intervention.

  11. Comparison of Outputs for Variable Combinations Used in Cluster Analysis on Polarmetric Imagery

    National Research Council Canada - National Science Library

    Petre, Melinda


    .... More specifically, two techniques, Cluster Analysis (CA) and Principal Component Analysis (PCA), can be combined to process Stokes imagery by distinguishing between pixels and producing groups of pixels with similar characteristics...

  12. Symptom Clusters in People Living with HIV Attending Five Palliative Care Facilities in Two Sub-Saharan African Countries: A Hierarchical Cluster Analysis. (United States)

    Moens, Katrien; Siegert, Richard J; Taylor, Steve; Namisango, Eve; Harding, Richard


    Symptom research across conditions has historically focused on single symptoms, and the burden of multiple symptoms and their interactions has been relatively neglected, especially in people living with HIV. Symptom cluster studies are required to set priorities in treatment planning, and to lessen the total symptom burden. This study aimed to identify and compare symptom clusters among people living with HIV attending five palliative care facilities in two sub-Saharan African countries. Data from cross-sectional self-report of seven-day symptom prevalence on the 32-item Memorial Symptom Assessment Scale-Short Form were used. A hierarchical cluster analysis was conducted using Ward's method applying squared Euclidean distance as the similarity measure to determine the clusters. Contingency tables, χ² tests and ANOVA were used to compare the clusters by patient-specific characteristics and distress scores. Among the sample (N=217) the mean age was 36.5 (SD 9.0), 73.2% were female, and 49.1% were on antiretroviral therapy (ART). The cluster analysis produced five symptom clusters identified as: 1) dermatological; 2) generalised anxiety and elimination; 3) social and image; 4) persistently present; and 5) a gastrointestinal-related symptom cluster. The patients in the first three symptom clusters reported the highest physical and psychological distress scores. Patient characteristics varied significantly across the five clusters by functional status (worst functional physical status in cluster one, ppeople living with HIV with longitudinally collected symptom data to test cluster stability and identify common symptom trajectories is recommended.
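
    Ward's method with squared Euclidean distance, as named in the abstract, can be sketched as follows: at each step the two clusters whose merger gives the smallest increase in within-cluster sum of squares are joined, where the increase equals n_a * n_b / (n_a + n_b) times the squared Euclidean distance between the two centroids. This is a minimal standard-library sketch, not the authors' software; the data points and the function name ward_clusters are hypothetical.

```python
def sq_euclidean(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def ward_clusters(points, k):
    """Merge points bottom-up with Ward's criterion until k clusters remain."""
    # Each cluster is stored as (member indices, centroid).
    clusters = [([i], list(p)) for i, p in enumerate(points)]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                (mi, ci), (mj, cj) = clusters[i], clusters[j]
                ni, nj = len(mi), len(mj)
                # Increase in within-cluster sum of squares if i and j merge.
                cost = ni * nj / (ni + nj) * sq_euclidean(ci, cj)
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        (mi, ci), (mj, cj) = clusters[i], clusters[j]
        ni, nj = len(mi), len(mj)
        # Centroid of the merged cluster is the size-weighted mean.
        centroid = [(ni * a + nj * b) / (ni + nj) for a, b in zip(ci, cj)]
        clusters[i] = (mi + mj, centroid)
        del clusters[j]
    return [sorted(members) for members, _ in clusters]

# Hypothetical two-dimensional scores for five items.
points = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0)]
groups = ward_clusters(points, 3)
print(groups)
```

    The exhaustive pair search is cubic in the number of items, which is fine for a few hundred observations. For larger data a library implementation would be the practical choice.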

  13. The quantitative analysis of silicon carbide surface smoothing by Ar and Xe cluster ions (United States)

    Ieshkin, A. E.; Kireev, D. S.; Ermakov, Yu. A.; Trifonov, A. S.; Presnov, D. E.; Garshev, A. V.; Anufriev, Yu. V.; Prokhorova, I. G.; Krupenin, V. A.; Chernysh, V. S.


    The gas cluster ion beam (GCIB) technique was used for smoothing the silicon carbide crystal surface. The effect of processing by two inert cluster ions, argon and xenon, was quantitatively compared. While argon is a standard element for GCIB, results for xenon clusters have not yet been reported. Scanning probe microscopy and high resolution transmission electron microscopy techniques were used for the analysis of the surface roughness and surface crystal layer quality. The gas cluster ion beam processing results in surface relief smoothing down to an average roughness of about 1 nm for both elements. It was shown that xenon as the working gas is more effective: the sputtering rate for xenon clusters is 2.5 times higher than for argon at the same beam energy. High resolution transmission electron microscopy analysis of the surface defect layer gives values of 7 ± 2 nm and 8 ± 2 nm for treatment with argon and xenon clusters.

  14. Silver surface enrichment controlled by simultaneous RBS for reliable PIXE analysis of ancient coins

    International Nuclear Information System (INIS)

    Beck, L.; Alloin, E.; Berthier, C.; Reveillon, S.; Costa, V.


    Evidence of silver surface enrichment of ancient silver-copper coins has been pointed out in the past years. Surface enrichment can be fortuitous or intentional. In this paper, we have investigated the cleaning procedures usually performed after excavation or in museums. We have shown that chemicals or commercial products routinely used dissolve preferentially the copper phase and consequently contribute to the silver surface enrichment. As a result, surface analyses such as PIXE or XRF can be strongly affected by this effect. By using simultaneously RBS and PIXE, it is possible to check through the silver surface enrichment and then select the reliable measurements, characteristic of the bulk composition. Results on coins recently discovered and mechanically or chemically cleaned are presented

  15. Immune-related genetic enrichment in frontotemporal dementia: An analysis of genome-wide association studies. (United States)

    Broce, Iris; Karch, Celeste M; Wen, Natalie; Fan, Chun C; Wang, Yunpeng; Tan, Chin Hong; Kouri, Naomi; Ross, Owen A; Höglinger, Günter U; Muller, Ulrich; Hardy, John; Momeni, Parastoo; Hess, Christopher P; Dillon, William P; Miller, Zachary A; Bonham, Luke W; Rabinovici, Gil D; Rosen, Howard J; Schellenberg, Gerard D; Franke, Andre; Karlsen, Tom H; Veldink, Jan H; Ferrari, Raffaele; Yokoyama, Jennifer S; Miller, Bruce L; Andreassen, Ole A; Dale, Anders M; Desikan, Rahul S; Sugrue, Leo P


    Converging evidence suggests that immune-mediated dysfunction plays an important role in the pathogenesis of frontotemporal dementia (FTD). Although genetic studies have shown that immune-associated loci are associated with increased FTD risk, a systematic investigation of genetic overlap between immune-mediated diseases and the spectrum of FTD-related disorders has not been performed. Using large genome-wide association studies (GWASs) (total n = 192,886 cases and controls) and recently developed tools to quantify genetic overlap/pleiotropy, we systematically identified single nucleotide polymorphisms (SNPs) jointly associated with FTD-related disorders-namely, FTD, corticobasal degeneration (CBD), progressive supranuclear palsy (PSP), and amyotrophic lateral sclerosis (ALS)-and 1 or more immune-mediated diseases including Crohn disease, ulcerative colitis (UC), rheumatoid arthritis (RA), type 1 diabetes (T1D), celiac disease (CeD), and psoriasis. We found up to 270-fold genetic enrichment between FTD and RA, up to 160-fold genetic enrichment between FTD and UC, up to 180-fold genetic enrichment between FTD and T1D, and up to 175-fold genetic enrichment between FTD and CeD. In contrast, for CBD and PSP, only 1 of the 6 immune-mediated diseases produced genetic enrichment comparable to that seen for FTD, with up to 150-fold genetic enrichment between CBD and CeD and up to 180-fold enrichment between PSP and RA. Further, we found minimal enrichment between ALS and the immune-mediated diseases tested, with the highest levels of enrichment between ALS and RA (up to 20-fold). For FTD, at a conjunction false discovery rate enriched in microglia/macrophages compared to other central nervous system cell types. The main study limitation is that the results represent only clinically diagnosed individuals. Also, given the complex interconnectedness of the HLA region, we were not able to define the specific gene or genes on Chr 6 responsible for our pleiotropic signal. We

  16. Optimized biotin-hydrazide enrichment and mass spectrometry analysis of peptide carbonyls

    DEFF Research Database (Denmark)

    Havelund, Jesper F.; Wojdyla, K; Jensen, O. N.

    Irreversible cell damage through protein carbonylation is the result of reaction with reactive oxygen species (ROS) and has been coupled to many diseases. The precise molecular consequences of protein carbonylation, however, are still not clear. The localization of the carbonylated amino acid is ...... modifications are isobaric to carbonylation and it is often challenging to detect the weaker signal from carbonylated peptides necessitating enrichment step. We here present an optimized method for the enrichment of carbonylated peptides....

  17. [Principal component analysis and cluster analysis of inorganic elements in sea cucumber Apostichopus japonicus]. (United States)

    Liu, Xiao-Fang; Xue, Chang-Hu; Wang, Yu-Ming; Li, Zhao-Jie; Xue, Yong; Xu, Jie


    The present study is to investigate the feasibility of multi-elements analysis in determination of the geographical origin of sea cucumber Apostichopus japonicus, and to make choice of the effective tracers in sea cucumber Apostichopus japonicus geographical origin assessment. The content of the elements such as Al, V, Cr, Mn, Fe, Co, Ni, Cu, Zn, As, Se, Mo, Cd, Hg and Pb in sea cucumber Apostichopus japonicus samples from seven places of geographical origin were determined by means of ICP-MS. The results were used for the development of elements database. Cluster analysis(CA) and principal component analysis (PCA) were applied to differentiate the sea cucumber Apostichopus japonicus geographical origin. Three principal components which accounted for over 89% of the total variance were extracted from the standardized data. The results of Q-type cluster analysis showed that the 26 samples could be clustered reasonably into five groups, the classification results were significantly associated with the marine distribution of the sea cucumber Apostichopus japonicus samples. The CA and PCA were the effective methods for elements analysis of sea cucumber Apostichopus japonicus samples. The content of the mineral elements in sea cucumber Apostichopus japonicus samples was good chemical descriptors for differentiating their geographical origins.

  18. Global myeloma research clusters, output, and citations: a bibliometric mapping and clustering analysis.

    Directory of Open Access Journals (Sweden)

    Jens Peter Andersen

    Full Text Available International collaborative research is a mechanism for improving the development of disease-specific therapies and for improving health at the population level. However, limited data are available to assess the trends in research output related to orphan diseases. We used bibliometric mapping and clustering methods to illustrate the level of fragmentation in myeloma research and the development of collaborative efforts. Publication data from Thomson Reuters Web of Science were retrieved for 2005-2009 and followed until 2013. We created a database of multiple myeloma publications, and we analysed impact and co-authorship density to identify scientific collaborations, developments, and international key players over time. The global annual publication volume for studies on multiple myeloma increased from 1,144 in 2005 to 1,628 in 2009, which represents a 43% increase. This increase is high compared to the 24% and 14% increases observed for lymphoma and leukaemia. The major proportion (>90%) of publications was from the US and EU over the study period. The output and impact in terms of citations identified several successful groups with a large number of intra-cluster collaborations in the US and EU. The US-based myeloma clusters clearly stand out as the most productive and highly cited, and the European Myeloma Network members exhibited a doubling of collaborative publications from 2005 to 2009, still increasing up to 2013. Multiple myeloma research output has increased substantially in the past decade. The fragmented European myeloma research activities based on national or regional groups are progressing, but they require a broad range of targeted research investments to improve multiple myeloma health care.

  19. Topic modeling for cluster analysis of large biological and medical datasets. (United States)

    Zhao, Weizhong; Zou, Wen; Chen, James J


    The big data moniker is nowhere better deserved than to describe the ever-increasing prodigiousness and complexity of biological and medical datasets. New methods are needed to generate and test hypotheses, foster biological interpretation, and build validated predictors. Although multivariate techniques such as cluster analysis may allow researchers to identify groups, or clusters, of related variables, the accuracies and effectiveness of traditional clustering methods diminish for large and hyper dimensional datasets. Topic modeling is an active research field in machine learning and has been mainly used as an analytical tool to structure large textual corpora for data mining. Its ability to reduce high dimensionality to a small number of latent variables makes it suitable as a means for clustering or overcoming clustering difficulties in large biological and medical datasets. In this study, three topic model-derived clustering methods, highest probable topic assignment, feature selection and feature extraction, are proposed and tested on the cluster analysis of three large datasets: Salmonella pulsed-field gel electrophoresis (PFGE) dataset, lung cancer dataset, and breast cancer dataset, which represent various types of large biological or medical datasets. All three various methods are shown to improve the efficacy/effectiveness of clustering results on the three datasets in comparison to traditional methods. A preferable cluster analysis method emerged for each of the three datasets on the basis of replicating known biological truths. Topic modeling could be advantageously applied to the large datasets of biological or medical research. The three proposed topic model-derived clustering methods, highest probable topic assignment, feature selection and feature extraction, yield clustering improvements for the three different data types. Clusters more efficaciously represent truthful groupings and subgroupings in the data than traditional methods, suggesting
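
    Of the three topic model-derived methods named above, highest probable topic assignment is the simplest to illustrate: each sample is placed in the cluster of the topic with the largest probability in its topic distribution. The sketch below is one plausible reading of that description, not the authors' code; the document-topic matrix is made up, and fitting the topic model itself is outside the scope of the example.

```python
# Hypothetical document-topic probabilities, one row per sample, as a fitted
# topic model might produce. The values are invented for illustration.
doc_topic = [
    [0.70, 0.20, 0.10],
    [0.15, 0.25, 0.60],
    [0.05, 0.90, 0.05],
    [0.55, 0.30, 0.15],
]

def assign_clusters(doc_topic):
    """Label each sample with the index of its most probable topic."""
    return [max(range(len(row)), key=row.__getitem__) for row in doc_topic]

labels = assign_clusters(doc_topic)
print(labels)
```

    With this rule the number of clusters is at most the number of topics, and ties are resolved in favour of the lowest topic index.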

  20. Clinical Characteristics of Exacerbation-Prone Adult Asthmatics Identified by Cluster Analysis. (United States)

    Kim, Mi Ae; Shin, Seung Woo; Park, Jong Sook; Uh, Soo Taek; Chang, Hun Soo; Bae, Da Jeong; Cho, You Sook; Park, Hae Sim; Yoon, Ho Joo; Choi, Byoung Whui; Kim, Yong Hoon; Park, Choon Sik


    Asthma is a heterogeneous disease characterized by various types of airway inflammation and obstruction. Therefore, it is classified into several subphenotypes, such as early-onset atopic, obese non-eosinophilic, benign, and eosinophilic asthma, using cluster analysis. A number of asthmatics frequently experience exacerbation over a long-term follow-up period, but the exacerbation-prone subphenotype has rarely been evaluated by cluster analysis. This prompted us to identify clusters reflecting asthma exacerbation. A uniform cluster analysis method was applied to 259 adult asthmatics who were regularly followed-up for over 1 year using 12 variables, selected on the basis of their contribution to asthma phenotypes. After clustering, clinical profiles and exacerbation rates during follow-up were compared among the clusters. Four subphenotypes were identified: cluster 1 was comprised of patients with early-onset atopic asthma with preserved lung function, cluster 2 late-onset non-atopic asthma with impaired lung function, cluster 3 early-onset atopic asthma with severely impaired lung function, and cluster 4 late-onset non-atopic asthma with well-preserved lung function. The patients in clusters 2 and 3 were identified as exacerbation-prone asthmatics, showing a higher risk of asthma exacerbation. Two different phenotypes of exacerbation-prone asthma were identified among Korean asthmatics using cluster analysis; both were characterized by impaired lung function, but the age at asthma onset and atopic status were different between the two. Copyright © 2017 The Korean Academy of Asthma, Allergy and Clinical Immunology · The Korean Academy of Pediatric Allergy and Respiratory Disease

  1. Cluster analysis of tropical cyclone tracks in the Southern Hemisphere

    Energy Technology Data Exchange (ETDEWEB)

    Ramsay, Hamish A. [Monash University, Monash Weather and Climate, School of Mathematical Sciences, Clayton, VIC (Australia); Camargo, Suzana J.; Kim, Daehyun [Columbia University, Lamont-Doherty Earth Observatory, Palisades, NY (United States)


    A probabilistic clustering method is used to describe various aspects of tropical cyclone (TC) tracks in the Southern Hemisphere, for the period 1969-2008. A total of 7 clusters are examined: three in the South Indian Ocean, three in the Australian Region, and one in the South Pacific Ocean. Large-scale environmental variables related to TC genesis in each cluster are explored, including sea surface temperature, low-level relative vorticity, deep-layer vertical wind shear, outgoing longwave radiation, El Nino-Southern Oscillation (ENSO) and the Madden-Julian Oscillation (MJO). Composite maps, constructed 2 days prior to genesis, show some of these to be significant precursors to TC formation - most prominently, westerly wind anomalies equatorward of the main development regions. Clusters are also evaluated with respect to their genesis location, seasonality, mean peak intensity, track duration, landfall location, and intensity at landfall. ENSO is found to play a significant role in modulating annual frequency and mean genesis location in three of the seven clusters (two in the South Indian Ocean and one in the Pacific). The ENSO-modulating effect on genesis frequency is caused primarily by changes in low-level zonal flow between the equator and 10°S, and associated relative vorticity changes in the main development regions. ENSO also has a significant effect on mean genesis location in three clusters, with TCs forming further equatorward (poleward) during El Nino (La Nina) in addition to large shifts in mean longitude. The MJO has a strong influence on TC genesis in all clusters, though the amount of modulation is found to be sensitive to the definition of the MJO.

  2. A cross-study gene set enrichment analysis identifies critical pathways in endometriosis

    Directory of Open Access Journals (Sweden)

    Bai Chunyan


    Full Text Available Abstract Background Endometriosis is an enigmatic disease. Gene expression profiling of endometriosis has been used in several studies, but few studies went further to classify subtypes of endometriosis based on expression patterns and to identify possible pathways involved in endometriosis. Some of the observed pathways are more inconsistent between the studies, and these candidate pathways presumably only represent a fraction of the pathways involved in endometriosis. Methods We applied a standardised microarray preprocessing and gene set enrichment analysis to six independent studies, and demonstrated increased concordance between these gene datasets. Results We find 16 up-regulated and 19 down-regulated pathways common in ovarian endometriosis data sets, 22 up-regulated and one down-regulated pathway common in peritoneal endometriosis data sets. Among them, 12 up-regulated and 1 down-regulated were found consistent between ovarian and peritoneal endometriosis. The main canonical pathways identified are related to immunological and inflammatory disease. Early secretory phase has the most over-represented pathways in the three uterine cycle phases. There are no overlapping significant pathways between the dataset from human endometrial endothelial cells and the datasets from ovarian endometriosis which used whole tissues. Conclusion The study of complex diseases through pathway analysis is able to highlight genes weakly connected to the phenotype which may be difficult to detect by using classical univariate statistics. By standardised microarray preprocessing and GSEA, we have increased the concordance in identifying many biological mechanisms involved in endometriosis. The identified gene pathways will shed light on the understanding of endometriosis and promote the development of novel therapies.

  3. A study on friability, hardness and fiber content analysis of fiber enriched milk tablet (United States)

    Suzihaque, M. U. H.; Irfan, M. H.; Ibrahim, U. K.


    This study was performed to analyze the friability, hardness and fiber content of fiber enriched milk tablets derived from five different local fiber sources: carrot, spinach, dragon fruit, mango and watermelon. Cow milk was mixed in to complement the tablet as a protein source. The powders were spray dried at 100°C, 120°C and 140°C and freeze dried at -60°C. The mixture of fruits and milk was made in equal ratio with the addition of 15 maltodextrin as a carrier. The tablets formed were used for the friability and hardness tests, while the dried powder was used for fiber content analysis. The dragon fruit tablet dried at 140°C had the highest friability, with 11.42 of weight loss. The second highest friability was that of the spinach tablet dried at 100°C and 120°C drying temperature, with 9.30 and 9.28 respectively. The lowest friability was exhibited by carrot, mango and watermelon tablets at 100°C, dragon fruit at 120°C, and carrot and spinach at 140°C. In contrast, none of the freeze dried tablets showed any weight loss, hence they are not friable. In the hardness test, all of the freeze dried tablets showed higher tensile strength than the spray dried ones; carrot was the highest at 2.27 Newton and spray dried mango the lowest at 0.16 Newton. In fiber content analysis, freeze dried mango had the highest fiber content, followed by freeze dried carrot and 140°C spray dried carrot. It can be concluded that the higher the spray dry temperature, the more friable the tablet, while high friability leads to lower hardness of tablets. In terms of fiber content, the higher the spray dry temperature, the lower the fiber content found.

  4. Methodology of comparative statistical analysis of Russian industry based on cluster analysis

    Directory of Open Access Journals (Sweden)

    Sergey S. Shishulin


    Full Text Available The article explores the possibilities of applying multidimensional statistical analysis to the study of industrial production by comparing its growth rates and structure with those of other developed and developing countries. The purpose of the article is to determine the optimal set of statistical methods, and the results of their application to industrial production data, that would give the best access to the analysis of the result. The data include such indicators as output, gross value added, the number of employed and other indicators of the system of national accounts and operational business statistics. The objects of observation are the industries of the countries of the Customs Union, the United States, Japan and Europe in 2005-2015. The research tools used are the simplest methods of transformation, graphical and tabular visualization of data, and methods of statistical analysis. In particular, using a specialized software package (SPSS), the principal components method, discriminant analysis, hierarchical methods of cluster analysis, Ward's method and k-means were applied. Applying the principal components method to the initial data makes it possible to substantially and effectively reduce the initial space of industrial production data. For example, in analyzing the structure of industrial production, the reduction was from fifteen industries to three basic, well-interpreted factors: the relatively extractive industries (with a low degree of processing), high-tech industries, and consumer goods (medium-technology sectors). At the same time, a comparison of the results of applying cluster analysis to the initial data and to the data obtained from the principal components method established that clustering industrial production data on the basis of the new factors significantly improves the results of clustering. As a result of analyzing the parameters of

  5. Cluster-cluster clustering

    International Nuclear Information System (INIS)

    Barnes, J.; Dekel, A.; Efstathiou, G.; Frenk, C.S. (Yale Univ., New Haven, CT; California Univ., Santa Barbara; Cambridge Univ., England; Sussex Univ., Brighton, England)


    The cluster correlation function $\xi_c(r)$ is compared with the particle correlation function $\xi(r)$ in cosmological N-body simulations with a wide range of initial conditions. The experiments include scale-free initial conditions, pancake models with a coherence length in the initial density field, and hybrid models. Three N-body techniques and two cluster-finding algorithms are used. In scale-free models with white noise initial conditions, $\xi_c$ and $\xi$ are essentially identical. In scale-free models with more power on large scales, it is found that the amplitude of $\xi_c$ increases with cluster richness; in this case the clusters give a biased estimate of the particle correlations. In the pancake and hybrid models (with n = 0 or 1), $\xi_c$ is steeper than $\xi$, but the cluster correlation length exceeds that of the points by less than a factor of 2, independent of cluster richness. Thus the high amplitude of $\xi_c$ found in studies of rich clusters of galaxies is inconsistent with white noise and pancake models and may indicate a primordial fluctuation spectrum with substantial power on large scales. 30 references

  6. Cluster Analysis of Acute Care Use Yields Insights for Tailored Pediatric Asthma Interventions. (United States)

    Abir, Mahshid; Truchil, Aaron; Wiest, Dawn; Nelson, Daniel B; Goldstick, Jason E; Koegel, Paul; Lozon, Marie M; Choi, Hwajung; Brenner, Jeffrey


    We undertake this study to understand patterns of pediatric asthma-related acute care use to inform interventions aimed at reducing potentially avoidable hospitalizations. Hospital claims data from 3 Camden city facilities for 2010 to 2014 were used to perform cluster analysis classifying patients aged 0 to 17 years according to their asthma-related hospital use. Clusters were based on 2 variables: asthma-related ED visits and hospitalizations. Demographics and a number of sociobehavioral and use characteristics were compared across clusters. Children who met the criteria (3,170) were included in the analysis. An examination of a scree plot showing the decline in within-cluster heterogeneity as the number of clusters increased confirmed that clusters of pediatric asthma patients according to hospital use exist in the data. Five clusters of patients with distinct asthma-related acute care use patterns were observed. Cluster 1 (62% of patients) showed the lowest rates of acute care use. These patients were least likely to have a mental health-related diagnosis, were less likely to have visited multiple facilities, and had no hospitalizations for asthma. Cluster 2 (19% of patients) had a low number of asthma ED visits and onetime hospitalization. Cluster 3 (11% of patients) had a high number of ED visits and low hospitalization rates, and the highest rates of multiple facility use. Cluster 4 (7% of patients) had moderate ED use for both asthma and other illnesses, and high rates of asthma hospitalizations; nearly one quarter received care at all facilities, and 1 in 10 had a mental health diagnosis. Cluster 5 (1% of patients) had extreme rates of acute care use. Differences observed between groups across multiple sociobehavioral factors suggest these clusters may represent children who differ along multiple dimensions, in addition to patterns of service use, with implications for tailored interventions. Copyright © 2017 American College of Emergency Physicians
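
    The scree-plot step described above can be sketched as follows. This is a minimal illustration, assuming plain k-means on two count variables; the patient pairs, the function names and the number of restarts are all hypothetical and are not taken from the study.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans_centers(points, k, iters=100, seed=0):
    """Lloyd's k-means; returns the final list of cluster centers."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # start from k observations
    for _ in range(iters):
        # Assignment step: index of the nearest center for every point.
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(centers[c])  # leave an empty cluster's center in place
        if new_centers == centers:
            break
        centers = new_centers
    return centers

def wcss(points, centers):
    """Within-cluster sum of squares: each point to its nearest center."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points)

def scree_curve(points, max_k, restarts=5):
    """Best WCSS over a few random restarts for k = 1..max_k."""
    return [
        min(wcss(points, kmeans_centers(points, k, seed=s)) for s in range(restarts))
        for k in range(1, max_k + 1)
    ]

# Hypothetical (ED visits, hospitalizations) pairs, not data from the study.
patients = [(0, 0), (1, 0), (0, 1), (1, 1), (8, 0), (9, 1), (10, 0), (5, 4), (6, 5), (7, 4)]
curve = scree_curve(patients, max_k=5)
for k, value in enumerate(curve, start=1):
    print(k, round(value, 2))
```

    Plotting the printed values against k gives the scree curve; the point where the decline flattens is the usual informal guide to the number of clusters.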

  7. Assessment of Random Assignment in Training and Test Sets using Generalized Cluster Analysis Technique

    Directory of Open Access Journals (Sweden)

    Sorana D. BOLBOACĂ


    Full Text Available Aim: The properness of random assignment of compounds in training and validation sets was assessed using the generalized cluster technique. Material and Method: A quantitative Structure-Activity Relationship model using Molecular Descriptors Family on Vertices was evaluated in terms of assignment of carboquinone derivatives in training and test sets during the leave-many-out analysis. Assignment of compounds was investigated using five variables: observed anticancer activity and four structure descriptors. Generalized cluster analysis with the K-means algorithm was applied in order to investigate whether or not the assignment of compounds was proper. The Euclidean distance and maximization of the initial distance using a cross-validation with a v-fold of 10 were applied. Results: All five variables included in the analysis proved to make a statistically significant contribution to the identification of clusters. Three clusters were identified, each of them containing carboquinone derivatives belonging to both the training and the test sets. The observed activity of carboquinone derivatives proved to be normally distributed in every cluster. The presence of training and test sets in all clusters identified using generalized cluster analysis with the K-means algorithm and the distribution of observed activity within clusters sustain a proper assignment of compounds in training and test sets. Conclusion: Generalized cluster analysis using the K-means algorithm proved to be a valid method in assessment of random assignment of carboquinone derivatives in training and test sets.
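
    The check that every cluster contains both training and test compounds can be sketched as below. The labels are hypothetical and the helper names are illustrative assumptions; the cluster labels would come from whatever clustering was actually run.

```python
from collections import Counter

def split_by_cluster(cluster_labels, set_labels):
    """Count how many training and test members fall into each cluster."""
    counts = {}
    for cluster, subset in zip(cluster_labels, set_labels):
        counts.setdefault(cluster, Counter())[subset] += 1
    return counts

def every_cluster_mixed(counts, required=("training", "test")):
    """True when each cluster holds at least one member of every required set."""
    return all(all(c[name] > 0 for name in required) for c in counts.values())

# Hypothetical cluster labels and set membership for six compounds.
clusters = [0, 0, 1, 1, 2, 2]
subsets = ["training", "test", "training", "test", "training", "test"]
counts = split_by_cluster(clusters, subsets)
mixed = every_cluster_mixed(counts)
print(counts, mixed)
```

    A cluster holding only training or only test compounds would make `every_cluster_mixed` return False, which is the situation the study's check is meant to rule out.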

  8. Cluster analysis in severe emphysema subjects using phenotype and genotype data: an exploratory investigation

    Directory of Open Access Journals (Sweden)

    Martinez Fernando J


    Full Text Available Abstract Background Numerous studies have demonstrated associations between genetic markers and COPD, but results have been inconsistent. One reason may be heterogeneity in disease definition. Unsupervised learning approaches may assist in understanding disease heterogeneity. Methods We selected 31 phenotypic variables and 12 SNPs from five candidate genes in 308 subjects in the National Emphysema Treatment Trial (NETT) Genetics Ancillary Study cohort. We used factor analysis to select a subset of phenotypic variables, and then used cluster analysis to identify subtypes of severe emphysema. We examined the phenotypic and genotypic characteristics of each cluster. Results We identified six factors accounting for 75% of the shared variability among our initial phenotypic variables. We selected four phenotypic variables from these factors for cluster analysis: (1) post-bronchodilator FEV1 percent predicted, (2) percent bronchodilator responsiveness, and quantitative CT measurements of (3) apical emphysema and (4) airway wall thickness. K-means cluster analysis revealed four clusters, though separation between clusters was modest: (1) emphysema predominant; (2) bronchodilator responsive, with higher FEV1; (3) discordant, with a lower FEV1 despite less severe emphysema and lower airway wall thickness; and (4) airway predominant. Of the genotypes examined, membership in cluster 1 (emphysema-predominant) was associated with TGFB1 SNP rs1800470. Conclusions Cluster analysis may identify meaningful disease subtypes and/or groups of related phenotypic variables even in a highly selected group of severe emphysema subjects, and may be useful for genetic association studies.

  9. Implementation trial of high performance trace analysis/environmental sampling (HPTA/ES) in uranium centrifuge enrichment plants

    International Nuclear Information System (INIS)

    Nackaerts, H.; Kloeckner, W.; Landresse, G.; MacLean, F.; Betti, M.; Forcina, V.; Hiernaut, T.; Tamborini, G.; Koch, L.; Schenkel, R.


    Field trials have demonstrated that the analysis of particles upon swipes obtained from inside nuclear installations provides clear signatures of past operations in that installation. This can offer a valuable tool for gaining assurance regarding the compliance with declared activities and the absence of undeclared activities (e.g. enrichment, reprocessing, and reactor operation) at such sites. This method, known as 'Environmental Sampling' (ES) or 'High Performance Trace Analysis' (HPTA) in EURATOM terminology, is at present being evaluated by the EURATOM Safeguards Directorate (ESD) in order to assess its possible use in nuclear installations within the European Union. It is expected that incorporation of HPTA/ES of sample collection and analysis into routine inspection activities will allow EURATOM to improve the effectiveness of safeguards in these installations and hopefully save inspection resources as well. The EURATOM Safeguards Directorate has therefore performed implementation trials involving the collection of particles by the so-called swipe sampling method in uranium centrifuge enrichment plants and hot cells in the European Union. These samples were subsequently analysed by the Joint Research Centre, Institute for Transuranium Elements (ITU) in Karlsruhe. Sampling points were chosen on the basis of the activities performed in the vicinity and by considering the possible ways through which particles are released, diffused and transported. 
The aim was to test the efficiency of the method as regards: the collection of enough representative material; the identification of a large enough number of uranium particles; the accurate measurement of the enrichment of the uranium particles found on the swipe; the representativity of the results in respect of past activities in the plant; the capability of detecting whether highly enriched uranium has been produced, used or occasionally transported in a location where low enriched uranium is routinely produced in

  10. Development and optimization of SPECT gated blood pool cluster analysis for the prediction of CRT outcome

    Energy Technology Data Exchange (ETDEWEB)

    Lalonde, Michel; Wassenaar, Richard [Department of Physics, Carleton University, Ottawa, Ontario K1S 5B6 (Canada)]; Wells, R. Glenn; Birnie, David; Ruddy, Terrence D. [Division of Cardiology, University of Ottawa Heart Institute, Ottawa, Ontario K1Y 4W7 (Canada)]


    Purpose: Phase analysis of single photon emission computed tomography (SPECT) radionuclide angiography (RNA) has been investigated for its potential to predict the outcome of cardiac resynchronization therapy (CRT). However, phase analysis may be limited in its potential at predicting CRT outcome as valuable information may be lost by assuming that time-activity curves (TAC) follow a simple sinusoidal shape. A new method, cluster analysis, is proposed which directly evaluates the TACs and may lead to a better understanding of dyssynchrony patterns and CRT outcome. Cluster analysis algorithms were developed and optimized to maximize their ability to predict CRT response. Methods: About 49 patients (N = 27 ischemic etiology) received a SPECT RNA scan as well as positron emission tomography (PET) perfusion and viability scans prior to undergoing CRT. A semiautomated algorithm sampled the left ventricle wall to produce 568 TACs from SPECT RNA data. The TACs were then subjected to two different cluster analysis techniques, K-means, and normal average, where several input metrics were also varied to determine the optimal settings for the prediction of CRT outcome. Each TAC was assigned to a cluster group based on the comparison criteria and global and segmental cluster size and scores were used as measures of dyssynchrony and used to predict response to CRT. A repeated random twofold cross-validation technique was used to train and validate the cluster algorithm. Receiver operating characteristic (ROC) analysis was used to calculate the area under the curve (AUC) and compare results to those obtained for SPECT RNA phase analysis and PET scar size analysis methods. Results: Using the normal average cluster analysis approach, the septal wall produced statistically significant results for predicting CRT results in the ischemic population (ROC AUC = 0.73;p < 0.05 vs. equal chance ROC AUC = 0.50) with an optimal operating point of 71% sensitivity and 60% specificity. 
Cluster

  11. Development and optimization of SPECT gated blood pool cluster analysis for the prediction of CRT outcome

    International Nuclear Information System (INIS)

    Lalonde, Michel; Wassenaar, Richard; Wells, R. Glenn; Birnie, David; Ruddy, Terrence D.


    Purpose: Phase analysis of single photon emission computed tomography (SPECT) radionuclide angiography (RNA) has been investigated for its potential to predict the outcome of cardiac resynchronization therapy (CRT). However, phase analysis may be limited in its potential at predicting CRT outcome as valuable information may be lost by assuming that time-activity curves (TAC) follow a simple sinusoidal shape. A new method, cluster analysis, is proposed which directly evaluates the TACs and may lead to a better understanding of dyssynchrony patterns and CRT outcome. Cluster analysis algorithms were developed and optimized to maximize their ability to predict CRT response. Methods: About 49 patients (N = 27 ischemic etiology) received a SPECT RNA scan as well as positron emission tomography (PET) perfusion and viability scans prior to undergoing CRT. A semiautomated algorithm sampled the left ventricle wall to produce 568 TACs from SPECT RNA data. The TACs were then subjected to two different cluster analysis techniques, K-means, and normal average, where several input metrics were also varied to determine the optimal settings for the prediction of CRT outcome. Each TAC was assigned to a cluster group based on the comparison criteria and global and segmental cluster size and scores were used as measures of dyssynchrony and used to predict response to CRT. A repeated random twofold cross-validation technique was used to train and validate the cluster algorithm. Receiver operating characteristic (ROC) analysis was used to calculate the area under the curve (AUC) and compare results to those obtained for SPECT RNA phase analysis and PET scar size analysis methods. Results: Using the normal average cluster analysis approach, the septal wall produced statistically significant results for predicting CRT results in the ischemic population (ROC AUC = 0.73;p < 0.05 vs. equal chance ROC AUC = 0.50) with an optimal operating point of 71% sensitivity and 60% specificity. 
Cluster
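
    The ROC quantities quoted in the two records above (AUC, and sensitivity and specificity at an operating point) can be computed as in the sketch below. It assumes a single numeric score per patient and a binary response label; the scores, labels and threshold are hypothetical, and the rank-based AUC is a standard formulation rather than the authors' stated procedure.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a responder outscores a non-responder (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def sensitivity_specificity(scores, labels, threshold):
    """Classify score >= threshold as a predicted responder and tally the outcomes."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical cluster-based scores and CRT response labels (1 = responder).
scores = [0.9, 0.7, 0.6, 0.3, 0.8, 0.4, 0.2]
labels = [1, 1, 1, 1, 0, 0, 0]
auc = roc_auc(scores, labels)
sens, spec = sensitivity_specificity(scores, labels, threshold=0.5)
print(auc, sens, spec)
```

    An AUC of 0.50 corresponds to the "equal chance" baseline mentioned in the abstract; sweeping the threshold and picking the best trade-off gives an operating point of the kind reported there.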

  12. Nurses' beliefs about nursing diagnosis: A study with cluster analysis. (United States)

    D'Agostino, Fabio; Pancani, Luca; Romero-Sánchez, José Manuel; Lumillo-Gutierrez, Iris; Paloma-Castro, Olga; Vellone, Ercole; Alvaro, Rosaria


    To identify clusters of nurses in relation to their beliefs about nursing diagnosis among two populations (Italian and Spanish); to investigate differences among clusters of nurses in each population considering the nurses' socio-demographic data, attitudes towards nursing diagnosis, intentions to make nursing diagnosis and actual behaviours in making nursing diagnosis. Nurses' beliefs concerning nursing diagnosis can influence its use in practice but this is still unclear. A cross-sectional design. A convenience sample of nurses in Italy and Spain was enrolled. Data were collected between 2014-2015 using tools, that is, a socio-demographic questionnaire and behavioural, normative and control beliefs, attitudes, intentions and behaviours scales. The sample included 499 nurses (272 Italians & 227 Spanish). Of these, 66.5% of the Italian and 90.7% of the Spanish sample were female. The mean age was 36.5 and 45.2 years old in the Italian and Spanish sample respectively. Six clusters of nurses were identified in Spain and four in Italy. Three clusters were similar among the two populations. Similar significant associations between age, years of work, attitudes towards nursing diagnosis, intentions to make nursing diagnosis and behaviours in making nursing diagnosis and cluster membership in each population were identified. Belief profiles identified unique subsets of nurses that have distinct characteristics. Categorizing nurses by belief patterns may help administrators and educators to tailor interventions aimed at improving nursing diagnosis use in practice. © 2018 John Wiley & Sons Ltd.

  13. Cluster Analysis of Customer Reviews Extracted from Web Pages

    Directory of Open Access Journals (Sweden)

    S. Shivashankar


    Full Text Available As e-commerce gains popularity day by day, the web has become an excellent source for market researchers to gather customer reviews and opinions. The number of customer reviews that a product receives is growing at a very fast rate (it could be in the hundreds or thousands). Customer reviews posted on websites vary greatly in quality. A potential customer has to read all the reviews, irrespective of their quality, to decide whether or not to purchase the product. In this paper, we attempt to assess a review based on its quality, to help the customer make a proper buying decision. The quality of a customer review is assessed as most significant, more significant, significant or insignificant. A novel and effective web mining technique is proposed for assessing a customer review of a particular product based on feature clustering techniques, namely the k-means method and the fuzzy c-means method. This is performed in three steps: (1) identify review regions and extract reviews from them, (2) extract and cluster the features of reviews by a clustering technique and then assign weights to the features belonging to each of the clusters (groups), and (3) assess the review by considering the feature weights and group belongingness. The k-means and fuzzy c-means clustering techniques are implemented and tested on customer reviews extracted from web pages. The performance of these techniques is analyzed.
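
    The difference between the two clustering techniques named above can be illustrated with a short sketch. The abstract does not give its update rules, so this assumes the standard fuzzy c-means membership formula with fuzzifier m; the feature vector, the centers and the function names are hypothetical.

```python
import math

def fcm_memberships(point, centers, m=2.0):
    """Standard fuzzy c-means memberships of one point in every cluster (they sum to 1)."""
    dists = [math.dist(point, c) for c in centers]
    zeros = [i for i, d in enumerate(dists) if d == 0.0]
    if zeros:
        # A point sitting exactly on a center belongs fully to that center.
        return [1.0 / len(zeros) if i in zeros else 0.0 for i in range(len(centers))]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** power for d_k in dists) for d_i in dists]

def kmeans_assignment(point, centers):
    """Hard k-means assignment: index of the nearest center."""
    return min(range(len(centers)), key=lambda i: math.dist(point, centers[i]))

# Hypothetical two-dimensional feature vector and two cluster centers.
feature = (1.0, 0.0)
centers = [(0.0, 0.0), (3.0, 0.0)]
memberships = fcm_memberships(feature, centers)
hard = kmeans_assignment(feature, centers)
print(memberships, hard)
```

    With k-means a feature belongs to exactly one group, whereas fuzzy c-means yields a graded "group belongingness" per cluster, which is the kind of quantity step (3) could weigh.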

  14. Identification and comparative analysis of the protocadherin cluster in a reptile, the green anole lizard.

    Directory of Open Access Journals (Sweden)

    Xiao-Juan Jiang

    Full Text Available BACKGROUND: The vertebrate protocadherins are a subfamily of cell adhesion molecules that are predominantly expressed in the nervous system and are believed to play an important role in establishing the complex neural network during animal development. Genes encoding these molecules are organized into a cluster in the genome. Comparative analysis of the protocadherin subcluster organization and gene arrangements in different vertebrates has provided interesting insights into the history of vertebrate genome evolution. Among tetrapods, protocadherin clusters have been fully characterized only in mammals. In this study, we report the identification and comparative analysis of the protocadherin cluster in a reptile, the green anole lizard (Anolis carolinensis). METHODOLOGY/PRINCIPAL FINDINGS: We show that the anole protocadherin cluster spans over a megabase and encodes a total of 71 genes. The number of genes in the anole protocadherin cluster is significantly higher than that in the coelacanth (49 genes) and mammalian (54-59 genes) clusters. The anole protocadherin genes are organized into four subclusters: the delta, alpha, beta and gamma. This subcluster organization is identical to that of the coelacanth protocadherin cluster, but differs from the mammalian clusters which lack the delta subcluster. The gene number expansion in the anole protocadherin cluster is largely due to the extensive gene duplication in the gammab subgroup. Similar to coelacanth and elephant shark protocadherin genes, the anole protocadherin genes have experienced a low frequency of gene conversion. CONCLUSIONS/SIGNIFICANCE: Our results suggest that similar to the protocadherin clusters in other vertebrates, the evolution of anole protocadherin cluster is driven mainly by lineage-specific gene duplications and degeneration. 
Our analysis also shows that loss of the protocadherin delta subcluster in the mammalian lineage occurred after the divergence of mammals and reptiles

  15. Clustering analysis of malware behavior using Self Organizing Map

    DEFF Research Database (Denmark)

    Pirscoveanu, Radu-Stefan; Stevanovic, Matija; Pedersen, Jens Myrup


    For the time being, malware behavioral classification is performed by means of Anti-Virus (AV) generated labels. The paper investigates the inconsistencies associated with current practices by evaluating the identified differences between current vendors. In this paper we rely on Self Organizing...... Map, an unsupervised machine learning algorithm, for generating clusters that capture the similarities between malware behavior. A data set of approximately 270,000 samples was used to generate the behavioral profile of malicious types in order to compare the outcome of the proposed clustering...... approach with the labels collected from 57 Antivirus vendors using VirusTotal. Upon evaluating the results, the paper concludes on shortcomings of relying on AV vendors for labeling malware samples. In order to solve the problem, a cluster-based classification is proposed, which should provide more...

  16. Marketing Mix Formulation for Higher Education: An Integrated Analysis Employing Analytic Hierarchy Process, Cluster Analysis and Correspondence Analysis (United States)

    Ho, Hsuan-Fu; Hung, Chia-Chi


    Purpose: The purpose of this paper is to examine how a graduate institute at National Chiayi University (NCYU), by using a model that integrates analytic hierarchy process, cluster analysis and correspondence analysis, can develop effective marketing strategies. Design/methodology/approach: This is primarily a quantitative study aimed at…

  17. Influence of birth cohort on age of onset cluster analysis in bipolar I disorder

    DEFF Research Database (Denmark)

    Bauer, M; Glenn, T; Alda, M


    Purpose: Two common approaches to identify subgroups of patients with bipolar disorder are clustering methodology (mixture analysis) based on the age of onset, and a birth cohort analysis. This study investigates if a birth cohort effect will influence the results of clustering on the age of onset...... cohort. Model-based clustering (mixture analysis) was then performed on the age of onset data using the residuals. Clinical variables in subgroups were compared. Results: There was a strong birth cohort effect. Without adjusting for the birth cohort, three subgroups were found by clustering. After...... on the age of onset, and that there is a birth cohort effect. Including the birth cohort adjustment altered the number and characteristics of subgroups detected when clustering by age of onset. Further investigation is needed to determine if combining both approaches will identify subgroups that are more...

  18. Parkinson's Disease Subtypes Identified from Cluster Analysis of Motor and Non-motor Symptoms. (United States)

    Mu, Jesse; Chaudhuri, Kallol R; Bielza, Concha; de Pedro-Cuesta, Jesus; Larrañaga, Pedro; Martinez-Martin, Pablo


    Parkinson's disease is now considered a complex, multi-peptide, central, and peripheral nervous system disorder with considerable clinical heterogeneity. Non-motor symptoms play a key role in the trajectory of Parkinson's disease, from prodromal premotor to end stages. To understand the clinical heterogeneity of Parkinson's disease, this study used cluster analysis to search for subtypes from a large, multi-center, international, and well-characterized cohort of Parkinson's disease patients across all motor stages, using a combination of cardinal motor features (bradykinesia, rigidity, tremor, axial signs) and, for the first time, specific validated rater-based non-motor symptom scales. Two independent international cohort studies were used: (a) the validation study of the Non-Motor Symptoms Scale (n = 411) and (b) baseline data from the global Non-Motor International Longitudinal Study (n = 540). k-means cluster analyses were performed on the non-motor and motor domains (domains clustering) and the 30 individual non-motor symptoms alone (symptoms clustering), and hierarchical agglomerative clustering was performed to group symptoms together. Four clusters are identified from the domains clustering supporting previous studies: mild, non-motor dominant, motor-dominant, and severe. In addition, six new smaller clusters are identified from the symptoms clustering, each characterized by clinically-relevant non-motor symptoms. The clusters identified in this study present statistical confirmation of the increasingly important role of non-motor symptoms (NMS) in Parkinson's disease heterogeneity and take steps toward subtype-specific treatment packages.

  19. iterClust: a statistical framework for iterative clustering analysis. (United States)

    Ding, Hongxu; Wang, Wanxin; Califano, Andrea


    In a scenario where populations A, B1 and B2 (subpopulations of B) exist, pronounced differences between A and B may mask subtle differences between B1 and B2. Here we present iterClust, an iterative clustering framework, which can separate more pronounced differences (e.g. A and B) in starting iterations, followed by relatively subtle differences (e.g. B1 and B2), providing a comprehensive clustering trajectory. iterClust is implemented as a Bioconductor R package. Supplementary information is available at Bioinformatics online.
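
    The idea of separating pronounced differences first and subtle ones later can be illustrated with a toy sketch. This is not iterClust's API or algorithm: it assumes one-dimensional data, splits each cluster at its largest gap, and uses hypothetical `min_gap` and `min_size` stopping rules chosen only for illustration.

```python
def split_at_largest_gap(values):
    """Split sorted values at the widest gap; returns (gap, left part, right part)."""
    ordered = sorted(values)
    gaps = [(ordered[i + 1] - ordered[i], i) for i in range(len(ordered) - 1)]
    gap, i = max(gaps)
    return gap, ordered[: i + 1], ordered[i + 1:]

def iterative_clusters(values, min_gap, min_size=2):
    """Return the clustering after each round; round 0 is the undivided data."""
    rounds = [[sorted(values)]]
    while True:
        next_round, changed = [], False
        for cluster in rounds[-1]:
            if len(cluster) >= 2 * min_size:
                gap, left, right = split_at_largest_gap(cluster)
                if gap >= min_gap and len(left) >= min_size and len(right) >= min_size:
                    next_round += [left, right]
                    changed = True
                    continue
            next_round.append(cluster)  # too small or too homogeneous to split
        if not changed:
            return rounds
        rounds.append(next_round)

# Hypothetical populations: A near 0, B1 near 20, B2 near 26.
data = [0, 1, 2, 20, 21, 22, 26, 27, 28]
trajectory = iterative_clusters(data, min_gap=3)
for number, clusters in enumerate(trajectory):
    print(number, clusters)
```

    In this toy run the first round separates A from B and the second round separates B1 from B2, mirroring the clustering trajectory described in the abstract.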

  20. Dynamic analysis of clustered building structures using substructures methods

    International Nuclear Information System (INIS)

    Leimbach, K.R.; Krutzik, N.J.


    The dynamic substructure approach to the building cluster on a common base mat starts with the generation of Ritz-vectors for each building on a rigid foundation. The base mat plus the foundation soil is subjected to kinematic constraint modes, for example constant, linear, quadratic or cubic constraints. These constraint modes are also imposed on the buildings. By enforcing kinematic compatibility of the complete structural system on the basis of the constraint modes a reduced Ritz model of the complete cluster is obtained. This reduced model can now be analyzed by modal time history or response spectrum methods

  1. Applying Clustering to Statistical Analysis of Student Reasoning about Two-Dimensional Kinematics (United States)

    Springuel, R. Padraic; Wittman, Michael C.; Thompson, John R.


    We use clustering, an analysis method not presently common to the physics education research community, to group and characterize student responses to written questions about two-dimensional kinematics. Previously, clustering has been used to analyze multiple-choice data; we analyze free-response data that includes both sketches of vectors and…

  2. Differences Between Ward's and UPGMA Methods of Cluster Analysis: Implications for School Psychology. (United States)

    Hale, Robert L.; Dougherty, Donna


    Compared the efficacy of two methods of cluster analysis, the unweighted pair-groups method using arithmetic averages (UPGMA) and Ward's method, for students grouped on intelligence, achievement, and social adjustment by both clustering methods. Found UPGMA more efficacious based on output, on cophenetic correlation coefficients generated by each…

  3. The use of a cluster analysis in across herd genetic evaluation for ...

    African Journals Online (AJOL)

    To investigate the possibility of a genotype x environment interaction in Bonsmara cattle, a cluster analysis was performed on weaning weight records of 72 811 Bonsmara calves, the progeny of 1 434 sires and 24 186 dams in 35 herds. The following environmental factors were used to classify herds into clusters: solution ...

  4. The reflection of hierarchical cluster analysis of co-occurrence matrices in SPSS

    NARCIS (Netherlands)

    Zhou, Q.; Leng, F.; Leydesdorff, L.


    Purpose: To discuss the problems arising from hierarchical cluster analysis of co-occurrence matrices in SPSS, and the corresponding solutions. Design/methodology/approach: We design different methods of using the SPSS hierarchical clustering module for co-occurrence matrices in order to compare

  5. Identifying At-Risk Students in General Chemistry via Cluster Analysis of Affective Characteristics (United States)

    Chan, Julia Y. K.; Bauer, Christopher F.


    The purpose of this study is to identify academically at-risk students in first-semester general chemistry using affective characteristics via cluster analysis. Through the clustering of six preselected affective variables, three distinct affective groups were identified: low (at-risk), medium, and high. Students in the low affective group…

  6. Social Learning Network Analysis Model to Identify Learning Patterns Using Ontology Clustering Techniques and Meaningful Learning (United States)

    Firdausiah Mansur, Andi Besse; Yusof, Norazah


    Clustering on Social Learning Networks is still not widely explored, especially when the network focuses on an e-learning system. Conventional methods are not really suitable for e-learning data. SNA requires content analysis, which involves human intervention and needs to be carried out manually. Some of the previous clustering techniques need…

  7. Development of a versatile enrichment analysis tool reveals associations between the maternal brain and mental health disorders, including autism (United States)


    postpartum brain may provide a novel and promising platform for understanding the complex genetics of improved sociability that may have direct relevance for multiple psychiatric illnesses. This study also provides an important new tool that fills a critical analysis gap and makes evaluation of enrichment using any database of interest possible with an emphasis on ease of use and methodological transparency. PMID:24245670

  8. Phylogenetic analysis of TCE-dechlorinating consortia enriched on a variety of electron donors. (United States)

    Freeborn, Ryan A; West, Kimberlee A; Bhupathiraju, Vishvesh K; Chauhan, Sadhana; Rahm, Brian G; Richardson, Ruth E; Alvarez-Cohen, Lisa


    Two rapidly fermented electron donors, lactate and methanol, and two slowly fermented electron donors, propionate and butyrate, were selected for enrichment studies to evaluate the characteristics of anaerobic microbial consortia that reductively dechlorinate TCE to ethene. Each electron donor enrichment subculture demonstrated the ability to dechlorinate TCE to ethene through several serial transfers. Microbial community analyses based upon 16S rDNA, including terminal restriction fragment length polymorphism (T-RFLP) and clone library/sequencing, were performed to assess major changes in microbial community structure associated with electron donors capable of stimulating reductive dechlorination. Results demonstrated that five phylogenic subgroups or genera of bacteria were present in all consortia, including Dehalococcoides sp., low G+C Gram-positives (mostly Clostridium and Eubacterium sp.), Bacteroides sp., Citrobacter sp., and delta Proteobacteria (mostly Desulfovibrio sp.). Phylogenetic association indicates that only minor shifts in the microbial community structure occurred between the four alternate electron donor enrichments and the parent consortium. Inconsistent detection of Dehalococcoides spp. in clone libraries and T-RFLP of enrichment subcultures was resolved using quantitative polymerase chain reaction (Q-PCR). Q-PCR with primers specific to Dehalococcoides 16S rDNA resulted in positive detection of this species in all enrichments. Our results suggest that TCE-dechlorinating consortia can be stably maintained on a variety of electron donors and that quantities of Dehalococcoides cells detected with Dehalococcoides specific 16S rDNA primer/probe sets do not necessarily correlate well with solvent degradation rates.

  9. Work-family enrichment, work-family conflict, and marital satisfaction: a dyadic analysis. (United States)

    van Steenbergen, Elianne F; Kluwer, Esther S; Karney, Benjamin R


    This study was designed to examine whether spouses' work-to-family (WF) enrichment experiences account for their own and their partner's marital satisfaction, beyond the effects of WF conflict. Data were collected from both partners of 215 dual-earner couples with children. As hypothesized, structural equation modeling revealed that WF enrichment experiences accounted for variance in individuals' marital satisfaction, over and above WF conflict. In line with our predictions, this positive link between individuals' WF enrichment and their marital satisfaction was mediated by more positive marital behavior, and more positive perceptions of the partner's behavior. Furthermore, evidence for crossover was found. Husbands who experienced more WF enrichment were found to show more marital positivity (according to their wives), which related to increased marital satisfaction in their wives. No evidence of such a crossover effect from wives to husbands was found. The current findings not only highlight the added value of studying positive spillover and crossover effects of work into the marriage, but also suggest that positive spillover and crossover effects on marital satisfaction might be stronger than negative spillover and crossover are. These results imply that organizational initiatives of increasing job enrichment may make employees' marital life happier and can contribute to a happy, healthy, and high-performing workforce.

  10. Symptom Cluster Research With Biomarkers and Genetics Using Latent Class Analysis. (United States)

    Conley, Samantha


    The purpose of this article is to provide an overview of latent class analysis (LCA) and examples from symptom cluster research that includes biomarkers and genetics. A review of LCA with genetics and biomarkers was conducted using Medline, Embase, PubMed, and Google Scholar. LCA is a robust latent variable model used to cluster categorical data and allows for the determination of empirically determined symptom clusters. Researchers should consider using LCA to link empirically determined symptom clusters to biomarkers and genetics to better understand the underlying etiology of symptom clusters. The full potential of LCA in symptom cluster research has not yet been realized because it has been used in limited populations, and researchers have explored limited biologic pathways.

  11. Molecular characterization and transcriptional analysis of the female-enriched chondroitin proteoglycan 2 of Toxocara canis. (United States)

    Ma, G X; Zhou, R Q; Hu, L; Luo, Y L; Luo, Y F; Zhu, H H


    Toxocara canis is an important but neglected zoonotic parasite, and is the causative agent of human toxocariasis. Chondroitin proteoglycans are biological macromolecules, widely distributed in extracellular matrices, with a great diversity of functions in mammals. However, there is limited information regarding chondroitin proteoglycans in nematode parasites. In the present study, a female-enriched chondroitin proteoglycan 2 gene of T. canis (Tc-cpg-2) was cloned and characterized. Quantitative real-time polymerase chain reaction (qRT-PCR) was employed to measure the transcription levels of Tc-cpg-2 among tissues of male and female adult worms. A 485-amino-acid (aa) polypeptide was predicted from a continuous 1458-nucleotide open reading frame and designated as TcCPG2, which contains a 21-aa signal peptide. Conserved domain searching indicated three chitin-binding peritrophin-A (CBM_14) domains in the amino acid sequence of TcCPG2. Multiple alignment with the inferred amino acid sequences of Caenorhabditis elegans and Ascaris suum showed that CBM_14 domains were well conserved among these species. Phylogenetic analysis suggested that TcCPG2 was closely related to the sequence of chondroitin proteoglycan 2 of A. suum. Interestingly, a high level of Tc-cpg-2 was detected in female germline tissues, particularly in the oviduct, suggesting potential roles of this gene in reproduction (e.g. oogenesis and embryogenesis) of adult T. canis. The functional roles of Tc-cpg-2 in reproduction and development in this parasite and related parasitic nematodes warrant further functional studies.

  12. Clusters of Insomnia Disorder: An Exploratory Cluster Analysis of Objective Sleep Parameters Reveals Differences in Neurocognitive Functioning, Quantitative EEG, and Heart Rate Variability. (United States)

    Miller, Christopher B; Bartlett, Delwyn J; Mullins, Anna E; Dodds, Kirsty L; Gordon, Christopher J; Kyle, Simon D; Kim, Jong Won; D'Rozario, Angela L; Lee, Rico S C; Comas, Maria; Marshall, Nathaniel S; Yee, Brendon J; Espie, Colin A; Grunstein, Ronald R


    To empirically derive and evaluate potential clusters of Insomnia Disorder through cluster analysis from polysomnography (PSG). We hypothesized that clusters would differ on neurocognitive performance, sleep-onset measures of quantitative (q)-EEG and heart rate variability (HRV). Research volunteers with Insomnia Disorder (DSM-5) completed a neurocognitive assessment and overnight PSG measures of total sleep time (TST), wake time after sleep onset (WASO), and sleep onset latency (SOL) were used to determine clusters. From 96 volunteers with Insomnia Disorder, cluster analysis derived at least two clusters from objective sleep parameters: Insomnia with normal objective sleep duration (I-NSD: n = 53) and Insomnia with short sleep duration (I-SSD: n = 43). At sleep onset, differences in HRV between I-NSD and I-SSD clusters suggest attenuated parasympathetic activity in I-SSD (P insomnia clusters derived from cluster analysis differ in sleep onset HRV. Preliminary data suggest evidence for three clusters in insomnia with differences for sustained attention and sleep-onset q-EEG. Insomnia 100 sleep study: Australia New Zealand Clinical Trials Registry (ANZCTR) identification number 12612000049875. © 2016 Associated Professional Sleep Societies, LLC.

  13. The identification of credit card encoders by hierarchical cluster analysis of the jitters of magnetic stripes. (United States)

    Leung, S C; Fung, W K; Wong, K H


    The relative bit density variation graphs of 207 specimen credit cards processed by 12 encoding machines were examined first visually, and then classified by means of hierarchical cluster analysis. Twenty-nine credit cards being treated as 'questioned' samples were tested by way of cluster analysis against 'controls' derived from known encoders. It was found that hierarchical cluster analysis provided a high accuracy of identification with all 29 'questioned' samples classified correctly. On the other hand, although visual comparison of jitter graphs was less discriminating, it was nevertheless capable of giving a reasonably accurate result.

  14. Clustering Analysis for Credit Default Probabilities in a Retail Bank Portfolio

    Directory of Open Access Journals (Sweden)



    Methods underlying cluster analysis are very useful in data analysis, especially when the volume of data to be processed is so large that essential information cannot be extracted unless specific instruments are used to summarize and structure the raw information. In this context, cluster analysis techniques are used in particular for systematic information analysis. The aim of this article is to build a useful model for the banking field, based on data mining techniques, by dividing the borrowers into clusters in order to obtain a profile of the customers (debtors and good payers). We assume that a class is appropriate if it contains members that have a high degree of similarity, and the standard method for measuring the similarity within a group is to look for the lowest variance. After clustering, data mining techniques are applied to the cluster with bad debtors, reaching a very high accuracy. The paper is structured as follows: Section 2 describes the model for data analysis based on a specific scoring model that we proposed. In Section 3, we present a cluster analysis using the K-means algorithm, and the DM models are applied to a specific cluster. Section 4 presents the conclusions.
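
    The K-means step and the "lowest within-group variance" criterion mentioned above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' scoring model: the borrower features (a debt ratio and a count of late payments), the data values, and the choice of the first k points as initial centers are all hypothetical.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def centroid(cluster):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(cluster)
    return tuple(sum(coord) / n for coord in zip(*cluster))

def kmeans(points, k, max_iter=100):
    """Plain Lloyd's algorithm. Assumption: the first k points seed the centers."""
    centers = list(points[:k])
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sq_dist(p, centers[j]))
            clusters[nearest].append(p)
        # Keep the old center if a cluster happens to be empty.
        new_centers = [centroid(c) if c else centers[j]
                       for j, c in enumerate(clusters)]
        if new_centers == centers:
            break
        centers = new_centers
    return centers, clusters

def within_ss(cluster, center):
    """Within-cluster sum of squares: the quantity K-means tries to keep low."""
    return sum(sq_dist(p, center) for p in cluster)

# Hypothetical borrowers: (debt ratio, number of late payments).
good = [(0.10, 0), (0.20, 1), (0.15, 0)]
bad = [(0.80, 5), (0.90, 6), (0.85, 7)]
centers, clusters = kmeans(good + bad, k=2)
for center, cluster in zip(centers, clusters):
    print(center, len(cluster), round(within_ss(cluster, center), 3))
```

    Production implementations normally use randomized or k-means++ seeding and rescale the features first; the fixed seeding here only keeps the sketch deterministic.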

  15. Profiling physical activity motivation based on self-determination theory: a cluster analysis approach. (United States)

    Friederichs, Stijn Ah; Bolman, Catherine; Oenema, Anke; Lechner, Lilian


    In order to promote physical activity uptake and maintenance in individuals who do not comply with physical activity guidelines, it is important to increase our understanding of physical activity motivation among this group. The present study aimed to examine motivational profiles in a large sample of adults who do not comply with physical activity guidelines. The sample for this study consisted of 2473 individuals (31.4% male; age 44.6 ± 12.9). In order to generate motivational profiles based on motivational regulation, a cluster analysis was conducted. One-way analyses of variance were then used to compare the clusters in terms of demographics, physical activity level, motivation to be active and subjective experience while being active. Three motivational clusters were derived based on motivational regulation scores: a low motivation cluster, a controlled motivation cluster and an autonomous motivation cluster. These clusters differed significantly from each other with respect to physical activity behavior, motivation to be active and subjective experience while being active. Overall, the autonomous motivation cluster displayed more favorable characteristics compared to the other two clusters. The results of this study provide additional support for the importance of autonomous motivation in the context of physical activity behavior. The three derived clusters may be relevant in the context of physical activity interventions as individuals within the different clusters might benefit most from different intervention approaches. In addition, this study shows that cluster analysis is a useful method for differentiating between motivational profiles in large groups of individuals who do not comply with physical activity guidelines.

  16. A Dimensionality Reduction-Based Multi-Step Clustering Method for Robust Vessel Trajectory Analysis

    Directory of Open Access Journals (Sweden)

    Huanhuan Li


    The Shipboard Automatic Identification System (AIS) is crucial for navigation safety and maritime surveillance; data mining and pattern analysis of AIS information have attracted considerable attention in terms of both basic research and practical applications. Clustering of spatio-temporal AIS trajectories can be used to identify abnormal patterns and mine customary route data for transportation safety. Thus, the capacities of navigation safety and maritime traffic monitoring could be enhanced correspondingly. However, trajectory clustering is often sensitive to undesirable outliers and is essentially more complex compared with traditional point clustering. To overcome this limitation, a multi-step trajectory clustering method is proposed in this paper for robust AIS trajectory clustering. In particular, Dynamic Time Warping (DTW), a similarity measurement method, is introduced in the first step to measure the distances between different trajectories. The calculated distances, inversely proportional to the similarities, constitute a distance matrix in the second step. Furthermore, as a widely-used dimensional reduction method, Principal Component Analysis (PCA) is exploited to decompose the obtained distance matrix. In particular, the top k principal components with above 95% accumulative contribution rate are extracted by PCA, and the number of the centers k is chosen. The k centers are found by the improved automatic center selection algorithm. In the last step, the improved center clustering algorithm with k clusters is implemented on the distance matrix to achieve the final AIS trajectory clustering results. In order to improve the accuracy of the proposed multi-step clustering algorithm, an automatic algorithm for choosing the k clusters is developed according to the similarity distance. Numerous experiments on realistic AIS trajectory datasets in the bridge area waterway and Mississippi River have been implemented to compare our

  17. Deconstructing Bipolar Disorder and Schizophrenia: A cross-diagnostic cluster analysis of cognitive phenotypes. (United States)

    Lee, Junghee; Rizzo, Shemra; Altshuler, Lori; Glahn, David C; Miklowitz, David J; Sugar, Catherine A; Wynn, Jonathan K; Green, Michael F


    Bipolar disorder (BD) and schizophrenia (SZ) show substantial overlap. It has been suggested that a subgroup of patients might contribute to these overlapping features. This study employed a cross-diagnostic cluster analysis to identify subgroups of individuals with shared cognitive phenotypes. 143 participants (68 BD patients, 39 SZ patients and 36 healthy controls) completed a battery of EEG and performance assessments on perception, nonsocial cognition and social cognition. A K-means cluster analysis was conducted with all participants across diagnostic groups. Clinical symptoms, functional capacity, and functional outcome were assessed in patients. A two-cluster solution across 3 groups was the most stable. One cluster including 44 BD patients, 31 controls and 5 SZ patients showed better cognition (High cluster) than the other cluster with 24 BD patients, 35 SZ patients and 5 controls (Low cluster). BD patients in the High cluster performed better than BD patients in the Low cluster across cognitive domains. Within each cluster, participants with different clinical diagnoses showed different profiles across cognitive domains. All patients were in the chronic phase and out of mood episode at the time of assessment, and most of the assessments were behavioral measures. This study identified two clusters with shared cognitive phenotype profiles that were not proxies for clinical diagnoses. The finding of better social cognitive performance of BD patients than SZ patients in the Low cluster suggests that relatively preserved social cognition may be important to identify disease process distinct to each disorder. Copyright © 2016 Elsevier B.V. All rights reserved.

  18. A Dimensionality Reduction-Based Multi-Step Clustering Method for Robust Vessel Trajectory Analysis. (United States)

    Li, Huanhuan; Liu, Jingxian; Liu, Ryan Wen; Xiong, Naixue; Wu, Kefeng; Kim, Tai-Hoon


    The Shipboard Automatic Identification System (AIS) is crucial for navigation safety and maritime surveillance; data mining and pattern analysis of AIS information have attracted considerable attention in terms of both basic research and practical applications. Clustering of spatio-temporal AIS trajectories can be used to identify abnormal patterns and mine customary route data for transportation safety. Thus, the capacities of navigation safety and maritime traffic monitoring could be enhanced correspondingly. However, trajectory clustering is often sensitive to undesirable outliers and is essentially more complex compared with traditional point clustering. To overcome this limitation, a multi-step trajectory clustering method is proposed in this paper for robust AIS trajectory clustering. In particular, Dynamic Time Warping (DTW), a similarity measurement method, is introduced in the first step to measure the distances between different trajectories. The calculated distances, inversely proportional to the similarities, constitute a distance matrix in the second step. Furthermore, as a widely-used dimensional reduction method, Principal Component Analysis (PCA) is exploited to decompose the obtained distance matrix. In particular, the top k principal components with above 95% accumulative contribution rate are extracted by PCA, and the number of the centers k is chosen. The k centers are found by the improved automatic center selection algorithm. In the last step, the improved center clustering algorithm with k clusters is implemented on the distance matrix to achieve the final AIS trajectory clustering results. In order to improve the accuracy of the proposed multi-step clustering algorithm, an automatic algorithm for choosing the k clusters is developed according to the similarity distance. Numerous experiments on realistic AIS trajectory datasets in the bridge area waterway and Mississippi River have been implemented to compare our proposed method with
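
    The first two steps described above (pairwise DTW distances assembled into a distance matrix) and the 95% cumulative-contribution rule for choosing k can be sketched as follows. This is a minimal illustration, not the paper's implementation: the toy trajectories and the list of component variances are hypothetical, the PCA decomposition itself is omitted, and the improved center selection and clustering algorithms are not reproduced.

```python
import math

def dtw_distance(a, b):
    """Classic dynamic time warping between two trajectories of 2-D points."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def distance_matrix(trajectories):
    """Symmetric matrix of pairwise DTW distances."""
    size = len(trajectories)
    matrix = [[0.0] * size for _ in range(size)]
    for i in range(size):
        for j in range(i + 1, size):
            matrix[i][j] = matrix[j][i] = dtw_distance(trajectories[i], trajectories[j])
    return matrix

def components_for(variances, threshold=0.95):
    """Smallest k whose largest k variances reach the cumulative threshold."""
    total = sum(variances)
    running = 0.0
    for k, v in enumerate(sorted(variances, reverse=True), start=1):
        running += v
        if running / total >= threshold:
            return k
    return len(variances)

# Hypothetical toy trajectories (x, y); real AIS tracks would be far longer.
track_a = [(0, 0), (1, 0), (2, 0)]
track_b = [(0, 1), (1, 1), (2, 1)]
track_c = [(0, 0), (0, 0), (1, 0), (2, 0)]  # track_a with a repeated first fix
dist = distance_matrix([track_a, track_b, track_c])

# Hypothetical component variances, as a PCA of the distance matrix might return.
k = components_for([6.0, 3.0, 0.6, 0.4])
```

    DTW tolerates the repeated point in track_c, so its distance to track_a is zero even though the two tracks have different lengths; this is the property that makes it suitable for trajectories sampled at uneven rates.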

  19. Molecular-dynamics analysis of mobile helium cluster reactions near surfaces of plasma-exposed tungsten

    Energy Technology Data Exchange (ETDEWEB)

    Hu, Lin; Maroudas, Dimitrios [Department of Chemical Engineering, University of Massachusetts, Amherst, Massachusetts 01003-9303 (United States)]; Hammond, Karl D. [Department of Chemical Engineering, University of Missouri, Columbia, Missouri 65211 (United States)]; Wirth, Brian D. [Department of Nuclear Engineering, University of Tennessee, Knoxville, Tennessee 37996 (United States)]


    We report the results of a systematic atomic-scale analysis of the reactions of small mobile helium clusters (Heₙ, 4 ≤ n ≤ 7) near low-Miller-index tungsten (W) surfaces, aiming at a fundamental understanding of the near-surface dynamics of helium-carrying species in plasma-exposed tungsten. These small mobile helium clusters are attracted to the surface and migrate to the surface by Fickian diffusion and drift due to the thermodynamic driving force for surface segregation. As the clusters migrate toward the surface, trap mutation (TM) and cluster dissociation reactions are activated at rates higher than in the bulk. TM produces W adatoms and immobile complexes of helium clusters surrounding W vacancies located within the lattice planes at a short distance from the surface. These reactions are identified and characterized in detail based on the analysis of a large number of molecular-dynamics trajectories for each such mobile cluster near W(100), W(110), and W(111) surfaces. TM is found to be the dominant cluster reaction for all cluster and surface combinations, except for the He₄ and He₅ clusters near W(100) where cluster partial dissociation following TM dominates. We find that there exists a critical cluster size, n = 4 near W(100) and W(111) and n = 5 near W(110), beyond which the formation of multiple W adatoms and vacancies in the TM reactions is observed. The identified cluster reactions are responsible for important structural, morphological, and compositional features in the plasma-exposed tungsten, including surface adatom populations, near-surface immobile helium-vacancy complexes, and retained helium content, which are expected to influence the amount of hydrogen recycling and tritium retention in fusion tokamaks.

  20. Bacterial community analysis in chlorpyrifos enrichment cultures via DGGE and use of bacterial consortium for CP biodegradation. (United States)

    Akbar, Shamsa; Sultan, Sikander; Kertesz, Michael


    The organophosphate pesticide chlorpyrifos (CP) has been used extensively since the 1960s for insect control. However, its toxic effects on mammals and persistence in the environment necessitate its removal from contaminated sites; biodegradation studies of CP-degrading microbes are therefore of immense importance. Samples from a Pakistani agricultural soil with an extensive history of CP application were used to prepare enrichment cultures using CP as sole carbon source for bacterial community analysis and isolation of CP metabolizing bacteria. Bacterial community analysis (denaturing gradient gel electrophoresis) revealed that the dominant genera enriched under these conditions were Pseudomonas, Acinetobacter and Stenotrophomonas, along with lower numbers of Sphingomonas, Agrobacterium and Burkholderia. Furthermore, it revealed that members of Bacteroidetes, Firmicutes, α- and γ-Proteobacteria and Actinobacteria were present at initial steps of enrichment whereas β-Proteobacteria appeared in later steps and only Proteobacteria were selected by enrichment culturing. However, when CP-degrading strains were isolated from this enrichment culture, the most active organisms were strains of Acinetobacter calcoaceticus, Pseudomonas mendocina and Pseudomonas aeruginosa. These strains degraded 6-7.4 mg L(-1) day(-1) of CP when cultivated in mineral medium, while the consortium of all four strains degraded 9.2 mg L(-1) day(-1) of CP (100 mg L(-1)). Addition of glucose as an additional C source increased the degradation capacity by 8-14 %. After inoculation of contaminated soil with CP (200 mg kg(-1)) disappearance rates were 3.83-4.30 mg kg(-1) day(-1) for individual strains and 4.76 mg kg(-1) day(-1) for the consortium. These results indicate that these organisms are involved in the degradation of CP in soil and represent valuable candidates for in situ bioremediation of contaminated soils and waters.

  1. Is Omega-3 Fatty Acids Enriched Nutrition Support Safe for Critical Ill Patients? A Systematic Review and Meta-Analysis

    Directory of Open Access Journals (Sweden)

    Wei Chen


    Objective: To systematically review the effects of omega-3 polyunsaturated fatty acid (FA) enriched nutrition support on the mortality of critically ill patients. Methods: The databases Medline, ISI, Cochrane Library, and Chinese Biomedicine Database were searched and randomized controlled trials (RCTs) were identified. We enrolled RCTs that compared fish oil enriched nutrition support with standard nutrition support. The major outcome was mortality. Methodological quality assessment was conducted based on the modified Jadad score scale. To control for heterogeneity, we developed a method that integrated the I² test, subgroup analysis by nutritional support route, and severity of clinical condition. RevMan 5.0 software (The Nordic Cochrane Centre, Copenhagen, Denmark) was used for meta-analysis. Results: Twelve trials involving 1208 patients met all the inclusion criteria. Heterogeneity existed between the trials. A random-effects model was used; there was no significant effect on mortality (RR 0.82, 95% confidence interval (CI) 0.62 to 1.09, p = 0.18). Knowing that the route of fish oil administration may affect heterogeneity, we categorized the trials into two sub-groups: parenteral administration (PN) of omega-3 and enteral administration (EN) of omega-3. Six trials administered omega-3 FA through PN. Pooled results indicated that omega-3 FA had no significant effect on mortality (RR 0.76, 95% CI 0.52 to 1.10, p = 0.15). Six trials used omega-3 fatty acid enriched EN. After excluding one trial that was identified as a source of heterogeneity, pooled data indicated that omega-3 FA enriched EN significantly reduced mortality (RR 0.69, 95% CI 0.53 to 0.91, p = 0.007). Conclusion: Omega-3 FA enriched nutrition support is safe. Due to the limited sample size of the included trials, further large-scale RCTs are needed.
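
    As a consistency check on the pooled estimates quoted above, a two-sided p-value can be approximately recovered from a risk ratio and its 95% confidence interval. The sketch below assumes the interval is a Wald-type interval that is symmetric on the log scale, which is an assumption about how the intervals were produced, not something the abstract states; the function name is illustrative.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval

def z_and_p_from_ci(rr, lower, upper):
    """Recover the z statistic and two-sided p-value from RR and its 95% CI.

    Assumes the CI is symmetric around log(RR) (Wald-type interval).
    """
    se = (math.log(upper) - math.log(lower)) / (2 * Z_95)
    z = math.log(rr) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p

# The three pooled results as reported in the abstract.
reported = {
    "all trials": (0.82, 0.62, 1.09),
    "parenteral": (0.76, 0.52, 1.10),
    "enteral": (0.69, 0.53, 0.91),
}
for label, (rr, lo, hi) in reported.items():
    z, p = z_and_p_from_ci(rr, lo, hi)
    print(f"{label}: z = {z:.2f}, p = {p:.3f}")
```

    Because the published RR and CI limits are rounded to two decimals, the recovered p-values can only be expected to agree with the reported ones approximately.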

  2. Crowd Analysis by Using Optical Flow and Density Based Clustering

    DEFF Research Database (Denmark)

    Santoro, Francesco; Pedro, Sergio; Tan, Zheng-Hua


    In this paper, we present a system to detect and track crowds in a video sequence captured by a camera. In a first step, we compute optical flows by means of pyramidal Lucas-Kanade feature tracking. Afterwards, a density based clustering is used to group similar vectors. In the last step...

  3. Enrichment of tumor cells for cell kinetic analysis in human tumor biopsies using cytokeratin gating

    International Nuclear Information System (INIS)

    Haustermans, K.; Hofland, I.; Ramaekers, M.; Ivanyi, D.; Balm, A.J.M.; Geboes, K.; Lerut, T.; Schueren, E. van der; Begg, A.C.


    Purpose: To determine the feasibility of using cytokeratin antibodies to distinguish normal and malignant cells in human tumors using flow cytometry. The goal was ultimately to increase the accuracy of cell kinetic measurements on human tumor biopsies. Material and methods: A panel of four antibodies was screened on a series of 48 tumors from two centres; 22 head and neck tumors (Amsterdam) and 26 esophagus carcinomas (Leuven). First, screening was carried out by immunohistochemistry on frozen sections to test intensity of staining and the fraction of cytokeratin-positive tumor cells. The antibody showing the most positive staining was then used for flow cytometry on the same tumor. Results: The two broadest spectrum antibodies (AE1/AE3, E3/C4) showed overall the best results with immunohistochemical staining, being positive in over 95% of tumors. Good cell suspensions for DNA flow cytometry could be made from frozen material by a mechanical method, whereas enzymatic methods with trypsin or collagenase were judged failures in almost all cases. From fresh material, both collagenase and trypsin produced good suspensions for flow cytometry, although the fraction of tumor cells, judged by the proportion of aneuploid cells, was markedly higher for trypsin. Using the best cytokeratin antibody for each tumor, two-parameter flow cytometry was done (cytokeratin versus DNA content). Enrichment of tumor cells was then tested by measuring the fraction of aneuploid cells (the presumed malignant population) of cytokeratin-positive cells versus all cells. An enrichment factor ranging between 0 (no enrichment) and 1 (perfect enrichment, tumor cells only) was then calculated. The average enrichment was 0.60 for head and neck tumors and 0.59 for esophagus tumors. Conclusions: We conclude that this method can substantially enrich the proportion of tumor cells in biopsies from carcinomas. Application of this method could significantly enhance accuracy of tumor cell kinetic measurements.

  4. Isotope enrichment

    International Nuclear Information System (INIS)

    Lydtin, H-J.; Wilden, R.J.; Severin, P.J.W.


    The isotope enrichment method described is based on the recognition that, owing to mass diffusion and thermal diffusion in the conversion of substances at a heated substrate while depositing an element or compound onto the substrate, enrichment of the element, or a compound of the element, with a lighter isotope will occur. The cycle is repeated as many times as is necessary to obtain the degree of enrichment required.

  5. Weighted Clustering

    DEFF Research Database (Denmark)

    Ackerman, Margareta; Ben-David, Shai; Branzei, Simina


    We investigate a natural generalization of the classical clustering problem, considering clustering tasks in which different instances may have different weights. We conduct the first extensive theoretical analysis on the influence of weighted data on standard clustering algorithms in both...... the partitional and hierarchical settings, characterizing the conditions under which algorithms react to weights. Extending a recent framework for clustering algorithm selection, we propose intuitive properties that would allow users to choose between clustering algorithms in the weighted setting and classify...

  6. An isotopic analysis system for plutonium samples enriched in 238Pu

    International Nuclear Information System (INIS)

    Ruhter, W.D.; Camp, D.C.


    We have designed and built a gamma-ray spectrometer system that measures the relative plutonium isotopic abundances of plutonium oxide enriched in 238Pu. The first system installed at Westinghouse Savannah River Company was tested and evaluated on plutonium oxide in stainless steel EP60/61 containers. 238Pu enrichments ranged from 20% to 85%. Results show that 200 grams of plutonium oxide in an EP60/61 container can be measured with ±0.3% precision and better than ±1.0% accuracy in the specific power using a counting time of 50 minutes. 3 refs., 2 figs

  7. Analysis of copper-nickel ores by gamma-gamma method in ore enrichment works

    International Nuclear Information System (INIS)

    Bol'shakov, A.Yu.; Tovstenko, Yu.G.; Chinskij, E.B.; Eliseev, G.I.


    The paper presents experimental data on continuous gamma-gamma assay of copper-nickel ores on conveyor belts and of dry discrete samples of classifier overflow at the enrichment plants of the Pechenganikel' group. The relative errors are given of the results of comparison of two-hour rapid analyses and shift and 24-hour chemical analyses of classifier overflow samples with the figures for gamma-gamma assay. The factors affecting the accuracy of the latter technique are elucidated. Practical recommendations are given on the use of this technique at the above enrichment plants. (author)

  8. Performance comparison analysis library communication cluster system using merge sort (United States)

    Wulandari, D. A. R.; Ramadhan, M. E.


    Computing began with a single processor; to reduce computing time, the use of multiple processors was introduced. This second paradigm is known as parallel computing, of which a cluster is one example. A cluster needs a communication protocol for processing, one of which is the Message Passing Interface (MPI). MPI has many libraries, two of which are OpenMPI and MPICH2. The performance of a cluster machine depends on the match between the performance characteristics of the communication library and the characteristics of the problem, so this study aims to compare the performance of these libraries in handling parallel computing processes. The case studies in this research are MPICH2 and OpenMPI. The study runs a sorting problem, using the merge sort method, to assess the performance of the cluster system. The research method is to implement OpenMPI and MPICH2 on a Linux-based cluster of five virtual computers and then analyze the performance of the system under different test scenarios using three parameters: execution time, speedup and efficiency. The results showed that with each increase in data size the average speedup and efficiency of OpenMPI and MPICH2 tend to increase, but they decrease at large data sizes. Increased data size does not necessarily increase speedup and efficiency, only execution time, for example at a data size of 100000. OpenMPI has a greater execution time than MPICH2; for example, at a data size of 1000 the average execution time is 0.009721 with MPICH2 and 0.003895 with OpenMPI. OpenMPI can customize communication needs.
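
    The speedup and efficiency parameters named above are conventionally defined as the serial run time divided by the parallel run time, and that ratio divided by the number of processors. The sketch below uses those textbook definitions; the abstract does not spell out its formulas, and the timings shown are hypothetical, not measurements from the study.

```python
def speedup(t_serial, t_parallel):
    """How many times faster the parallel run is than the serial run."""
    return t_serial / t_parallel

def efficiency(t_serial, t_parallel, n_procs):
    """Speedup per processor; 1.0 would mean perfect linear scaling."""
    return speedup(t_serial, t_parallel) / n_procs

# Hypothetical timings in seconds for a sort on a five-node cluster.
t_one_node = 10.0
t_five_nodes = 2.5
print(speedup(t_one_node, t_five_nodes))        # 4.0
print(efficiency(t_one_node, t_five_nodes, 5))  # 0.8
```

    An efficiency below 1.0, as in this example, reflects communication and coordination overhead, which is the cost that differs between MPI libraries.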


    Energy Technology Data Exchange (ETDEWEB)

    Johnson, Christian I.; Caldwell, Nelson [Harvard–Smithsonian Center for Astrophysics, 60 Garden Street, MS-15, Cambridge, MA 02138 (United States)]; Rich, R. Michael [Department of Physics and Astronomy, UCLA, 430 Portola Plaza, Box 951547, Los Angeles, CA 90095-1547 (United States)]; Pilachowski, Catherine A. [Astronomy Department, Indiana University Bloomington, Swain West 319, 727 East 3rd Street, Bloomington, IN 47405-7105 (United States)]; Mateo, Mario; Bailey, John I. III [Department of Astronomy, University of Michigan, Ann Arbor, MI 48109 (United States)]; Crane, Jeffrey D. [The Observatories of the Carnegie Institution for Science, Pasadena, CA 91101 (United States)]


    A combined effort utilizing spectroscopy and photometry has revealed the existence of a new globular cluster class. These “anomalous” clusters, which we refer to as “iron-complex” clusters, are differentiated from normal clusters by exhibiting large (≳0.10 dex) intrinsic metallicity dispersions, complex sub-giant branches, and correlated [Fe/H] and s-process enhancements. In order to further investigate this phenomenon, we have measured radial velocities and chemical abundances for red giant branch stars in the massive, but scarcely studied, globular cluster NGC 6273. The velocities and abundances were determined using high resolution (R ∼ 27,000) spectra obtained with the Michigan/Magellan Fiber System (M2FS) and MSpec spectrograph on the Magellan–Clay 6.5 m telescope at Las Campanas Observatory. We find that NGC 6273 has an average heliocentric radial velocity of +144.49 km s⁻¹ (σ = 9.64 km s⁻¹) and an extended metallicity distribution ([Fe/H] = −1.80 to −1.30) composed of at least two distinct stellar populations. Although the two dominant populations have similar [Na/Fe], [Al/Fe], and [α/Fe] abundance patterns, the more metal-rich stars exhibit significant [La/Fe] enhancements. The [La/Eu] data indicate that the increase in [La/Fe] is due to almost pure s-process enrichment. A third more metal-rich population with low [X/Fe] ratios may also be present. Therefore, NGC 6273 joins clusters such as ω Centauri, M2, M22, and NGC 5286 as a new class of iron-complex clusters exhibiting complicated star formation histories.

  10. Phenotypes of asthma in low-income children and adolescents: cluster analysis

    Directory of Open Access Journals (Sweden)

    Anna Lucia Barros Cabral

    Full Text Available ABSTRACT Objective: Studies characterizing asthma phenotypes have predominantly included adults or have involved children and adolescents in developed countries. Therefore, their applicability in other populations, such as those of developing countries, remains indeterminate. Our objective was to determine how low-income children and adolescents with asthma in Brazil are distributed across a cluster analysis. Methods: We included 306 children and adolescents (6-18 years of age) with a clinical diagnosis of asthma and under medical treatment for at least one year of follow-up. At enrollment, all the patients were clinically stable. For the cluster analysis, we selected 20 variables commonly measured in clinical practice and considered important in defining asthma phenotypes. Variables with high multicollinearity were excluded. A cluster analysis was applied using a two-step agglomerative test and log-likelihood distance measure. Results: Three clusters were defined for our population. Cluster 1 (n = 94) included subjects with normal pulmonary function, mild eosinophil inflammation, few exacerbations, later age at asthma onset, and mild atopy. Cluster 2 (n = 87) included those with normal pulmonary function, a moderate number of exacerbations, early age at asthma onset, more severe eosinophil inflammation, and moderate atopy. Cluster 3 (n = 108) included those with poor pulmonary function, frequent exacerbations, severe eosinophil inflammation, and severe atopy. Conclusions: Asthma was characterized by the presence of atopy, number of exacerbations, and lung function in low-income children and adolescents in Brazil. The many similarities with previous cluster analyses of phenotypes indicate that this approach shows good generalizability.

  11. Genome cluster database. A sequence family analysis platform for Arabidopsis and rice. (United States)

    Horan, Kevin; Lauricha, Josh; Bailey-Serres, Julia; Raikhel, Natasha; Girke, Thomas


    The genome-wide protein sequences from Arabidopsis (Arabidopsis thaliana) and rice (Oryza sativa) spp. japonica were clustered into families using sequence similarity and domain-based clustering. The two fundamentally different methods resulted in separate cluster sets with complementary properties that compensate for each other's limitations in accurate family analysis. Functional names for the identified families were assigned with an efficient computational approach that uses the description of the most common molecular function gene ontology node within each cluster. Subsequently, multiple alignments and phylogenetic trees were calculated for the assembled families. All clustering results and their underlying sequences were organized in the Web-accessible Genome Cluster Database, with rich interactive and user-friendly sequence family mining tools to facilitate the analysis of any given family of interest for the plant science community. An automated clustering pipeline ensures current information for future updates in the annotations of the two genomes and clustering improvements. The analysis allowed the first systematic identification of family and singlet proteins present in both organisms as well as those restricted to one of them. In addition, the established Web resources for mining these data provide a road map for future studies of the composition and structure of protein families between the two species.

  12. Uranium enrichment

    International Nuclear Information System (INIS)


    This report looks at the following issues: How much Soviet uranium ore and enriched uranium are imported into the United States and what is the extent to which utilities flag swap to disguise these purchases? What are the U.S.S.R.'s enriched uranium trading practices? To what extent are utilities required to return used fuel to the Soviet Union as part of the enriched uranium sales agreement? Why have U.S. utilities ended their contracts to buy enrichment services from DOE?

  13. Immune-related genetic enrichment in frontotemporal dementia: An analysis of genome-wide association studies.

    Directory of Open Access Journals (Sweden)

    Iris Broce


    Full Text Available Converging evidence suggests that immune-mediated dysfunction plays an important role in the pathogenesis of frontotemporal dementia (FTD). Although genetic studies have shown that immune-associated loci are associated with increased FTD risk, a systematic investigation of genetic overlap between immune-mediated diseases and the spectrum of FTD-related disorders has not been performed. Using large genome-wide association studies (GWASs) (total n = 192,886 cases and controls) and recently developed tools to quantify genetic overlap/pleiotropy, we systematically identified single nucleotide polymorphisms (SNPs) jointly associated with FTD-related disorders, namely FTD, corticobasal degeneration (CBD), progressive supranuclear palsy (PSP), and amyotrophic lateral sclerosis (ALS), and 1 or more immune-mediated diseases including Crohn disease, ulcerative colitis (UC), rheumatoid arthritis (RA), type 1 diabetes (T1D), celiac disease (CeD), and psoriasis. We found up to 270-fold genetic enrichment between FTD and RA, up to 160-fold genetic enrichment between FTD and UC, up to 180-fold genetic enrichment between FTD and T1D, and up to 175-fold genetic enrichment between FTD and CeD. In contrast, for CBD and PSP, only 1 of the 6 immune-mediated diseases produced genetic enrichment comparable to that seen for FTD, with up to 150-fold genetic enrichment between CBD and CeD and up to 180-fold enrichment between PSP and RA. Further, we found minimal enrichment between ALS and the immune-mediated diseases tested, with the highest levels of enrichment between ALS and RA (up to 20-fold). For FTD, at a conjunction false discovery rate < 0.05 and after excluding SNPs in linkage disequilibrium, we found that 8 of the 15 identified loci mapped to the human leukocyte antigen (HLA) region on Chromosome (Chr) 6. We also found novel candidate FTD susceptibility loci within LRRK2 (leucine rich repeat kinase 2), TBKBP1 (TBK1 binding protein 1), and PGBD5 (piggyBac transposable element

  14. Isotope Enrichment Detection by Laser Ablation - Laser Absorption Spectrometry: Automated Environmental Sampling and Laser-Based Analysis for HEU Detection

    International Nuclear Information System (INIS)

    Anheier, Norman C.; Bushaw, Bruce A.


    The global expansion of nuclear power, and consequently the uranium enrichment industry, requires the development of new safeguards technology to mitigate proliferation risks. Current enrichment monitoring instruments exist that provide only yes/no detection of highly enriched uranium (HEU) production. More accurate accountancy measurements are typically restricted to gamma-ray and weight measurements taken in cylinder storage yards. Analysis of environmental and cylinder content samples has much higher effectiveness, but this approach requires onsite sampling, shipping, and time-consuming laboratory analysis and reporting. Given that large modern gaseous centrifuge enrichment plants (GCEPs) can quickly produce a significant quantity (SQ) of HEU, these limitations in verification suggest the need for more timely detection of potential facility misuse. The Pacific Northwest National Laboratory (PNNL) is developing an unattended safeguards instrument concept, combining continuous aerosol particulate collection with uranium isotope assay, to provide timely analysis of enrichment levels within low enriched uranium facilities. This approach is based on laser vaporization of aerosol particulate samples, followed by wavelength tuned laser diode spectroscopy to characterize the uranium isotopic ratio through subtle differences in atomic absorption wavelengths. Environmental sampling (ES) media from an integrated aerosol collector is introduced into a small, reduced pressure chamber, where a focused pulsed laser vaporizes material from a 10 to 20 µm diameter spot of the surface of the sampling media. The plume of ejected material begins as high-temperature plasma that yields ions and atoms, as well as molecules and molecular ions. We concentrate on the plume of atomic vapor that remains after the plasma has expanded and then cooled by the surrounding cover gas. Tunable diode lasers are directed through this plume and each isotope is detected by monitoring absorbance

  15. twzPEA: A Topology and Working Zone Based Pathway Enrichment Analysis Framework (United States)

    Sensitive detection of involvement and adaptation of key signaling, regulatory, and metabolic pathways holds the key to deciphering molecular mechanisms such as those in the biomass-to-biofuel conversion process in yeast. Typical gene set enrichment analyses often do not use topology information in...

  16. Genome editing using FACS enrichment of nuclease-expressing cells and indel detection by amplicon analysis

    DEFF Research Database (Denmark)

    Lonowski, Lindsey A; Narimatsu, Yoshiki; Riaz, Anjum


    , FACS enrichment of cells expressing nucleases linked to fluorescent proteins can be used to maximize knockout or knock-in editing efficiencies or to balance editing efficiency and toxic/off-target effects. The two methods can be combined to form a pipeline for cell-line editing that facilitates...

  17. Patterns of Brucellosis Infection Symptoms in Azerbaijan: A Latent Class Cluster Analysis

    Directory of Open Access Journals (Sweden)

    Rita Ismayilova


    Full Text Available Brucellosis infection is a multisystem disease, with a broad spectrum of symptoms. We investigated the existence of clusters of infected patients according to their clinical presentation. Using national surveillance data from the Electronic-Integrated Disease Surveillance System, we applied a latent class cluster (LCC) analysis on symptoms to determine clusters of brucellosis cases. A total of 454 cases reported between July 2011 and July 2013 were analyzed. LCC identified a two-cluster model and the Vuong-Lo-Mendell-Rubin likelihood ratio supported the cluster model. Brucellosis cases in the second cluster (19%) reported higher percentages of poly-lymphadenopathy, hepatomegaly, arthritis, myositis, and neuritis and changes in liver function tests compared to cases of the first cluster. Patients in the second cluster had a severe brucellosis disease course and were associated with longer delay in seeking medical attention. Moreover, most of them were from Beylagan, a region focused on sheep and goat livestock production in south-central Azerbaijan. Patients in cluster 2 accounted for one-quarter of brucellosis cases and had a more severe clinical presentation. Delay in seeking medical care may explain severe illness. Future work needs to determine the factors that influence brucellosis case seeking and identify brucellosis species, particularly among cases from Beylagan.

  18. Clusters of Insomnia Disorder: An Exploratory Cluster Analysis of Objective Sleep Parameters Reveals Differences in Neurocognitive Functioning, Quantitative EEG, and Heart Rate Variability (United States)

    Miller, Christopher B.; Bartlett, Delwyn J.; Mullins, Anna E.; Dodds, Kirsty L.; Gordon, Christopher J.; Kyle, Simon D.; Kim, Jong Won; D'Rozario, Angela L.; Lee, Rico S.C.; Comas, Maria; Marshall, Nathaniel S.; Yee, Brendon J.; Espie, Colin A.; Grunstein, Ronald R.


    Study Objectives: To empirically derive and evaluate potential clusters of Insomnia Disorder through cluster analysis from polysomnography (PSG). We hypothesized that clusters would differ on neurocognitive performance, sleep-onset measures of quantitative (q)-EEG and heart rate variability (HRV). Methods: Research volunteers with Insomnia Disorder (DSM-5) completed a neurocognitive assessment and overnight PSG measures of total sleep time (TST), wake time after sleep onset (WASO), and sleep onset latency (SOL) were used to determine clusters. Results: From 96 volunteers with Insomnia Disorder, cluster analysis derived at least two clusters from objective sleep parameters: Insomnia with normal objective sleep duration (I-NSD: n = 53) and Insomnia with short sleep duration (I-SSD: n = 43). At sleep onset, differences in HRV between I-NSD and I-SSD clusters suggest attenuated parasympathetic activity in I-SSD (P insomnia clusters derived from cluster analysis differ in sleep onset HRV. Preliminary data suggest evidence for three clusters in insomnia with differences for sustained attention and sleep-onset q-EEG. Clinical Trial Registration: Insomnia 100 sleep study: Australia New Zealand Clinical Trials Registry (ANZCTR) identification number 12612000049875. URL: Citation: Miller CB, Bartlett DJ, Mullins AE, Dodds KL, Gordon CJ, Kyle SD, Kim JW, D'Rozario AL, Lee RS, Comas M, Marshall NS, Yee BJ, Espie CA, Grunstein RR. Clusters of Insomnia Disorder: an exploratory cluster analysis of objective sleep parameters reveals differences in neurocognitive functioning, quantitative EEG, and heart rate variability. SLEEP 2016;39(11):1993–2004. PMID:27568796

  19. Integrating molecular QTL data into genome-wide genetic association analysis: Probabilistic assessment of enrichment and colocalization.

    Directory of Open Access Journals (Sweden)

    Xiaoquan Wen


    Full Text Available We propose a novel statistical framework for integrating the result from molecular quantitative trait loci (QTL) mapping into genome-wide genetic association analysis of complex traits, with the primary objectives of quantitatively assessing the enrichment of the molecular QTLs in complex trait-associated genetic variants and the colocalizations of the two types of association signals. We introduce a natural Bayesian hierarchical model that treats the latent association status of molecular QTLs as SNP-level annotations for candidate SNPs of complex traits. We detail a computational procedure to seamlessly perform enrichment, fine-mapping and colocalization analyses, which is a distinct feature compared to the existing colocalization analysis procedures in the literature. The proposed approach is computationally efficient and requires only summary-level statistics. We evaluate and demonstrate the proposed computational approach through extensive simulation studies and analyses of blood lipid data and the whole blood eQTL data from the GTEx project. In addition, a useful utility from our proposed method enables the computation of expected colocalization signals using simple characteristics of the association data. Using this utility, we further illustrate the importance of enrichment analysis on the ability to discover colocalized signals and the potential limitations of currently available molecular QTL data. The software pipeline that implements the proposed computation procedures, enloc, is freely available at

  20. Cluster analysis and ecology of living benthonic foraminiferids from inner shelf off Ratnagiri, West Coast, India

    Digital Repository Service at National Institute of Oceanography (India)

    Nigam, R.; Sarupria, J.S.

    Q-mode cluster analysis explains the spatial distribution data of living benthonic foraminiferids from the inner shelf off Ratnagiri. Two main biotopes and two sub-biotopes are recognised within the study area; biotope A, characterised by @i...

  1. Statistical Techniques Applied to Aerial Radiometric Surveys (STAARS): cluster analysis. National Uranium Resource Evaluation

    International Nuclear Information System (INIS)

    Pirkle, F.L.; Stablein, N.K.; Howell, J.A.; Wecksung, G.W.; Duran, B.S.


    One objective of the aerial radiometric surveys flown as part of the US Department of Energy's National Uranium Resource Evaluation (NURE) program was to ascertain the regional distribution of near-surface radioelement abundances. Some method for identifying groups of observations with similar radioelement values was therefore required. It is shown in this report that cluster analysis can identify such groups even when no a priori knowledge of the geology of an area exists. A method of convergent k-means cluster analysis coupled with a hierarchical cluster analysis is used to classify 6991 observations (three radiometric variables at each observation location) from the Precambrian rocks of the Copper Mountain, Wyoming, area. Another method, one that combines a principal components analysis with a convergent k-means analysis, is applied to the same data. These two methods are compared with a convergent k-means analysis that utilizes available geologic knowledge. All three methods identify four clusters. Three of the clusters represent background values for the Precambrian rocks of the area, and one represents outliers (anomalously high ²¹⁴Bi). A segmentation of the data corresponding to geologic reality as discovered by other methods has been achieved based solely on analysis of aerial radiometric data. The techniques employed are composites of classical clustering methods designed to handle the special problems presented by large data sets. 20 figures, 7 tables
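
    To make the k-means step concrete, the following is a minimal sketch of plain Lloyd's k-means run until assignments stop changing. It is not the report's procedure: the report couples k-means with hierarchical clustering or principal components analysis, neither of which is shown here. The `kmeans` function, the random initialisation, and the six three-variable observations are all illustrative assumptions.

```python
import math
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Plain Lloyd's k-means; stops when cluster assignments no longer change."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k of the observations
    labels = None
    for _ in range(max_iter):
        # Assignment step: each observation goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # converged: no observation changed cluster
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centroids


# Hypothetical observations (three radiometric readings per location); not survey data.
observations = [
    (1.0, 1.0, 1.0), (2.0, 1.0, 1.0), (1.0, 2.0, 1.0),           # background-like
    (10.0, 10.0, 10.0), (11.0, 10.0, 10.0), (10.0, 11.0, 10.0),  # anomalously high
]
labels, centroids = kmeans(observations, k=2)
print(labels)
print(centroids)
```

    On real survey data the variables would normally be standardised first, since k-means relies on Euclidean distance and is sensitive to differences in scale between variables.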

  2. Analysis of candidates for interacting galaxy clusters. I. A1204 and A2029/A2033 (United States)

    Gonzalez, Elizabeth Johana; de los Rios, Martín; Oio, Gabriel A.; Lang, Daniel Hernández; Tagliaferro, Tania Aguirre; Domínguez R., Mariano J.; Castellón, José Luis Nilo; Cuevas L., Héctor; Valotto, Carlos A.


    Context. Merging galaxy clusters allow for the study of different mass components, dark and baryonic, separately. Also, their occurrence makes it possible to test the ΛCDM scenario, which can be used to put constraints on the self-interacting cross-section of the dark-matter particle. Aim. It is necessary to perform a homogeneous analysis of these systems. Hence, based on a recently presented sample of candidates for interacting galaxy clusters, we present the analysis of two of these cataloged systems. Methods. In this work, the first of a series devoted to characterizing galaxy clusters in merger processes, we perform a weak lensing analysis of clusters A1204 and A2029/A2033 to derive the total masses of each identified interacting structure together with a dynamical study based on a two-body model. We also describe the gas and the mass distributions in the field through a lensing and an X-ray analysis. This is the first of a series of works that will analyze this type of system in order to characterize them. Results. Neither merging cluster candidate shows evidence of having had a recent merger event. Nevertheless, there is dynamical evidence that these systems could be interacting or could interact in the future. Conclusions. It is necessary to include more constraints in order to improve the methodology of classifying merging galaxy clusters. Characterization of these clusters is important in order to properly understand the nature of these systems and their connection with dynamical studies.

  3. Subtypes of autism by cluster analysis based on structural MRI data. (United States)

    Hrdlicka, Michal; Dudova, Iva; Beranova, Irena; Lisy, Jiri; Belsan, Tomas; Neuwirth, Jiri; Komarek, Vladimir; Faladova, Ludvika; Havlovicova, Marketa; Sedlacek, Zdenek; Blatny, Marek; Urbanek, Tomas


    The aim of our study was to subcategorize Autistic Spectrum Disorders (ASD) using a multidisciplinary approach. Sixty-four autistic patients (mean age 9.4 ± 5.6 years) were entered into a cluster analysis. The clustering analysis was based on MRI data. The clusters obtained did not differ significantly in the overall severity of autistic symptomatology as measured by the total score on the Childhood Autism Rating Scale (CARS). The clusters could be characterized as showing significant differences. Cluster 1 showed the largest sizes of the genu and splenium of the corpus callosum (CC), the lowest pregnancy order and the lowest frequency of facial dysmorphic features. Cluster 2 showed the largest sizes of the amygdala and hippocampus (HPC), the least abnormal visual response on the CARS, the lowest frequency of epilepsy and the least frequent abnormal psychomotor development during the first year of life. Cluster 3 showed the largest sizes of the caput of the nucleus caudatus (NC) and the smallest sizes of the HPC, and facial dysmorphic features were always present. Cluster 4 showed the smallest sizes of the genu and splenium of the CC, as well as of the amygdala and caput of the NC, the most abnormal visual response on the CARS, the highest frequency of epilepsy, and the highest pregnancy order; abnormal psychomotor development during the first year of life and facial dysmorphic features were always present. This multidisciplinary approach seems to be a promising method for subtyping autism.

  4. Schedulability Analysis and Optimization for the Synthesis of Multi-Cluster Distributed Embedded Systems

    DEFF Research Database (Denmark)

    Pop, Paul; Eles, Petru; Peng, Zebo


    We present an approach to schedulability analysis for the synthesis of multi-cluster distributed embedded systems consisting of time-triggered and event-triggered clusters, interconnected via gateways. We have also proposed a buffer size and worst case queuing delay analysis for the gateways......, responsible for routing inter-cluster traffic. Optimization heuristics for the priority assignment and synthesis of bus access parameters aimed at producing a schedulable system with minimal buffer needs have been proposed. Extensive experiments and a real-life example show the efficiency of our approaches....

  5. Schedulability Analysis and Optimization for the Synthesis of Multi-Cluster Distributed Embedded Systems

    DEFF Research Database (Denmark)

    Pop, Paul; Eles, Petru; Peng, Zebo


    An approach to schedulability analysis for the synthesis of multi-cluster distributed embedded systems consisting of time-triggered and event-triggered clusters, interconnected via gateways, is presented. A buffer size and worst case queuing delay analysis for the gateways, responsible for routing...... inter-cluster traffic, is also proposed. Optimisation heuristics for the priority assignment and synthesis of bus access parameters aimed at producing a schedulable system with minimal buffer needs have been proposed. Extensive experiments and a real-life example show the efficiency of the approaches....

  6. Clustering applications in financial and economic analysis of the crop production in the Russian regions

    Directory of Open Access Journals (Sweden)

    Gromov Vladislav Vladimirovich


    Full Text Available We used complex mathematical modeling, multivariate statistical analysis, and fuzzy sets to analyze the financial and economic state of crop production in Russian regions. We developed a system of indicators reflecting the state of the agricultural sector in a region, based on the results of correlation, factor, and cluster analysis and on statistics from the Federal State Statistics Service. We performed cluster analyses to divide the regions of Russia into five groups according to selected factors. Qualitative and quantitative characteristics of each cluster were obtained.

  7. Publication Bias Currently Makes an Accurate Estimate of the Benefits of Enrichment Programs Difficult: A Postmortem of Two Meta-Analyses Using Statistical Power Analysis (United States)

    Warne, Russell T.


    Recently Kim (2016) published a meta-analysis on the effects of enrichment programs for gifted students. She found that these programs produced substantial effects for academic achievement (g = 0.96) and socioemotional outcomes (g = 0.55). However, given current theory and empirical research these estimates of the benefits of enrichment programs…

  8. Analysis of simple sequence repeats in rice bean (Vigna umbellata) using an SSR-enriched library

    Directory of Open Access Journals (Sweden)

    Lixia Wang


    Full Text Available Rice bean (Vigna umbellata Thunb.), a warm-season annual legume, is grown in Asia mainly for dried grain or fodder and plays an important role in human and animal nutrition because the grains are rich in protein and some essential fatty acids and minerals. With the aim of expediting the genetic improvement of rice bean, we initiated a project to develop genomic resources and tools for molecular breeding in this little-known but important crop. Here we report the construction of an SSR-enriched genomic library from DNA extracted from pooled young leaf tissues of 22 rice bean genotypes and the development of SSR markers. In 433,562 reads generated by a Roche 454 GS-FLX sequencer, we identified 261,458 SSRs, of which 48.8% were of compound form. Dinucleotide repeats were predominant with an absolute proportion of 81.6%, followed by trinucleotides (17.8%). Other types together accounted for 0.6%. The motif AC/GT accounted for 77.7% of the total, followed by AAG/CTT (14.3%), and all others accounted for 12.0%. Among the flanking sequences, 2928 matched putative genes or gene models in the protein database of Arabidopsis thaliana, corresponding with 608 non-redundant Gene Ontology terms. Of these sequences, 11.2% were involved in cellular components, 24.2% were involved in molecular functions, and 64.6% were associated with biological processes. Based on homolog analysis, 1595 flanking sequences were similar to mung bean and 500 to common bean genomic sequences. Comparative mapping was conducted using 350 sequences homologous to both mung bean and common bean sequences. Finally, a set of primer pairs were designed, and a validation test showed that 58 of 220 new primers can be used in rice bean and 53 can be transferred to mung bean. However, only 11 were polymorphic when tested on 32 rice bean varieties. We propose that this study lays the groundwork for developing novel SSR markers and will enhance the mapping of qualitative and quantitative traits and marker

  9. Uranium enrichment

    International Nuclear Information System (INIS)


    GAO was asked to address several questions concerning a number of proposed uranium enrichment bills introduced during the 100th Congress. The bills would have restructured the Department of Energy's uranium enrichment program as a government corporation to allow it to compete more effectively in the domestic and international markets. Some of GAO's findings discussed are: uranium market experts believe, and existing market models show, that the proposed DOE purchase of $750 million of uranium from domestic producers may not significantly increase production because of large producer-held inventories; excess uranium enrichment production capacity exists throughout the world, so foreign producers are expected to compete heavily in the United States throughout the 1990s as utilities' contracts with DOE expire; and according to a 1988 agreement between DOE's Offices of Nuclear Energy and Defense Programs, enrichment decommissioning costs, estimated to total $3.6 billion for planning purposes, will be shared by the commercial enrichment program and the government

  10. FLOCK cluster analysis of mast cell event clustering by high-sensitivity flow cytometry predicts systemic mastocytosis. (United States)

    Dorfman, David M; LaPlante, Charlotte D; Pozdnyakova, Olga; Li, Betty


    In our high-sensitivity flow cytometric approach for systemic mastocytosis (SM), we identified mast cell event clustering as a new diagnostic criterion for the disease. To objectively characterize mast cell gated event distributions, we performed cluster analysis using FLOCK, a computational approach to identify cell subsets in multidimensional flow cytometry data in an unbiased, automated fashion. FLOCK identified discrete mast cell populations in most cases of SM (56/75 [75%]) but only a minority of non-SM cases (17/124 [14%]). FLOCK-identified mast cell populations accounted for 2.46% of total cells on average in SM cases and 0.09% of total cells on average in non-SM cases (P < .0001) and were predictive of SM, with a sensitivity of 75%, a specificity of 86%, a positive predictive value of 76%, and a negative predictive value of 85%. FLOCK analysis provides useful diagnostic information for evaluating patients with suspected SM, and may be useful for the analysis of other hematopoietic neoplasms. Copyright© by the American Society for Clinical Pathology.
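
    The reported sensitivity, specificity, and predictive values can be re-derived from the case counts given in the abstract. A minimal sketch, assuming that FLOCK identifying a discrete mast cell population counts as a positive test (56 of 75 SM cases, 17 of 124 non-SM cases), is shown below; the function name is an illustrative assumption.

```python
def diagnostic_metrics(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Standard 2x2 confusion-matrix metrics for a binary diagnostic test."""
    return {
        "sensitivity": tp / (tp + fn),  # positives found among diseased cases
        "specificity": tn / (tn + fp),  # negatives found among non-diseased cases
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }


# Counts taken from the abstract: 56/75 SM cases positive, 17/124 non-SM cases positive.
metrics = diagnostic_metrics(tp=56, fn=75 - 56, fp=17, tn=124 - 17)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

    The computed values should land close to the percentages quoted in the abstract; small differences in the last digit would come down to rounding.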

  11. RNA-Seq for enrichment and analysis of IRF5 transcript expression in SLE.

    Directory of Open Access Journals (Sweden)

    Rivka C Stone

    Full Text Available Polymorphisms in the interferon regulatory factor 5 (IRF5) gene have been consistently replicated and shown to confer risk for or protection from the development of systemic lupus erythematosus (SLE). IRF5 expression is significantly upregulated in SLE patients and upregulation associates with IRF5-SLE risk haplotypes. IRF5 alternative splicing has also been shown to be elevated in SLE patients. Given that human IRF5 exists as multiple alternatively spliced transcripts with distinct function(s), it is important to determine whether the IRF5 transcript profile expressed in healthy donor immune cells is different from that expressed in SLE patients. Moreover, it is not currently known whether an IRF5-SLE risk haplotype defines the profile of IRF5 transcripts expressed. Using standard molecular cloning techniques, we identified and isolated 14 new differentially spliced IRF5 transcript variants from purified monocytes of healthy donors and SLE patients to generate an IRF5 variant transcriptome. Next-generation sequencing was then used to perform in-depth and quantitative analysis of full-length IRF5 transcript expression in primary immune cells of SLE patients and healthy donors. Evidence for additional alternatively spliced transcripts was obtained from de novo junction discovery. Data from these studies support the overall complexity of IRF5 alternative splicing in SLE. Results from next-generation sequencing correlated with cloning and gave similar abundance rankings in SLE patients, thus supporting the use of this new technology for in-depth single gene transcript profiling. Results from this study provide the first proof that (1) SLE patients express an IRF5 transcript signature that is distinct from healthy donors, (2) an IRF5-SLE risk haplotype defines the top four most abundant IRF5 transcripts expressed in SLE patients, and (3) an IRF5 transcript signature enables clustering of SLE patients with the H2 risk haplotype.

  12. SU-E-J-98: Radiogenomics: Correspondence Between Imaging and Genetic Features Based On Clustering Analysis

    International Nuclear Information System (INIS)

    Harmon, S; Wendelberger, B; Jeraj, R


    Purpose: Radiogenomics aims to establish relationships between patient genotypes and imaging phenotypes. An open question remains on how best to integrate information from these distinct datasets. This work investigates if similarities in genetic features across patients correspond to similarities in PET-imaging features, assessed with various clustering algorithms. Methods: [18F]FDG PET data were obtained for 26 NSCLC patients from a public database (TCIA). Tumors were contoured using an in-house segmentation algorithm combining gradient and region-growing techniques; resulting ROIs were used to extract 54 PET-based features. Corresponding genetic microarray data containing 48,778 elements were also obtained for each tumor. Given the mismatch in feature sizes, two dimension reduction techniques were also applied to the genetic data: principal component analysis (PCA) and selective filtering of 25 NSCLC-associated genes-of-interest (GOI). Gene datasets (full, PCA, and GOI) and PET feature datasets were independently clustered using K-means and hierarchical clustering with a variable number of clusters (K). Jaccard Index (JI) was used to score similarity of cluster assignments across different datasets. Results: Patient clusters from imaging data showed poor similarity to clusters from gene datasets, regardless of clustering algorithm or number of clusters (mean JI = 0.3429 ± 0.1623). Notably, we found clustering algorithms had different sensitivities to data reduction techniques. Using hierarchical clustering, the PCA dataset showed perfect cluster agreement with the full-gene set (JI = 1) for all values of K, and the agreement between the GOI set and the full-gene set decreased as the number of clusters increased (JI = 0.9231 and 0.5769 for K = 2 and 5, respectively). K-means clustering assignments were highly sensitive to data reduction and showed poor stability for different values of K (JI range: 0.2301–1). Conclusion: Using commonly-used clustering algorithms, we found
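
    The Jaccard Index scoring of cluster assignments can be sketched as follows. This is a minimal illustration assuming one common pair-counting formulation (pairs of patients placed in the same cluster by both datasets, divided by pairs placed together by at least one); the abstract does not say which variant the authors used, and the function names and toy labels are hypothetical.

```python
from itertools import combinations


def co_clustered_pairs(labels):
    """Set of index pairs (i, j) that share a cluster label."""
    return {
        (i, j)
        for i, j in combinations(range(len(labels)), 2)
        if labels[i] == labels[j]
    }


def jaccard_index(labels_a, labels_b):
    """Pair-counting Jaccard Index between two cluster assignments."""
    pairs_a = co_clustered_pairs(labels_a)
    pairs_b = co_clustered_pairs(labels_b)
    union = pairs_a | pairs_b
    if not union:  # every sample is a singleton in both assignments
        return 1.0
    return len(pairs_a & pairs_b) / len(union)


# Toy assignments for six patients (hypothetical data, not from the study).
imaging = [0, 0, 1, 1, 2, 2]
genes_same = [1, 1, 2, 2, 0, 0]   # same grouping, different label names
genes_diff = [0, 1, 0, 1, 0, 1]   # no pair grouped together in both

print(jaccard_index(imaging, genes_same))
print(jaccard_index(imaging, genes_diff))
```

    Because the score is built from co-clustered pairs, it does not depend on how the clusters happen to be numbered, which matters when the two assignments come from independent clustering runs.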

  13. SU-E-J-98: Radiogenomics: Correspondence Between Imaging and Genetic Features Based On Clustering Analysis

    Energy Technology Data Exchange (ETDEWEB)

    Harmon, S; Wendelberger, B [University of Wisconsin-Madison, Madison, WI (United States); Jeraj, R [University of Wisconsin-Madison, Madison, WI (United States); University of Ljubljana (Slovenia)


    Purpose: Radiogenomics aims to establish relationships between patient genotypes and imaging phenotypes. An open question remains on how best to integrate information from these distinct datasets. This work investigates if similarities in genetic features across patients correspond to similarities in PET-imaging features, assessed with various clustering algorithms. Methods: [18F]FDG PET data were obtained for 26 NSCLC patients from a public database (TCIA). Tumors were contoured using an in-house segmentation algorithm combining gradient and region-growing techniques; resulting ROIs were used to extract 54 PET-based features. Corresponding genetic microarray data containing 48,778 elements were also obtained for each tumor. Given the mismatch in feature sizes, two dimension reduction techniques were also applied to the genetic data: principal component analysis (PCA) and selective filtering of 25 NSCLC-associated genes-of-interest (GOI). Gene datasets (full, PCA, and GOI) and PET feature datasets were independently clustered using K-means and hierarchical clustering with a variable number of clusters (K). Jaccard Index (JI) was used to score similarity of cluster assignments across different datasets. Results: Patient clusters from imaging data showed poor similarity to clusters from gene datasets, regardless of clustering algorithm or number of clusters (mean JI = 0.3429 ± 0.1623). Notably, we found clustering algorithms had different sensitivities to data reduction techniques. Using hierarchical clustering, the PCA dataset showed perfect cluster agreement with the full-gene set (JI = 1) for all values of K, and the agreement between the GOI set and the full-gene set decreased as the number of clusters increased (JI = 0.9231 and 0.5769 for K = 2 and 5, respectively). K-means clustering assignments were highly sensitive to data reduction and showed poor stability for different values of K (JI range: 0.2301–1). Conclusion: Using commonly-used clustering algorithms

  14. Fusion-fission hybrid design with analysis of direct enrichment and non-proliferation features (the SOLASE-H study)

    International Nuclear Information System (INIS)

    Conn, R.W.; Abdel-Khalik, S.I.; Moses, G.A.; Kulcinski, G.L.; Larsen, E.; Maynard, C.W.; Magheb, M.M.H.; Sviatolslavsky, I.N.; Vogelsang, W.F.; Wolfer, W.G.


    The role of a fusion-fission hybrid in the context of a nuclear economy with and without reprocessing is examined. An inertial confinement fusion driver is assumed and a consistent set of reactor parameters is developed. The form of the driver is not critical, however, to the general concepts. The use of the hybrid as a fuel factory within a secured fuel production and reprocessing center is considered. Either the hybrid or a low power fission reactor can be used to mildly irradiate fuel prior to shipment to offsite reactors, thereby rendering the fuel resistant to diversion. A simplified economic analysis indicates a hybrid providing fuel to 10 fission reactors of equal thermal power is insensitive to the recirculating power fraction provided reprocessing is permitted. If reprocessing is not allowed, the hybrid can be used to directly enrich light water reactor fuel bundles fabricated initially from fertile fuel (either ThO2 or 238UO2). A detailed neutronic analysis indicates such direct enrichment is feasible but the support ratio for 233U or 239Pu production is only 2, making such an approach highly sensitive to the hybrid cost. The hybrid would have to produce considerable net power for economic feasibility in this case. Inertial confinement fusion performance requirements for hybrid application are also examined and an integrated design, SOLASE-H, is described based upon the direct enrichment concept. (orig.)

  15. Integrating Data Clustering and Visualization for the Analysis of 3D Gene Expression Data

    Energy Technology Data Exchange (ETDEWEB)

    Institute for Data Analysis and Visualization (IDAV) and the Department of Computer Science, University of California, Davis, One Shields Avenue, Davis CA 95616, USA; International Research Training Group "Visualization of Large and Unstructured Data Sets," University of Kaiserslautern, Germany; Computational Research Division, Lawrence Berkeley National Laboratory, One Cyclotron Road, Berkeley, CA 94720, USA; Genomics Division, Lawrence Berkeley National Laboratory, One Cyclotron Road, Berkeley CA 94720, USA; Life Sciences Division, Lawrence Berkeley National Laboratory, One Cyclotron Road, Berkeley CA 94720, USA; Computer Science Division, University of California, Berkeley, CA, USA; Computer Science Department, University of California, Irvine, CA, USA; All authors are with the Berkeley Drosophila Transcription Network Project, Lawrence Berkeley National Laboratory; Rubel, Oliver; Weber, Gunther H.; Huang, Min-Yu; Bethel, E. Wes; Biggin, Mark D.; Fowlkes, Charless C.; Hendriks, Cris L. Luengo; Keranen, Soile V. E.; Eisen, Michael B.; Knowles, David W.; Malik, Jitendra; Hagen, Hans; Hamann, Bernd


    The recent development of methods for extracting precise measurements of spatial gene expression patterns from three-dimensional (3D) image data opens the way for new analyses of the complex gene regulatory networks controlling animal development. We present an integrated visualization and analysis framework that supports user-guided data clustering to aid exploration of these new complex datasets. The interplay of data visualization and clustering-based data classification leads to improved visualization and enables a more detailed analysis than previously possible. We discuss (i) integration of data clustering and visualization into one framework; (ii) application of data clustering to 3D gene expression data; (iii) evaluation of the number of clusters k in the context of 3D gene expression clustering; and (iv) improvement of overall analysis quality via dedicated post-processing of clustering results based on visualization. We discuss the use of this framework to objectively define spatial pattern boundaries and temporal profiles of genes and to analyze how mRNA patterns are controlled by their regulatory transcription factors.

  16. Selective enrichment of sialic acid-containing glycopeptides using titanium dioxide chromatography with analysis by HILIC and mass spectrometry

    DEFF Research Database (Denmark)

    Palmisano, Giuseppe; Lendal, Sara Eun; Engholm-Keller, Kasper


    -containing glycopeptides is achieved by using a low-pH buffer that contains a substituted acid such as glycolic acid to improve the binding efficiency and selectivity of SA-containing glycopeptides to the TiO2 resin. By combining TiO2 enrichment of sialylated glycopeptides with HILIC separation of deglycosylated... of glycosylation sites and the characterization of glycan structures. In this paper, we describe a protocol for the selective enrichment of SA-containing glycopeptides using a combination of titanium dioxide (TiO2) and hydrophilic interaction liquid chromatography (HILIC). The selectivity of TiO2 toward SA... peptides, a more comprehensive analysis of formerly sialylated glycopeptides by MS can be achieved. Here we illustrate the efficiency of the method by the identification of 1,632 unique formerly sialylated glycopeptides from 817 sialylated glycoproteins. The TiO2/HILIC protocol requires 2 d...

  17. Analysis of enriched HF-UF6 systems. Influence by impurity and density upon the value of the multiplication

    International Nuclear Information System (INIS)

    Acosta, N.B.; Canavese, S.I.; Lopez, M.L.


    The purpose of this paper is to analyze the influence of impurities in hydrogen fluoride and of density variation (UF6-HF) upon the value of the effective multiplication factor (k-eff) in enriched uranium hexafluoride and hydrogen fluoride systems. The values of these multiplication factors were determined by means of the Monte Carlo code MONK V.II, which is specific to criticality problems. Diverse systems were considered by keeping the same geometry and varying the density value and the impurity percentages, while the assumptions made for each model were described on a case-by-case basis. Systems with and without an infinite water reflector were also evaluated. Finally, the influence of each parameter upon the effective multiplication factor in the postulated enriched UF6-HF systems is analyzed. (Author)

  18. Analysis of 235U enrichment by chemical exchange in U(IV) - U(VI) system on anionite

    International Nuclear Information System (INIS)

    Raica, Paula; Axente, Damian


    A theoretical study of 235U enrichment by the chemical exchange method in the U(IV)-U(VI) system on anion-exchange resins is presented. The 235U isotope concentration profiles along the band were numerically calculated using an accurate mathematical model, and simulations were carried out for the situation of product and waste withdrawal and feed supply. By means of numerical simulation, an estimate of the migration time necessary for a desired enrichment degree was obtained. The required migration distance, the annual production of uranium enriched to 3 at.% 235U and the plant configuration are calculated for different operating conditions. An analysis of the process scale for various experimental conditions is also presented. (authors)

  19. The dynamics of cyclone clustering in re-analysis and a high-resolution climate model (United States)

    Priestley, Matthew; Pinto, Joaquim; Dacre, Helen; Shaffrey, Len


    Extratropical cyclones have a tendency to occur in groups (clusters) in the exit of the North Atlantic storm track during wintertime, potentially leading to widespread socioeconomic impacts. The Winter of 2013/14 was the stormiest on record for the UK and was characterised by the recurrent clustering of intense extratropical cyclones. This clustering was associated with a strong, straight and persistent North Atlantic 250 hPa jet with Rossby wave-breaking (RWB) on both flanks, pinning the jet in place. Here, we provide for the first time an analysis of all clustered events in 36 years of the ERA-Interim Re-analysis at three latitudes (45˚ N, 55˚ N, 65˚ N) encompassing various regions of Western Europe. The relationship between the occurrence of RWB and cyclone clustering is studied in detail. Clustering at 55˚ N is associated with an extended and anomalously strong jet flanked on both sides by RWB. However, clustering at 65(45)˚ N is associated with RWB to the south (north) of the jet, deflecting the jet northwards (southwards). A positive correlation was found between the intensity of the clustering and RWB occurrence to the north and south of the jet. However, there is considerable spread in these relationships. Finally, analysis has shown that the relationships identified in the re-analysis are also present in a high-resolution coupled global climate model (HiGEM). In particular, clustering is associated with the same dynamical conditions at each of our three latitudes in spite of the identified biases in frequency and intensity of RWB.

  20. Analysis of organizational options for the uranium enrichment enterprise in relation to asset divesture

    International Nuclear Information System (INIS)

    Harrer, B.J.; Hattrup, M.P.; Dase, J.E.; Nicholls, A.K.


    This report presents a comparison of the characteristics of some prominent examples of independent government corporations and agencies with respect to the Department of Energy's (DOE) uranium enrichment enterprise. The six examples studied were: the Bonneville Power Administration (BPA); the Tennessee Valley Authority (TVA); the Synthetic Fuels Corporation (SYNFUELS); the Consolidated Rail Corporation (CONRAIL); the British Telecommunications Corporation (British TELECOM); and the Communications Satellite Organization (COMSAT), in order of decreasing levels of government ownership and control. They range from BPA, which is organized as an agency within DOE, to COMSAT, which is privately owned and free from almost all regulations common to government agencies. Differences in the degree of government involvement in these corporations and in many other characteristics serve to illustrate that there are no accepted standards for defining the characteristics of government corporations. Thus, historical precedent indicates considerable flexibility would be available in the development of enabling legislation to reorganize the enrichment enterprise as a government corporation or independent government agency

  1. Study of a system for tritium analysis in water by electrolytic enrichment and liquid scintillation

    International Nuclear Information System (INIS)

    Pane, L.


    A system for measuring low-level tritium concentrations in water samples has been studied experimentally. The samples are enriched by electrolysis in twenty cells connected in series, and counting is done in a liquid scintillation counter. Several parameters that could affect the accuracy of the results are analysed and the optimization of the system is discussed. For a sample volume reduction from 1000 to 15 ml, the tritium recovery during electrolysis is 63% and the enrichment factor is about 40. The lowest detection limit of the system is 1.0 ± 0.5 T.U. (tritium units). Its analytical capacity is 30 samples a month. The results obtained in determining the 3H concentration in a series of rain, surface and underground water samples can be considered satisfactory. (Author)
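
    The enrichment factor quoted in this abstract follows from the volume reduction and the tritium recovery. The sketch below assumes the usual definition, in which the enrichment factor is the ratio of final to initial tritium concentration, that is, the volume reduction factor multiplied by the fraction of tritium retained. The function name is hypothetical and the abstract does not spell out the formula the author used.

```python
def enrichment_factor(v_initial_ml, v_final_ml, recovery):
    """Volume reduction factor times the fraction of tritium retained."""
    return (v_initial_ml / v_final_ml) * recovery


# Values from the abstract: 1000 ml reduced to 15 ml with 63% tritium recovery.
factor = enrichment_factor(1000.0, 15.0, 0.63)
print(round(factor, 1))
```

    Under these assumptions the result is close to 42, consistent with the "about 40" stated in the abstract.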

  2. Positron Emission Tomography Particle tracking using cluster analysis

    International Nuclear Information System (INIS)

    Gundogdu, O.


    Positron Emission Particle Tracking was successfully used in a wide range of industrial applications. This technique primarily uses a single positron emitting tracer particle. However, using multiple particles would provide more comparative information about the physical processes taking place in a system such as mixing or fluidised beds. In this paper, a unique method that enables us to track more than one particle is presented. This method is based on the midpoint of the closest distance between two trajectories or coincidence vectors. The technique presented in this paper employs a clustering method.
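
    The geometric step named in this abstract, taking the midpoint of the closest distance between two coincidence vectors, can be sketched with the standard closest-approach formula for two lines in 3D. This is an illustration only: the function names and the example coordinates are hypothetical, each coincidence vector is treated as an infinite line given by a point and a direction, and the subsequent clustering of midpoints is not shown.

```python
def dot(u, v):
    """Dot product of two 3D vectors given as tuples."""
    return sum(a * b for a, b in zip(u, v))


def midpoint_of_closest_approach(p1, d1, p2, d2, eps=1e-12):
    """Midpoint of the shortest segment joining lines p1 + s*d1 and p2 + t*d2.

    Returns None when the lines are (nearly) parallel, because the
    closest points are then not unique.
    """
    w0 = tuple(a - b for a, b in zip(p1, p2))
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w0), dot(d2, w0)
    denom = a * c - b * b
    if denom < eps:
        return None
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    q1 = tuple(p + s * k for p, k in zip(p1, d1))  # closest point on line 1
    q2 = tuple(p + t * k for p, k in zip(p2, d2))  # closest point on line 2
    return tuple((u + v) / 2 for u, v in zip(q1, q2))


# Two skew lines: one parallel to the x axis, one parallel to the y axis at z = 2.
midpoint = midpoint_of_closest_approach(
    (-1.0, 0.0, 0.0), (1.0, 0.0, 0.0),
    (0.0, -3.0, 2.0), (0.0, 1.0, 0.0),
)
print(midpoint)
```

    Midpoints computed from many pairs of coincidence vectors would be expected to concentrate near each tracer particle, which is presumably where a clustering method becomes useful.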

  3. Positron Emission Tomography Particle tracking using cluster analysis

    Energy Technology Data Exchange (ETDEWEB)

    Gundogdu, O. [University of Birmingham, School of Physics and Astronomy, Birmingham, B15 2TT (United Kingdom)]


    Positron Emission Particle Tracking was successfully used in a wide range of industrial applications. This technique primarily uses a single positron emitting tracer particle. However, using multiple particles would provide more comparative information about the physical processes taking place in a system such as mixing or fluidised beds. In this paper, a unique method that enables us to track more than one particle is presented. This method is based on the midpoint of the closest distance between two trajectories or coincidence vectors. The technique presented in this paper employs a clustering method.

  4. Communication Base Station Log Analysis Based on Hierarchical Clustering

    Directory of Open Access Journals (Sweden)

    Zhang Shao-Hua


    Communication base stations generate massive amounts of data every day, and these base station logs are valuable for mining business circles. This paper uses data mining technology and a hierarchical clustering algorithm to group base stations into business circles based on the data the stations record. By extracting features from the data for each business circle and comparing the characteristics of the different business circle categories, operators can choose suitable areas for commercial marketing.

  5. Application of cluster analysis to geochemical compositional data for identifying ore-related geochemical anomalies (United States)

    Zhou, Shuguang; Zhou, Kefa; Wang, Jinlin; Yang, Genfang; Wang, Shanshan


    Cluster analysis is a well-known technique that is used to analyze various types of data. In this study, cluster analysis is applied to geochemical data that describe 1444 stream sediment samples collected in northwestern Xinjiang with a sample spacing of approximately 2 km. Three algorithms (the hierarchical, k-means, and fuzzy c-means algorithms) and six data transformation methods (the z-score standardization, ZST; the logarithmic transformation, LT; the additive log-ratio transformation, ALT; the centered log-ratio transformation, CLT; the isometric log-ratio transformation, ILT; and no transformation, NT) are compared in terms of their effects on the cluster analysis of the geochemical compositional data. The study shows that, on the one hand, the ZST does not affect the results of column- or variable-based (R-type) cluster analysis, whereas the other methods, including the LT, the ALT, and the CLT, have substantial effects on the results. On the other hand, the results of the row- or observation-based (Q-type) cluster analysis obtained from the geochemical data after applying NT and the ZST are relatively poor. However, we derive some improved results from the geochemical data after applying the CLT, the ILT, the LT, and the ALT. Moreover, the k-means and fuzzy c-means clustering algorithms are more reliable than the hierarchical algorithm when they are used to cluster the geochemical data. We apply cluster analysis to the geochemical data to explore for Au deposits within the study area, and we obtain a good correlation between the results retrieved by combining the CLT or the ILT with the k-means or fuzzy c-means algorithms and the potential zones of Au mineralization. Therefore, we suggest that the combination of the CLT or the ILT with the k-means or fuzzy c-means algorithms is an effective tool to identify potential zones of mineralization from geochemical data.
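
    Of the transformations compared in this abstract, the centered log-ratio transformation (CLT) is simple to illustrate. The sketch below assumes the standard definition, in which each part of a composition is replaced by its logarithm minus the mean logarithm of all parts. The function name and sample values are hypothetical, and all parts must be strictly positive.

```python
import math


def centered_log_ratio(composition):
    """Centered log-ratio transform of one strictly positive composition."""
    logs = [math.log(x) for x in composition]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


# A toy three-part composition (hypothetical values, not from the study).
sample = [0.2, 0.3, 0.5]
transformed = centered_log_ratio(sample)
print(transformed)

# Rescaling the composition (for example, fractions to percent) leaves the
# transformed values unchanged, which is the point of using log-ratios.
rescaled = centered_log_ratio([20.0, 30.0, 50.0])
```

    The transformed values sum to zero and could then be passed to k-means or fuzzy c-means in place of the raw concentrations.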

  6. Uranium enrichment

    International Nuclear Information System (INIS)

    Rae, H.K.; Melvin, J.G.


    Canada is the world's largest producer and exporter of uranium, most of which is enriched elsewhere for use as fuel in LWRs. The feasibility of a Canadian uranium-enrichment enterprise is therefore a perennial question. Recent developments in uranium-enrichment technology, and their likely impacts on separative work supply and demand, suggest an opportunity window for Canadian entry into this international market. The Canadian opportunity results from three particular impacts of the new technologies: 1) the bulk of the world's uranium-enrichment capacity is in gaseous diffusion plants which, because of their large requirements for electricity (more than 2000 kW·h per SWU), are vulnerable to competition from the new processes; 2) the decline in enrichment costs increases the economic incentive for the use of slightly-enriched uranium (SEU) fuel in CANDU reactors, thus creating a potential Canadian market; and 3) the new processes allow economic operation on a much smaller scale, which drastically reduces the investment required for market entry and is comparable with the potential Canadian SEU requirement. The opportunity is not open-ended. By the end of the century the enrichment supply industry will have adapted to the new processes and long-term customer/supplier relationships will have been established. In order to seize the opportunity, Canada must become a credible supplier during this century

  7. Pulse shape analysis of enriched BEGe detectors in vacuum cryostat and liquid argon

    Energy Technology Data Exchange (ETDEWEB)

    Wagner, Victoria [Max-Planck-Institut fuer Kernphysik, Heidelberg (Germany); Collaboration: GERDA-Collaboration


    The Gerda experiment searches for the lepton number violating neutrinoless double beta (0νββ) decay of 76Ge. Germanium diodes of BEGe type (Canberra, Belgium) made from isotopically modified material have been procured for Phase II of Gerda. They will improve the sensitivity of the experiment by additional target mass, improved energy resolution and enhanced pulse shape discrimination (PSD) against background events. The PSD efficiencies of the new enriched BEGe detectors were studied in vacuum cryostats as part of the characterization campaign at the HADES underground laboratory. For a deeper understanding of the pulse shape performance of the enriched BEGe detectors, detailed 241Am surface scans were performed. Unexpectedly high position-dependence of the pulse shape parameter Amplitude-over-Energy was found for some of the detectors. With further investigation this effect was traced to surface charge effects specific to the operational configuration of the detectors inside the vacuum cryostats. The standard behavior is restored when they are operated in liquid argon in the configuration intended for Gerda Phase II. Finally, five of the enriched BEGe diodes were installed in the Gerda liquid argon cryostat prior to the full upgrade. They show a good performance and are able to reject efficiently multi-site-events as well as β- and α-particles.

  8. Multispectral Imaging Analysis of Circulating Tumor Cells in Negatively Enriched Peripheral Blood Samples. (United States)

    Miller, Brandon; Lustberg, Maryam; Summers, Thomas A; Chalmers, Jeffrey J


    A variety of biomarkers are present on cells in the peripheral blood of patients with a variety of disorders, including solid tumor malignancies. Although these cells are rare, characterizing them for specific protein levels with the advanced technology proposed will lead to future validation studies of blood samples as "liquid biopsies" for the evaluation of disease status and therapeutic response. While circulating tumor cells (CTCs) have been isolated in the blood samples of patients with solid tumors, the exact role of CTCs as clinically useful predictive markers is still debated. Current commercial technology has significant bias in that it uses a positive selection technology that presupposes specific cell surface markers (such as EpCAM) are present on CTCs. However, CTCs with low EpCAM expression have been experimentally demonstrated to be more likely to be missed by this method. In contrast, this application uses a previously developed technology that performs a purely negative enrichment methodology on peripheral blood, yielding highly enriched blood samples that contain CTCs as well as other, undefined cell types. The focus of this contribution is the use of multispectral imaging of epifluorescent, microscopic images of these enriched cells in order to help develop clinically relevant liquid biopsies from peripheral blood samples.

  9. Golgi enrichment and proteomic analysis of developing Pinus radiata xylem by free-flow electrophoresis.

    Directory of Open Access Journals (Sweden)

    Harriet T Parsons

    Our understanding of the contribution of Golgi proteins to cell wall and wood formation in any woody plant species is limited. Currently, little Golgi proteomics data exists for wood-forming tissues. In this study, we attempted to address this issue by generating and analyzing Golgi-enriched membrane preparations from developing xylem of compression wood from the conifer Pinus radiata. Developing xylem samples from 3-year-old pine trees were harvested for this purpose at a time of active growth and subjected to a combination of density centrifugation followed by free-flow electrophoresis, a surface charge separation technique used in the enrichment of Golgi membranes. This combination of techniques was successful in achieving an approximately 200-fold increase in the activity of the Golgi marker galactan synthase and represents a significant improvement for proteomic analyses of the Golgi from conifers. A total of thirty known Golgi proteins were identified by mass spectrometry, including glycosyltransferases from gene families involved in glucomannan and glucuronoxylan biosynthesis. The free-flow electrophoresis fractions of enriched Golgi were highly abundant in structural proteins (actin and tubulin), indicating a role for the cytoskeleton during compression wood formation. The mass spectrometry proteomics data associated with this study have been deposited to the ProteomeXchange with identifier PXD000557.

  10. Cluster Cooperation in Wireless-Powered Sensor Networks: Modeling and Performance Analysis

    Directory of Open Access Journals (Sweden)

    Chao Zhang


    A wireless-powered sensor network (WPSN) consisting of one hybrid access point (HAP), a near cluster and the corresponding far cluster is investigated in this paper. The sensors are wireless-powered and transmit information by consuming energy harvested from the signal emitted by the HAP. Sensors are able to harvest energy as well as store the harvested energy. We propose that if sensors in the near cluster do not have their own information to transmit, they can act as relays and help the sensors in the far cluster forward information to the HAP in an amplify-and-forward (AF) manner. We use a finite Markov chain to model the dynamic variation process of the relay battery, and give a general analytical model for WPSN with cluster cooperation. Through the model, we deduce the closed-form expression for the outage probability as the metric of this network. Finally, simulation results validate the premise of this design and the correctness of the theoretical analysis, and show how parameters affect system performance. Moreover, it is also shown that the outage probability of sensors in the far cluster can be drastically reduced without sacrificing the performance of sensors in the near cluster if the transmit power of the HAP is fairly high. Furthermore, in terms of the outage performance of the far cluster, the proposed scheme significantly outperforms the direct transmission scheme without cooperation.

  11. Cluster Cooperation in Wireless-Powered Sensor Networks: Modeling and Performance Analysis. (United States)

    Zhang, Chao; Zhang, Pengcheng; Zhang, Weizhan


    A wireless-powered sensor network (WPSN) consisting of one hybrid access point (HAP), a near cluster and the corresponding far cluster is investigated in this paper. The sensors are wireless-powered and transmit information by consuming energy harvested from the signal emitted by the HAP. Sensors are able to harvest energy as well as store the harvested energy. We propose that if sensors in the near cluster do not have their own information to transmit, they can act as relays and help the sensors in the far cluster forward information to the HAP in an amplify-and-forward (AF) manner. We use a finite Markov chain to model the dynamic variation process of the relay battery, and give a general analytical model for WPSN with cluster cooperation. Through the model, we deduce the closed-form expression for the outage probability as the metric of this network. Finally, simulation results validate the premise of this design and the correctness of the theoretical analysis, and show how parameters affect system performance. Moreover, it is also shown that the outage probability of sensors in the far cluster can be drastically reduced without sacrificing the performance of sensors in the near cluster if the transmit power of the HAP is fairly high. Furthermore, in terms of the outage performance of the far cluster, the proposed scheme significantly outperforms the direct transmission scheme without cooperation.

  12. Point Cluster Analysis Using a 3D Voronoi Diagram with Applications in Point Cloud Segmentation

    Directory of Open Access Journals (Sweden)

    Shen Ying


    Three-dimensional (3D) point analysis and visualization is one of the most effective methods of point cluster detection and segmentation in geospatial datasets. However, serious scattering and clotting characteristics interfere with the visual detection of 3D point clusters. To overcome this problem, this study proposes the use of 3D Voronoi diagrams to analyze and visualize 3D points instead of the original data item. The proposed algorithm computes the cluster of 3D points by applying a set of 3D Voronoi cells to describe and quantify 3D points. The decompositions of point cloud of 3D models are guided by the 3D Voronoi cell parameters. The parameter values are mapped from the Voronoi cells to 3D points to show the spatial pattern and relationships; thus, a 3D point cluster pattern can be highlighted and easily recognized. To capture different cluster patterns, continuous progressive clusters and segmentations are tested. The 3D spatial relationship is shown to facilitate cluster detection. Furthermore, the generated segmentations of real 3D data cases are exploited to demonstrate the feasibility of our approach in detecting different spatial clusters for continuous point cloud segmentation.

  13. Detection of secondary structure elements in proteins by hydrophobic cluster analysis. (United States)

    Woodcock, S; Mornon, J P; Henrissat, B


    Hydrophobic cluster analysis (HCA) is a protein sequence comparison method based on alpha-helical representations of the sequences where the size, shape and orientation of the clusters of hydrophobic residues are primarily compared. The effectiveness of HCA has been suggested to originate from its potential ability to focus on the residues forming the hydrophobic core of globular proteins. We have addressed the robustness of the bidimensional representation used for HCA in its ability to detect the regular secondary structure elements of proteins. Various parameters have been studied such as those governing cluster size and limits, the hydrophobic residues constituting the clusters as well as the potential shift of the cluster positions with respect to the position of the regular secondary structure elements. The following results have been found to support the alpha-helical bidimensional representation used in HCA: (i) there is a positive correlation (clearly above background noise) between the hydrophobic clusters and the regular secondary structure elements in proteins; (ii) the hydrophobic clusters are centred on the regular secondary structure elements; (iii) the pitch of the helical representation which gives the best correspondence is that of an alpha-helix. The correspondence between hydrophobic clusters and regular secondary structure elements suggests a way to implement variable gap penalties during the automatic alignment of protein sequences.

  14. Uranium enrichment

    International Nuclear Information System (INIS)

    Mohrhauer, H.


    The separation of uranium isotopes in order to enrich the fuel for light water reactors with the light isotope U-235 is an important part of the nuclear fuel cycle. After the basic principles of isotope separation, the gaseous diffusion and the centrifuge processes are explained. Both these techniques are employed on an industrial scale. In addition, a short review is given of other enrichment techniques which have been demonstrated at least on a laboratory scale. After some remarks on the present situation of the enrichment market, the progress in the development and the industrial exploitation of the gas centrifuge process by the trinational Urenco-Centec organisation is presented. (orig.)

  15. Cluster and principal component analysis based on SSR markers of Amomum tsao-ko in Jinping County of Yunnan Province (United States)

    Ma, Mengli; Lei, En; Meng, Hengling; Wang, Tiantao; Xie, Linyan; Shen, Dong; Xianwang, Zhou; Lu, Bingyue


    Amomum tsao-ko is a commercial plant that is used for various purposes in the medicinal and food industries. For the present investigation, 44 germplasm samples were collected from Jinping County of Yunnan Province. Cluster analysis and 2-dimensional principal component analysis (PCA) were used to represent the genetic relations among Amomum tsao-ko samples by using simple sequence repeat (SSR) markers. Clustering analysis clearly distinguished the sample groups. Two major clusters were formed: the first (Cluster I) consisted of 34 individuals and the second (Cluster II) consisted of 10 individuals; Cluster I, as the main group, contained multiple sub-clusters. PCA also showed 2 groups: PCA Group 1 included 29 individuals and PCA Group 2 included 12 individuals, consistent with the results of cluster analysis. The purpose of the present investigation was to provide information on the genetic relationships of Amomum tsao-ko germplasm resources in the main producing areas, and to provide a theoretical basis for the protection and utilization of Amomum tsao-ko resources.

  16. Using the Cluster Analysis and the Principal Component Analysis in Evaluating the Quality of a Destination

    Directory of Open Access Journals (Sweden)

    Ida Vajčnerová


    The objective of the paper is to explore possibilities of evaluating the quality of a tourist destination by means of principal component analysis (PCA) and cluster analysis. In the paper both types of analysis are compared on the basis of the results they provide. The aim is to identify the advantages and limits of both methods and to provide methodological suggestions for their further use in tourism research. The analysis is based on primary data from a survey of customers’ satisfaction with the key quality factors of a destination. The output of the two statistical methods is the creation of groups or clusters of quality factors that are similar in terms of respondents’ evaluations, in order to facilitate the evaluation of the quality of tourist destinations. The results show that both tested methods can be used. The paper was elaborated within a wider research project aimed at developing a methodology for the quality evaluation of tourist destinations, especially in the context of customer satisfaction and loyalty.

  17. The CERN analysis facility-a PROOF cluster for day-one physics analysis

    International Nuclear Information System (INIS)

    Grosse-Oetringhaus, J F


    ALICE (A Large Ion Collider Experiment) at the LHC plans to use a PROOF cluster at CERN (CAF - CERN Analysis Facility) for analysis. The system is especially aimed at the prototyping phase of analyses that need a high number of development iterations and thus require a short response time. Typical examples are the tuning of cuts during the development of an analysis as well as calibration and alignment. Furthermore, the use of an interactive system with very fast response will allow ALICE to extract physics observables out of first data quickly. An additional use case is fast event simulation and reconstruction. A test setup consisting of 40 machines has been used for evaluation since May 2006. The PROOF system enables parallel processing, and xrootd provides access to files distributed on the test cluster. An automatic staging system for files either catalogued in the ALICE file catalog or stored in the CASTOR mass storage system has been developed. The current setup and ongoing development towards disk quotas and CPU fairshare are described. Furthermore, the integration of PROOF into ALICE's software framework (AliRoot) is discussed.

  18. Clustering-based analysis for residential district heating data

    DEFF Research Database (Denmark)

    Gianniou, Panagiota; Liu, Xiufeng; Heller, Alfred


    The wide use of smart meters enables collection of a large amount of fine-granular time series, which can be used to improve the understanding of consumption behavior and used for consumption optimization. This paper presents a clustering-based knowledge discovery in databases method to analyze residential heating consumption data and evaluate information included in national building databases. The proposed method uses the K-means algorithm to segment consumption groups based on consumption intensity and representative patterns and ranks the groups according to daily consumption. This paper also... These findings will be valuable for district heating utilities and energy planners to optimize their operations, design demand-side management strategies, and develop targeting energy-efficiency programs or policies.
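
    The segmentation and ranking steps described in this record can be sketched as follows. This is an illustrative sketch, not the authors' pipeline: the four-value daily profiles are made-up data, the initialisation from the first `k` points is a simplification chosen for reproducibility, and the function and variable names are hypothetical.

```python
def kmeans(points, k, iterations=50):
    """Plain K-means on lists of numbers, with squared Euclidean distance."""
    centroids = [list(p) for p in points[:k]]  # deterministic init: first k points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each profile to its nearest centroid.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


# Made-up daily heating profiles (morning, midday, evening, night), in kWh.
profiles = [
    [1.0, 0.5, 1.2, 0.4],
    [4.0, 2.0, 4.5, 1.5],
    [2.0, 1.0, 2.5, 0.8],
    [1.1, 0.6, 1.0, 0.5],
    [3.8, 2.2, 4.4, 1.6],
    [2.1, 1.1, 2.4, 0.9],
]

centroids, labels = kmeans(profiles, k=3)

# Rank the consumption groups by the daily total of their centroid, highest first.
ranking = sorted(range(len(centroids)), key=lambda c: sum(centroids[c]), reverse=True)
for rank, c in enumerate(ranking, start=1):
    print(f"rank {rank}: group {c}, daily consumption {sum(centroids[c]):.2f} kWh")
```

    Real smart-meter data would normally be normalised first if pattern shape, rather than intensity, should drive the grouping.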

  19. Cluster cosmological analysis with X ray instrumental observables: introduction and testing of AsPIX method

    International Nuclear Information System (INIS)

    Valotti, Andrea


    Cosmology is one of the fundamental pillars of astrophysics; as such, it contains many unsolved puzzles. To investigate some of those puzzles, we analyze X-ray surveys of galaxy clusters. These surveys are possible thanks to the bremsstrahlung emission of the intra-cluster medium. The simultaneous fit of cluster counts as a function of mass and distance provides an independent measure of cosmological parameters such as Ωm, σ8, and the dark energy equation of state w0. A novel approach to cosmological analysis using galaxy cluster data, called top-down, was developed in N. Clerc et al. (2012). This top-down approach is based purely on instrumental observables that are considered in a two-dimensional X-ray color-magnitude diagram. The method self-consistently includes selection effects and scaling relationships. It also provides a means of bypassing the computation of individual cluster masses. My work presents an extension of the top-down method by introducing the apparent size of the cluster, creating a three-dimensional X-ray cluster diagram. The size of a cluster is sensitive to both the cluster mass and its angular diameter, so it must also be included in the assessment of selection effects. The performance of this new method is investigated using a Fisher analysis. In parallel, I have studied the effects of the intrinsic scatter in the cluster size scaling relation on the sample selection as well as on the obtained cosmological parameters. To validate the method, I estimate uncertainties of cosmological parameters with an MCMC method and the Amoeba minimization routine, using two simulated XMM surveys that have an increasing level of complexity. The first simulated survey is a set of toy catalogues of 100 and 10000 deg², whereas the second is a 1000 deg² catalogue that was generated using an Aardvark semi-analytical N-body simulation. This comparison corroborates the conclusions of the Fisher analysis. In conclusion, I find that a cluster diagram that accounts

  20. Behavioral Health Risk Profiles of Undergraduate University Students in England, Wales, and Northern Ireland: A Cluster Analysis. (United States)

    El Ansari, Walid; Ssewanyana, Derrick; Stock, Christiane


    Limited research has explored clustering of lifestyle behavioral risk factors (BRFs) among university students. This study aimed to explore clustering of BRFs, composition of clusters, and the association of the clusters with self-rated health and perceived academic performance. We assessed BRFs, namely tobacco smoking, physical inactivity, alcohol consumption, illicit drug use, unhealthy nutrition, and inadequate sleep, using a self-administered general Student Health Survey among 3,706 undergraduates at seven UK universities. A two-step cluster analysis generated: Cluster 1 (the high physically active and health conscious) with very high health awareness/consciousness, good nutrition, and physical activity (PA), and relatively low alcohol, tobacco, and other drug (ATOD) use. Cluster 2 (the abstinent) had very low ATOD use, high health awareness, good nutrition, and medium high PA. Cluster 3 (the moderately health conscious) included the highest regard for healthy eating, second highest fruit/vegetable consumption, and moderately high ATOD use. Cluster 4 (the risk taking) showed the highest ATOD use, were the least health conscious, least fruit consuming, and attached the least importance to healthy eating. Compared to the healthy cluster (Cluster 1), students in other clusters had lower self-rated health, and particularly, students in the risk taking cluster (Cluster 4) reported lower academic performance. These associations were stronger for men than for women. Of the four clusters, Cluster 4 had the youngest students. Our results suggested that prevention among university students should address multiple BRFs simultaneously, with particular focus on the younger students.

  1. Application of Cluster Analysis in Assessment of Dietary Habits of Secondary School Students

    Directory of Open Access Journals (Sweden)

    Zalewska Magdalena


    Full Text Available Maintenance of proper health and prevention of diseases of civilization are now significant public health problems. Nutrition is an important factor in the development of youth, as well as the current and future state of health. The aim of the study was to show the benefits of the application of cluster analysis to assess the dietary habits of high school students. The survey was carried out on 1,631 eighteen-year-old students in seven randomly selected secondary schools in Bialystok using a self-prepared anonymous questionnaire. An evaluation of the time of day meals were eaten and the number of meals consumed was made for the surveyed students. The cluster analysis allowed distinguishing characteristic structures of dietary habits in the observed population. Four clusters were identified, which were characterized by relative internal homogeneity and substantial variation in terms of the number of meals during the day and the time of their consumption. The most important characteristics of cluster 1 were cumulated food ration in 2 or 3 meals and long intervals between meals. Cluster 2 was characterized by eating the recommended number of 4 or 5 meals a day. In the 3rd cluster, students ate 3 meals a day with large intervals between them, and in the 4th they had four meals a day while maintaining proper intervals between them. In all clusters dietary mistakes occurred, but most of them were related to clusters 1 and 3. Cluster analysis allowed for the identification of major flaws in nutrition, which may include irregular eating and skipping meals, and indicated possible connections between eating patterns and disturbances of body weight in the examined population.

  2. Genome-scale cluster analysis of replicated microarrays using shrinkage correlation coefficient. (United States)

    Yao, Jianchao; Chang, Chunqi; Salmi, Mari L; Hung, Yeung Sam; Loraine, Ann; Roux, Stanley J


    Currently, clustering with some form of correlation coefficient as the gene similarity metric has become a popular method for profiling genomic data. The Pearson correlation coefficient and the standard deviation (SD)-weighted correlation coefficient are the two most widely-used correlations as the similarity metrics in clustering microarray data. However, these two correlations are not optimal for analyzing replicated microarray data generated by most laboratories. An effective correlation coefficient is needed to provide statistically sufficient analysis of replicated microarray data. In this study, we describe a novel correlation coefficient, shrinkage correlation coefficient (SCC), that fully exploits the similarity between the replicated microarray experimental samples. The methodology considers both the number of replicates and the variance within each experimental group in clustering expression data, and provides a robust statistical estimation of the error of replicated microarray data. The value of SCC is revealed by its comparison with two other correlation coefficients that are currently the most widely-used (Pearson correlation coefficient and SD-weighted correlation coefficient) using statistical measures on both synthetic expression data as well as real gene expression data from Saccharomyces cerevisiae. Two leading clustering methods, hierarchical and k-means clustering were applied for the comparison. The comparison indicated that using SCC achieves better clustering performance. Applying SCC-based hierarchical clustering to the replicated microarray data obtained from germinating spores of the fern Ceratopteris richardii, we discovered two clusters of genes with shared expression patterns during spore germination. Functional analysis suggested that some of the genetic mechanisms that control germination in such diverse plant lineages as mosses and angiosperms are also conserved among ferns. 
This study shows that SCC is an alternative to the Pearson

  3. Analysis of expressed sequence tags generated from full-length enriched cDNA libraries of melon

    Directory of Open Access Journals (Sweden)

    Bendahmane Abdelhafid


    Background: Melon (Cucumis melo), an economically important vegetable crop, belongs to the Cucurbitaceae family which includes several other important crops such as watermelon, cucumber, and pumpkin. It has served as a model system for sex determination and vascular biology studies. However, genomic resources currently available for melon are limited. Results: We constructed eleven full-length enriched and four standard cDNA libraries from fruits, flowers, leaves, roots, cotyledons, and calluses of four different melon genotypes, and generated 71,577 and 22,179 ESTs from full-length enriched and standard cDNA libraries, respectively. These ESTs, together with ~35,000 ESTs available in public domains, were assembled into 24,444 unigenes, which were extensively annotated by comparing their sequences to different protein and functional domain databases, assigning them Gene Ontology (GO) terms, and mapping them onto metabolic pathways. Comparative analysis of melon unigenes and other plant genomes revealed that 75% to 85% of melon unigenes had homologs in other dicot plants, while approximately 70% had homologs in monocot plants. The analysis also identified 6,972 gene families that were conserved across dicot and monocot plants, and 181, 1,192, and 220 gene families specific to fleshy fruit-bearing plants, the Cucurbitaceae family, and melon, respectively. Digital expression analysis identified a total of 175 tissue-specific genes, which provides a valuable gene sequence resource for future genomics and functional studies. Furthermore, we identified 4,068 simple sequence repeats (SSRs) and 3,073 single nucleotide polymorphisms (SNPs) in the melon EST collection. Finally, we obtained a total of 1,382 melon full-length transcripts through the analysis of full-length enriched cDNA clones that were sequenced from both ends. Analysis of these full-length transcripts indicated that sizes of melon 5' and 3' UTRs were similar to those of tomato, but

  4. MMPI-2: Cluster Analysis of Personality Profiles in Perinatal Depression—Preliminary Evidence

    Directory of Open Access Journals (Sweden)

    Valentina Meuti


    Background. To assess personality characteristics of women who develop perinatal depression. Methods. The study started with a screening of a sample of 453 women in their third trimester of pregnancy, to which was administered a survey data form, the Edinburgh Postnatal Depression Scale (EPDS) and the Minnesota Multiphasic Personality Inventory 2 (MMPI-2). A clinical group of subjects with perinatal depression (PND, 55 subjects) was selected; clinical and validity scales of MMPI-2 were used as predictors in hierarchical cluster analysis carried out. Results. The analysis identified three clusters of personality profile: two “clinical” clusters (1 and 3) and an “apparently common” one (cluster 2). The first cluster (39.5%) collects structures of personality with prevalent obsessive or dependent functioning tending to develop a “psychasthenic” depression; the third cluster (13.95%) includes women with prevalent borderline functioning tending to develop “dysphoric” depression; the second cluster (46.5%) shows a normal profile with a “defensive” attitude, probably due to the presence of defense mechanisms or to the fear of stigma. Conclusion. Characteristics of personality have a key role in clinical manifestations of perinatal depression; it is important to detect them to identify mothers at risk and to plan targeted therapeutic interventions.

  5. MMPI-2: Cluster Analysis of Personality Profiles in Perinatal Depression—Preliminary Evidence (United States)

    Grillo, Alessandra; Lauriola, Marco; Giacchetti, Nicoletta


    Background. To assess personality characteristics of women who develop perinatal depression. Methods. The study started with a screening of a sample of 453 women in their third trimester of pregnancy, to which was administered a survey data form, the Edinburgh Postnatal Depression Scale (EPDS) and the Minnesota Multiphasic Personality Inventory 2 (MMPI-2). A clinical group of subjects with perinatal depression (PND, 55 subjects) was selected; clinical and validity scales of MMPI-2 were used as predictors in hierarchical cluster analysis carried out. Results. The analysis identified three clusters of personality profile: two “clinical” clusters (1 and 3) and an “apparently common” one (cluster 2). The first cluster (39.5%) collects structures of personality with prevalent obsessive or dependent functioning tending to develop a “psychasthenic” depression; the third cluster (13.95%) includes women with prevalent borderline functioning tending to develop “dysphoric” depression; the second cluster (46.5%) shows a normal profile with a “defensive” attitude, probably due to the presence of defense mechanisms or to the fear of stigma. Conclusion. Characteristics of personality have a key role in clinical manifestations of perinatal depression; it is important to detect them to identify mothers at risk and to plan targeted therapeutic interventions. PMID:25574499

  6. Somatosensory nociceptive characteristics differentiate subgroups in people with chronic low back pain: a cluster analysis. (United States)

    Rabey, Martin; Slater, Helen; OʼSullivan, Peter; Beales, Darren; Smith, Anne


    The objectives of this study were to explore the existence of subgroups in a cohort with chronic low back pain (n = 294) based on the results of multimodal sensory testing and profile subgroups on demographic, psychological, lifestyle, and general health factors. Bedside (2-point discrimination, brush, vibration and pinprick perception, temporal summation on repeated monofilament stimulation) and laboratory (mechanical detection threshold, pressure, heat and cold pain thresholds, conditioned pain modulation) sensory testing were examined at wrist and lumbar sites. Data were entered into principal component analysis, and 5 component scores were entered into latent class analysis. Three clusters, with different sensory characteristics, were derived. Cluster 1 (31.9%) was characterised by average to high temperature and pressure pain sensitivity. Cluster 2 (52.0%) was characterised by average to high pressure pain sensitivity. Cluster 3 (16.0%) was characterised by low temperature and pressure pain sensitivity. Temporal summation occurred significantly more frequently in cluster 1. Subgroups were profiled on pain intensity, disability, depression, anxiety, stress, life events, fear avoidance, catastrophizing, perception of the low back region, comorbidities, body mass index, multiple pain sites, sleep, and activity levels. Clusters 1 and 2 had a significantly greater proportion of female participants and higher depression and sleep disturbance scores than cluster 3. The proportion of participants undertaking Low back pain, therefore, does not appear to be homogeneous. Pain mechanisms relating to presentations of each subgroup were postulated. Future research may investigate prognoses and interventions tailored towards these subgroups.

  7. Cluster: A New Application for Spatial Analysis of Pixelated Data for Epiphytotics. (United States)

    Nelson, Scot C; Corcoja, Iulian; Pethybridge, Sarah J


    Spatial analysis of epiphytotics is essential to develop and test hypotheses about pathogen ecology, disease dynamics, and to optimize plant disease management strategies. Data collection for spatial analysis requires substantial investment in time to depict patterns in various frames and hierarchies. We developed a new approach for spatial analysis of pixelated data in digital imagery and incorporated the method in a stand-alone desktop application called Cluster. The user isolates target entities (clusters) by designating up to 24 pixel colors as nontargets and moves a threshold slider to visualize the targets. The app calculates the percent area occupied by targeted pixels, identifies the centroids of targeted clusters, and computes the relative compass angle of orientation for each cluster. Users can deselect anomalous clusters manually and/or automatically by specifying a size threshold value to exclude smaller targets from the analysis. Up to 1,000 stochastic simulations randomly place the centroids of each cluster in ranked order of size (largest to smallest) within each matrix while preserving their calculated angles of orientation for the long axes. A two-tailed probability t test compares the mean inter-cluster distances for the observed versus the values derived from randomly simulated maps. This is the basis for statistical testing of the null hypothesis that the clusters are randomly distributed within the frame of interest. These frames can assume any shape, from natural (e.g., leaf) to arbitrary (e.g., a rectangular or polygonal field). Cluster summarizes normalized attributes of clusters, including pixel number, axis length, axis width, compass orientation, and the length/width ratio, available to the user as a downloadable spreadsheet. Each simulated map may be saved as an image and inspected. Provided examples demonstrate the utility of Cluster to analyze patterns at various spatial scales in plant pathology and ecology and highlight the
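
    The comparison of observed and randomly simulated inter-cluster distances in this record can be illustrated with a short sketch. It does not reproduce the Cluster application: the centroids are made-up, the frame is assumed to be a 100 x 100 rectangle, orientation angles are ignored, and a simple Monte Carlo fraction stands in for the two-tailed t test named in the abstract.

```python
import random
from itertools import combinations
from math import dist


def mean_pairwise_distance(centroids):
    """Mean Euclidean distance over all pairs of centroids."""
    pairs = list(combinations(centroids, 2))
    return sum(dist(a, b) for a, b in pairs) / len(pairs)


def simulate_random_means(n_clusters, width, height, n_sim=1000, seed=0):
    """Mean pairwise distance for n_sim maps with uniformly placed centroids."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    return [
        mean_pairwise_distance(
            [(rng.uniform(0, width), rng.uniform(0, height))
             for _ in range(n_clusters)]
        )
        for _ in range(n_sim)
    ]


# Made-up, tightly grouped lesion centroids inside the assumed frame.
observed = [(10, 10), (12, 11), (11, 13), (13, 12), (12, 14)]
obs_mean = mean_pairwise_distance(observed)
sim_means = simulate_random_means(len(observed), width=100, height=100)

# Share of random maps whose centroids are at least as close together.
fraction_tighter = sum(m <= obs_mean for m in sim_means) / len(sim_means)
print(f"observed mean distance: {obs_mean:.2f}")
print(f"fraction of random maps at least as tight: {fraction_tighter:.3f}")
```

    A small fraction suggests the observed centroids are more aggregated than random placement would produce, which is the intuition behind testing the null hypothesis of a random distribution.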

  8. Semiparametric Bayesian analysis of accelerated failure time models with cluster structures. (United States)

    Li, Zhaonan; Xu, Xinyi; Shen, Junshan


    In this paper, we develop a Bayesian semiparametric accelerated failure time model for survival data with cluster structures. Our model allows distributional heterogeneity across clusters and accommodates their relationships through a density ratio approach. Moreover, a nonparametric mixture of Dirichlet processes prior is placed on the baseline distribution to yield full distributional flexibility. We illustrate through simulations that our model can greatly improve estimation accuracy by effectively pooling information from multiple clusters, while taking into account the heterogeneity in their random error distributions. We also demonstrate the implementation of our method using analysis of Mayo Clinic Trial in Primary Biliary Cirrhosis. Copyright © 2017 John Wiley & Sons, Ltd.

  9. A formal concept analysis approach to consensus clustering of multi-experiment expression data (United States)


    Background Presently, with the increasing number and complexity of available gene expression datasets, the combination of data from multiple microarray studies addressing a similar biological question is gaining importance. The analysis and integration of multiple datasets are expected to yield more reliable and robust results since they are based on a larger number of samples and the effects of the individual study-specific biases are diminished. This is supported by recent studies suggesting that important biological signals are often preserved or enhanced by multiple experiments. An approach to combining data from different experiments is the aggregation of their clusterings into a consensus or representative clustering solution which increases the confidence in the common features of all the datasets and reveals the important differences among them. Results We propose a novel generic consensus clustering technique that applies Formal Concept Analysis (FCA) approach for the consolidation and analysis of clustering solutions derived from several microarray datasets. These datasets are initially divided into groups of related experiments with respect to a predefined criterion. Subsequently, a consensus clustering algorithm is applied to each group resulting in a clustering solution per group. These solutions are pooled together and further analysed by employing FCA which allows extracting valuable insights from the data and generating a gene partition over all the experiments. In order to validate the FCA-enhanced approach two consensus clustering algorithms are adapted to incorporate the FCA analysis. Their performance is evaluated on gene expression data from multi-experiment study examining the global cell-cycle control of fission yeast. The FCA results derived from both methods demonstrate that, although both algorithms optimize different clustering characteristics, FCA is able to overcome and diminish these differences and preserve some relevant biological

  10. High-resolution abundance analysis of red giants in the globular cluster NGC 6522 (United States)

    Barbuy, B.; Chiappini, C.; Cantelli, E.; Depagne, E.; Pignatari, M.; Hirschi, R.; Cescutti, G.; Ortolani, S.; Hill, V.; Zoccali, M.; Minniti, D.; Trevisan, M.; Bica, E.; Gómez, A.


    Context. The [Sr/Ba] and [Y/Ba] scatter observed in some galactic halo stars that are very metal-poor and in a few individual stars of the oldest known Milky Way globular cluster NGC 6522 have been interpreted as evidence of early enrichment by massive fast-rotating stars (spinstars). Because NGC 6522 is a bulge globular cluster, the suggestion was that not only the very-metal poor halo stars, but also bulge stars at [Fe/H] ~ -1 could be used as probes of the stellar nucleosynthesis signatures from the earlier generations of massive stars, but at much higher metallicity. For the bulge the suggestions were based on early spectra available for stars in NGC 6522, with a medium resolution of R ~ 22 000 and a moderate signal-to-noise ratio. Aims: The main purpose of this study is to re-analyse the NGC 6522 stars reported previously by using new high-resolution (R ~ 45 000) and high signal-to-noise spectra (S/N > 100). We aim at re-deriving their stellar parameters and elemental ratios, in particular the abundances of the neutron-capture s-process-dominated elements such as Sr, Y, Zr, La, and Ba, and of the r-element Eu. Methods: High-resolution spectra of four giants belonging to the bulge globular cluster NGC 6522 were obtained at the 8m VLT UT2-Kueyen telescope with the UVES spectrograph in FLAMES-UVES configuration. The spectroscopic parameters were derived based on the excitation and ionization equilibrium of Fe i and Fe ii. Results: Our analysis confirms a metallicity [Fe/H] = -0.95 ± 0.15 for NGC 6522 and the overabundance of the studied stars in Eu (with +0.2 < [Eu/Fe] < + 0.4) and alpha-elements O and Mg. The neutron-capture s-element-dominated Sr, Y, Zr, Ba, and La now show less pronounced variations from star to star. Enhancements are in the range 0.0 < [Sr/Fe] < +0.4, +0.23 < [Y/Fe] < +0.43, 0.0 < [Zr/Fe] < +0.4, 0.0 < [La/Fe] < +0.35, and 0.05 < [Ba/Fe] < +0.55. Conclusions: The very high overabundances of [Y/Fe] previously reported for the four studied

  11. A critical cluster analysis of 44 indicators of author-level performance

    DEFF Research Database (Denmark)

    Wildgaard, Lorna Elizabeth


    This paper explores a 7-stage cluster methodology as a process to identify appropriate indicators for evaluation of individual researchers at a disciplinary and seniority level. Publication and citation data for 741 researchers from 4 disciplines was collected in Web of Science. Forty-four indicators of individual researcher performance were computed using the data. The clustering solution was supported by continued reference to the researcher’s curriculum vitae, an effect analysis and a risk analysis. Disciplinary appropriate indicators were identified and used to divide the researchers... of statistics in research evaluation. The strength of the 7-stage cluster methodology is that it makes clear that in the evaluation of individual researchers, statistics cannot stand alone. The methodology is reliant on contextual information to verify the bibliometric values and cluster solution...

  12. Applying clustering to statistical analysis of student reasoning about two-dimensional kinematics

    Directory of Open Access Journals (Sweden)

    R. Padraic Springuel


    Full Text Available We use clustering, an analysis method not presently common to the physics education research community, to group and characterize student responses to written questions about two-dimensional kinematics. Previously, clustering has been used to analyze multiple-choice data; we analyze free-response data that includes both sketches of vectors and written elements. The primary goal of this paper is to describe the methodology itself; we include a brief overview of relevant results.

  13. A Deep Learning Prediction Model Based on Extreme-Point Symmetric Mode Decomposition and Cluster Analysis


    Li, Guohui; Zhang, Songling; Yang, Hong


    To address the irregularity of nonlinear signals and the difficulty of predicting them, a deep learning prediction model based on extreme-point symmetric mode decomposition (ESMD) and clustering analysis is proposed. Firstly, the original data are decomposed by ESMD to obtain a finite number of intrinsic mode functions (IMFs) and residuals. Secondly, fuzzy c-means is used to cluster the decomposed components, and then a deep belief network (DBN) is used to predict them. Finally, the reconstructed ...

  14. Statistical analysis of activation and reaction energies with quasi-variational coupled-cluster theory (United States)

    Black, Joshua A.; Knowles, Peter J.


    The performance of quasi-variational coupled-cluster (QV) theory applied to the calculation of activation and reaction energies has been investigated. A statistical analysis of results obtained for six different sets of reactions has been carried out, and the results have been compared to those from standard single-reference methods. In general, the QV methods lead to increased activation energies and larger absolute reaction energies compared to those obtained with traditional coupled-cluster theory.

  15. Theorical and experimental analysis of nitrogen-15 isotope enrichment by nitrogen monoxide and nitric acid system

    International Nuclear Information System (INIS)

    Ducatti, C.


    Nitrogen-15 isotope enrichment by chemical exchange in the NO/HNO3 system was studied using two different theories. The isotope fractionation factors obtained by the countercurrent theory were compared, through a model, with those estimated by the isotope equipartition theory. A countercurrent column was built at laboratory scale, and parameters such as the number of theoretical plates, height equivalent to a theoretical plate, type of packing, total height of column, and production of H15NO3 per week, obtained under isotope dynamic equilibrium conditions, were studied in comparison to those in the literature. (Author)

  16. SOMFlow: Guided Exploratory Cluster Analysis with Self-Organizing Maps and Analytic Provenance. (United States)

    Sacha, Dominik; Kraus, Matthias; Bernard, Jurgen; Behrisch, Michael; Schreck, Tobias; Asano, Yuki; Keim, Daniel A


    Clustering is a core building block for data analysis, aiming to extract otherwise hidden structures and relations from raw datasets, such as particular groups that can be effectively related, compared, and interpreted. A plethora of visual-interactive cluster analysis techniques has been proposed to date, however, arriving at useful clusterings often requires several rounds of user interactions to fine-tune the data preprocessing and algorithms. We present a multi-stage Visual Analytics (VA) approach for iterative cluster refinement together with an implementation (SOMFlow) that uses Self-Organizing Maps (SOM) to analyze time series data. It supports exploration by offering the analyst a visual platform to analyze intermediate results, adapt the underlying computations, iteratively partition the data, and to reflect previous analytical activities. The history of previous decisions is explicitly visualized within a flow graph, allowing to compare earlier cluster refinements and to explore relations. We further leverage quality and interestingness measures to guide the analyst in the discovery of useful patterns, relations, and data partitions. We conducted two pair analytics experiments together with a subject matter expert in speech intonation research to demonstrate that the approach is effective for interactive data analysis, supporting enhanced understanding of clustering results as well as the interactive process itself.

  17. Analysis and comparison of very large metagenomes with fast clustering and functional annotation

    Directory of Open Access Journals (Sweden)

    Li Weizhong


    Background: The remarkable advance of metagenomics presents significant new challenges in data analysis. Metagenomic datasets (metagenomes) are large collections of sequencing reads from anonymous species within particular environments. Computational analyses for very large metagenomes are extremely time-consuming, and there are often many novel sequences in these metagenomes that are not fully utilized. The number of available metagenomes is rapidly increasing, so fast and efficient metagenome comparison methods are in great demand. Results: The new metagenomic data analysis method Rapid Analysis of Multiple Metagenomes with a Clustering and Annotation Pipeline (RAMMCAP) was developed using an ultra-fast sequence clustering algorithm, fast protein family annotation tools, and a novel statistical metagenome comparison method that employs a unique graphic interface. RAMMCAP processes extremely large datasets with only moderate computational effort. It identifies raw read clusters and protein clusters that may include novel gene families, and compares metagenomes using clusters or functional annotations calculated by RAMMCAP. In this study, RAMMCAP was applied to the two largest available metagenomic collections, the "Global Ocean Sampling" and the "Metagenomic Profiling of Nine Biomes". Conclusion: RAMMCAP is a very fast method that can cluster and annotate one million metagenomic reads in only hundreds of CPU hours. It is available from

  18. Water quality assessment with hierarchical cluster analysis based on Mahalanobis distance. (United States)

    Du, Xiangjun; Shao, Fengjing; Wu, Shunyao; Zhang, Hanlin; Xu, Si


    Water quality assessment is crucial for assessment of marine eutrophication, prediction of harmful algal blooms, and environment protection. Previous studies have developed many numeric modeling methods and data driven approaches for water quality assessment. Cluster analysis, an approach widely used for grouping data, has also been employed. However, there are complex correlations between water quality variables, which play important roles in water quality assessment but have always been overlooked. In this paper, we analyze correlations between water quality variables and propose an alternative method for water quality assessment with hierarchical cluster analysis based on Mahalanobis distance. Further, we cluster water quality data collected from coastal water of the Bohai Sea and North Yellow Sea of China, and apply the clustering results to evaluate its water quality. To evaluate the validity, we also cluster the water quality data with cluster analysis based on Euclidean distance, which is widely adopted by previous studies. The results show that our method is more suitable for water quality assessment with many correlated water quality variables. To our knowledge, it is the first attempt to apply Mahalanobis distance for coastal water quality assessment.
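
    The contrast between Mahalanobis and Euclidean distance described above can be illustrated with a minimal sketch. It is not the authors' implementation: the two-variable samples are hypothetical, and the helper names (covariance_2d, mahalanobis_2d) are assumptions made for illustration. It only shows why a distance that accounts for correlation separates pairs that Euclidean distance treats as equally far apart; such a distance could then feed a hierarchical clustering step.

```python
import math

def covariance_2d(points):
    """Sample covariance terms (sxx, sxy, syy) of two variables."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return sxx, sxy, syy

def mahalanobis_2d(a, b, cov):
    """Mahalanobis distance between a and b for a 2x2 covariance matrix."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = a[0] - b[0], a[1] - b[1]
    # Quadratic form d^T S^{-1} d, with the 2x2 inverse written out by hand.
    q = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(q)

def euclidean(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

# Hypothetical samples of two strongly correlated variables.
samples = [(1.0, 1.0), (2.0, 2.2), (3.0, 2.9), (4.0, 4.1), (5.0, 5.0)]
cov = covariance_2d(samples)

along = ((2.0, 2.0), (4.0, 4.0))    # pair separated along the correlation trend
across = ((2.0, 4.0), (4.0, 2.0))   # pair separated against the trend

d_euclid_along = euclidean(*along)
d_euclid_across = euclidean(*across)
d_mahal_along = mahalanobis_2d(*along, cov)
d_mahal_across = mahalanobis_2d(*across, cov)
```

    In this toy data the two pairs are equally far apart in Euclidean terms, while the Mahalanobis distance of the pair that runs against the correlation is much larger.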

  19. Thermoresponsive Arrays Patterned via Photoclick Chemistry: Smart MALDI Plate for Protein Digest Enrichment, Desalting, and Direct MS Analysis. (United States)

    Meng, Xiao; Hu, Junjie; Chao, Zhicong; Liu, Ying; Ju, Huangxian; Cheng, Quan


    Sample desalting and concentration are crucial steps before matrix-assisted laser desorption/ionization-mass spectrometry (MALDI-MS) analysis. Current sample pretreatment approaches require tedious fabrication and operation procedures, which are unamenable to high-throughput analysis and also result in sample loss. Here, we report the development of a smart MALDI substrate for on-plate desalting, enrichment, and direct MS analysis of protein digests based on thermoresponsive, hydrophilic/hydrophobic transition of surface-grafted poly(N-isopropylacrylamide) (PNIPAM) microarrays. Superhydrophilic 1-thioglycerol microwells are first constructed on alkyne-silane-functionalized rough indium tin oxide substrates based on two sequential thiol-yne photoclick reactions, whereas the surrounding regions are modified with hydrophobic 1H,1H,2H,2H-perfluorodecanethiol. Surface-initiated atom-transfer radical polymerization is then triggered in microwells to form PNIPAM arrays, which facilitate sample loading and enrichment of protein digests by concentrating large-volume samples into small dots and achieving on-plate desalting through PNIPAM configuration change at elevated temperature. The smart MALDI plate shows high performance for mass spectrometric analysis of cytochrome c and neurotensin in the presence of 1 M urea and 100 mM NaHCO 3 , as well as improved detection sensitivity and high sequence coverage for α-casein and cytochrome c digests in femtomole range. The work presents a versatile sample pretreatment platform with great potential for proteomic research.

  20. Clustering analysis for muon tomography data elaboration in the Muon Portal project (United States)

    Bandieramonte, M.; Antonuccio-Delogu, V.; Becciani, U.; Costa, A.; La Rocca, P.; Massimino, P.; Petta, C.; Pistagna, C.; Riggi, F.; Riggi, S.; Sciacca, E.; Vitello, F.


    Clustering analysis is a multivariate data analysis technique that gathers statistical data units into groups so as to minimize the logical distance within each group and to maximize the distance between different groups. In these proceedings, the authors present a novel approach to muon tomography data analysis based on clustering algorithms. As a case study we present the Muon Portal project, which aims to build and operate a dedicated particle detector for the inspection of harbor containers to hinder the smuggling of nuclear materials. Clustering techniques, working directly on scattering points, help to detect the presence of suspicious items inside the container, acting, as will be shown, as a filter for a preliminary analysis of the data.

  1. Profiling nurses' job satisfaction, acculturation, work environment, stress, cultural values and coping abilities: A cluster analysis. (United States)

    Goh, Yong-Shian; Lee, Alice; Chan, Sally Wai-Chi; Chan, Moon Fai


    This study aimed to determine whether definable profiles existed in a cohort of nursing staff with regard to demographic characteristics, job satisfaction, acculturation, work environment, stress, cultural values and coping abilities. A survey was conducted in one hospital in Singapore from June to July 2012, and 814 full-time staff nurses completed a self-report questionnaire (89% response rate). Demographic characteristics, job satisfaction, acculturation, work environment, perceived stress, cultural values, ways of coping and intention to leave current workplace were assessed as outcomes. The two-step cluster analysis revealed three clusters. Nurses in cluster 1 (n = 222) had lower acculturation scores than nurses in cluster 3. Cluster 2 (n = 362) was a group of younger nurses who reported higher intention to leave (22.4%), stress level and job dissatisfaction than the other two clusters. Nurses in cluster 3 (n = 230) were mostly Singaporean and reported the lowest intention to leave (13.0%). Resources should be allocated to specifically address the needs of younger nurses and hopefully retain them in the profession. Management should focus their retention strategies on junior nurses and provide a work environment that helps to strengthen their intention to remain in nursing by increasing their job satisfaction. © 2014 Wiley Publishing Asia Pty Ltd.

  2. Fatigue Feature Extraction Analysis based on a K-Means Clustering Approach

    Directory of Open Access Journals (Sweden)

    M.F.M. Yunoh


    This paper focuses on clustering analysis using a K-means approach for fatigue feature dataset extraction. The aim of this study is to group the dataset as closely as possible (homogeneity) for the scattered dataset. Kurtosis, the wavelet-based energy coefficient and fatigue damage are calculated for all segments after the extraction process using wavelet transform, and are used as input data for the K-means clustering approach. K-means clustering calculates the average distance of each group from the centroid and gives the objective function values. Based on the results, the maximum value of the objective function, 11.58, occurs with two centroid clusters. The minimum objective function value, 8.06, is found for five centroid clusters. Because the objective function is lowest when the number of clusters equals five, five clusters is taken as the best clustering for the dataset.
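
    The idea of comparing objective function values across different numbers of clusters can be sketched as follows. This is only an illustration under stated assumptions: the points are a hypothetical toy dataset (not the fatigue features), centroids are initialized deterministically from the first k points, and the objective is taken to be the sum of squared distances to the assigned centroid, which may differ from the exact definition used in the paper.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def assign(points, centroids):
    """Group points by their nearest centroid."""
    groups = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
        groups[nearest].append(p)
    return groups

def kmeans(points, k, max_iter=100):
    """Plain k-means; initial centroids are the first k points (an assumption)."""
    centroids = list(points[:k])
    for _ in range(max_iter):
        groups = assign(points, centroids)
        updated = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centroids[j]
            for j, g in enumerate(groups)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids, assign(points, centroids)

def objective(centroids, groups):
    """Sum of squared distances from each point to its assigned centroid."""
    return sum(sq_dist(p, c) for c, g in zip(centroids, groups) for p in g)

# Hypothetical toy data: two tight pairs of points.
points = [(0.0, 0.0), (10.0, 0.0), (0.0, 2.0), (10.0, 2.0)]

# Objective value for k = 1, 2, 3, 4 clusters.
objective_by_k = [objective(*kmeans(points, k)) for k in range(1, 5)]
# For this toy data: [104.0, 4.0, 2.0, 0.0]
```

    Note that the objective always reaches zero when every point is its own cluster, so in practice the lowest value alone is usually weighed against the number of clusters.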

  3. Cluster analysis for the probability of DSB site induced by electron tracks

    Energy Technology Data Exchange (ETDEWEB)

    Yoshii, Y. [Biological Research, Education and Instrumentation Center, Sapporo Medical University, Sapporo 060-8556 (Japan); Graduate School of Health Sciences, Hokkaido University, Sapporo 060-0812 (Japan)]; Sasaki, K. [Faculty of Health Sciences, Hokkaido University of Science, Sapporo 006-8585 (Japan)]; Matsuya, Y. [Graduate School of Health Sciences, Hokkaido University, Sapporo 060-0812 (Japan)]; Date, H. [Faculty of Health Sciences, Hokkaido University, Sapporo 060-0812 (Japan)]


    To clarify the influence of bio-cells exposed to ionizing radiations, the densely populated pattern of the ionization in the cell nucleus is of importance because it governs the extent of DNA damage which may lead to cell lethality. In this study, we have conducted a cluster analysis of ionization and excitation events to estimate the number of double-strand breaks (DSBs) induced by electron tracks. A Monte Carlo simulation for electrons in liquid water was performed to determine the spatial location of the ionization and excitation events. The events were divided into clusters by using the density-based spatial clustering of applications with noise (DBSCAN) algorithm. The algorithm enables us to sort out the events into the groups (clusters) in which a minimum number of neighboring events are contained within a given radius. For evaluating the number of DSBs in the extracted clusters, we have introduced an aggregation index (AI). The computational results show that a sub-keV electron produces DSBs in a dense formation more effectively than higher energy electrons. The root-mean square radius (RMSR) of the cluster size is below 5 nm, which is smaller than the chromatin fiber thickness. It was found that this size of clustering events has a high possibility to cause lesions in DNA within the chromatin fiber site.
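
    The DBSCAN grouping rule mentioned above (a minimum number of neighboring events within a given radius) can be sketched in a few lines. This is a generic, minimal DBSCAN written for illustration, not the authors' code; the event coordinates, the radius eps and the threshold min_pts are hypothetical values.

```python
import math

NOISE = -1

def region_query(points, i, eps):
    """Indices of all points within eps of point i (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, NOISE for outliers."""
    labels = [None] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE          # may later be relabeled as a border point
            continue
        labels[i] = cluster            # i is a core point: start a new cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster    # border point: joins but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is also a core point: keep growing
        cluster += 1
    return labels

# Hypothetical event positions (arbitrary length units).
events = [
    (0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0),   # dense group
    (20, 20, 20), (21, 20, 20), (20, 21, 20),     # second dense group
    (50, 50, 50),                                 # isolated event
]
labels = dbscan(events, eps=2.0, min_pts=3)
# labels == [0, 0, 0, 0, 1, 1, 1, -1]
```

    The isolated event is labeled as noise because fewer than min_pts events (counting itself) lie within eps of it.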


    Directory of Open Access Journals (Sweden)

    Sipos-Gug Sebastian


    Entrepreneurship is an active field of research, having known a major increase in interest and publication levels in the last years (Landström et al., 2012). Within this field there has recently been an increasing interest in understanding why some regions seem to have a significantly higher entrepreneurship activity compared to others. In line with this research field, we would like to investigate the differences in entrepreneurial activity among the Romanian counties (NUTS 3 regions). While the classical research paradigm in this field is to conduct a temporally stationary analysis, we choose to use a time series clustering analysis to better understand the dynamics of entrepreneurial activity between counties. Our analysis showed that if we use the total number of new privately owned companies that are founded each year in the last decade (2002-2012) we can distinguish between 5 clusters, one with high total entrepreneurial activity (18 counties), one with above average activity (8 counties), two clusters with average and slightly below average activity (total of 18 counties) and one cluster with low and declining activity (2 counties). If we are interested in the entrepreneurial activity rate, that is the number of new privately owned companies founded each year adjusted by the population of the respective county, we obtain 4 clusters, one with a very high entrepreneurial rate (1 county), one with average rate (10 counties), and two clusters with below average entrepreneurial rate (total of 31 counties). In conclusion, our research shows that Romania is far from being a homogeneous geographical area in respect to entrepreneurial activity. Depending on what we are interested in, it can be divided in 5 or 4 clusters of counties, which behave differently as a function of time. Further research should be focused on explaining these regional differences, on studying the high performance clusters and trying to improve the low performing ones.

  5. Selenium enrichment on Cordyceps militaris link and analysis on its main active components. (United States)

    Dong, Jing Z; Lei, C; Ai, Xun R; Wang, Y


    To investigate the effects of selenium on the main active components of Cordyceps militaris fruit bodies, selenium-enriched cultivation of C. militaris and the main active components of the fruit bodies were studied. Superoxide dismutase (SOD) activity and contents of cordycepin, cordycepic acid, and organic selenium of fruit bodies were sodium selenite concentration dependent; contents of adenosine and cordycep polysaccharides were significantly enhanced by adding sodium selenite in the substrates, but not proportional to sodium selenite concentrations. In the cultivation of wheat substrate added with 18.0 ppm sodium selenite, SOD activity and contents of cordycepin, cordycepic acid, adenosine, cordycep polysaccharides, and total amino acids were enhanced by 121/145%, 124/74%, 325/520%, 130/284%, 121/145%, and 157/554%, respectively, compared to NS (non-selenium-cultivated) fruit bodies and wild Cordyceps sinensis; organic selenium contents of fruit bodies reached 6.49 mg/100 g. So selenium-enriched cultivation may be a potential way to produce more valuable medicinal food as a substitute for wild C. sinensis.

  6. Preliminary analysis of 500 MWt MHD power plant with oxygen enrichment (United States)


    An MHD Engineering Test Facility design concept is analyzed. A 500 MWt oxygen enriched MHD topping cycle integrated for combined cycle operation with a 400 MWe steam plant is evaluated. The MHD cycle uses Montana Rosebud coal and air enriched to 35 mole percent oxygen preheated to 1100 F. The steam plant is a 2535 psia/1000 F/1000 F reheat recycle that was scaled down from the Gilbert/Commonwealth Reference Fossil Plant design series. Integration is accomplished by blending the steam generated in the MHD heat recovery system with steam generated by the partial firing of the steam plant boiler to provide the total flow requirement of the turbine. The major MHD and steam plant auxiliaries are driven by steam turbines. When the MHD cycle is taken out of service, the steam plant is capable of stand-alone operation at turbine design throttle flow. This operation requires the full firing of the steam plant boiler. A preliminary feasibility assessment is given, and results on the system thermodynamics, construction scheduling, and capital costs are presented.

  7. Genomic study and Medical Subject Headings enrichment analysis of early pregnancy rate and antral follicle numbers in Nelore heifers

    DEFF Research Database (Denmark)

    Oliveira Junior, G. A.; Perez, B. C.; Cole, J. B.


    ... gains. In this study, we performed a genome-wide association study (GWAS) to identify genetic variants associated with reproductive traits in Nelore beef cattle. Heifer pregnancy (HP) was recorded for 1,267 genotyped animals distributed in 12 contemporary groups (CG) with an average pregnancy rate of 0... be considered in a functional enrichment analysis to identify biological mechanisms involved in fertility. Medical Subject Headings (MeSH) were detected using the MESHR package, allowing the extraction of broad meanings from the gene lists provided by the GWAS. The estimated heritability for HP was 0.28 +/- 0...

  8. Ecosystem health pattern analysis of urban clusters based on emergy synthesis: Results and implication for management

    International Nuclear Information System (INIS)

    Su, Meirong; Fath, Brian D.; Yang, Zhifeng; Chen, Bin; Liu, Gengyuan


    The evaluation of ecosystem health in urban clusters will help establish effective management that promotes sustainable regional development. To standardize the application of emergy synthesis and set pair analysis (EM–SPA) in ecosystem health assessment, a procedure for using EM–SPA models was established in this paper by combining the ability of emergy synthesis to reflect health status from a biophysical perspective with the ability of set pair analysis to describe extensive relationships among different variables. Based on the EM–SPA model, the relative health levels of selected urban clusters and their related ecosystem health patterns were characterized. The health states of three typical Chinese urban clusters – Jing-Jin-Tang, Yangtze River Delta, and Pearl River Delta – were investigated using the model. The results showed that the health status of the Pearl River Delta was relatively good; the health for the Yangtze River Delta was poor. As for the specific health characteristics, the Pearl River Delta and Yangtze River Delta urban clusters were relatively strong in Vigor, Resilience, and Urban ecosystem service function maintenance, while the Jing-Jin-Tang was relatively strong in organizational structure and environmental impact. Guidelines for managing these different urban clusters were put forward based on the analysis of the results of this study. - Highlights: • The use of integrated emergy synthesis and set pair analysis model was standardized. • The integrated model was applied on the scale of an urban cluster. • Health patterns of different urban clusters were compared. • Policy suggestions were provided based on the health pattern analysis

  9. Concomitant formation of different nature clusters and hardening in reactor pressure vessel steels irradiated by heavy ions

    International Nuclear Information System (INIS)

    Fujii, K.; Fukuya, K.; Hojo, T.


    Specimens of A533B steels containing 0.04, 0.09 and 0.21 wt%Cu were irradiated at 290 °C to 3 dpa with 3 MeV Fe ions and subjected to atom probe analyses, transmission electron microscopy observations and hardness measurements. The atom probe analysis results showed that two types of solute clusters were formed: Cu-enriched clusters containing Mn, Ni and Si atoms as irradiation-enhanced solute atom clusters and Mn/Ni/Si-enriched clusters as irradiation-induced solute atom clusters. Both cluster types occurred in the highest Cu-content steel and the ratio of Mn/Ni/Si-enriched clusters to Cu-enriched clusters increased with irradiation doses. It was confirmed that the cluster formation was a key factor in the microstructure evolution until the high dose irradiation was reached even in the low Cu content steels though the dislocation loops with much lower density than that of the clusters were observed as matrix damage. The difference in the hardening efficiency due to the difference in the nature of the clusters was small. The irradiation-induced clustering of undersized Si atoms suggested that a clustering driving force other than vacancy-driven diffusion, probably an interstitial mechanism, may become important at higher dose rates

  10. Concomitant formation of different nature clusters and hardening in reactor pressure vessel steels irradiated by heavy ions

    Energy Technology Data Exchange (ETDEWEB)

    Fujii, K. [Institute of Nuclear Safety System, Inc., Mihama 919-1205 (Japan)]; Fukuya, K. [Institute of Nuclear Safety System, Inc., Mihama 919-1205 (Japan)]; Hojo, T. [Japan Nuclear Energy Safety Organization, Toranomon, Minato-ku, Tokyo 105-0001 (Japan)]


    Specimens of A533B steels containing 0.04, 0.09 and 0.21 wt%Cu were irradiated at 290 °C to 3 dpa with 3 MeV Fe ions and subjected to atom probe analyses, transmission electron microscopy observations and hardness measurements. The atom probe analysis results showed that two types of solute clusters were formed: Cu-enriched clusters containing Mn, Ni and Si atoms as irradiation-enhanced solute atom clusters and Mn/Ni/Si-enriched clusters as irradiation-induced solute atom clusters. Both cluster types occurred in the highest Cu-content steel and the ratio of Mn/Ni/Si-enriched clusters to Cu-enriched clusters increased with irradiation doses. It was confirmed that the cluster formation was a key factor in the microstructure evolution until the high dose irradiation was reached even in the low Cu content steels though the dislocation loops with much lower density than that of the clusters were observed as matrix damage. The difference in the hardening efficiency due to the difference in the nature of the clusters was small. The irradiation-induced clustering of undersized Si atoms suggested that a clustering driving force other than vacancy-driven diffusion, probably an interstitial mechanism, may become important at higher dose rates.

  11. Differentiating Procrastinators from Each Other: A Cluster Analysis. (United States)

    Rozental, Alexander; Forsell, Erik; Svensson, Andreas; Forsström, David; Andersson, Gerhard; Carlbring, Per


    Procrastination refers to the tendency to postpone the initiation and completion of a given course of action. Approximately one-fifth of the adult population and half of the student population perceive themselves as being severe and chronic procrastinators. Albeit not a psychiatric diagnosis, procrastination has been shown to be associated with increased stress and anxiety, exacerbation of illness, and poorer performance in school and work. However, despite being severely debilitating, little is known about the population of procrastinators in terms of possible subgroups, and previous research has mainly investigated procrastination among university students. The current study examined data from a screening process recruiting participants to a randomized controlled trial of Internet-based cognitive behavior therapy for procrastination (Rozental et al., in press). In total, 710 treatment-seeking individuals completed self-report measures of procrastination, depression, anxiety, and quality of life. The results suggest that there might exist five separate subgroups, or clusters, of procrastinators: "Mild procrastinators" (24.93%), "Average procrastinators" (27.89%), "Well-adjusted procrastinators" (13.94%), "Severe procrastinators" (21.69%), and "Primarily depressed" (11.55%). Hence, there seem to be marked differences among procrastinators in terms of levels of severity, as well as a possible subgroup for which procrastinatory problems are primarily related to depression. Tailoring the treatment interventions to the specific procrastination profile of the individual could thus become important, as well as screening for comorbid psychiatric diagnoses in order to target difficulties associated with, for instance, depression.

  12. On the blind use of statistical tools in the analysis of globular cluster stars (United States)

    D'Antona, Francesca; Caloi, Vittoria; Tailo, Marco


    As with most data analysis methods, the Bayesian method must be handled with care. We show that its application to determine stellar evolution parameters within globular clusters can lead to paradoxical results if used without the necessary precautions. This is a cautionary tale on the use of statistical tools for big data analysis.

  13. Standardized Effect Size Measures for Mediation Analysis in Cluster-Randomized Trials (United States)

    Stapleton, Laura M.; Pituch, Keenan A.; Dion, Eric


    This article presents 3 standardized effect size measures to use when sharing results of an analysis of mediation of treatment effects for cluster-randomized trials. The authors discuss 3 examples of mediation analysis (upper-level mediation, cross-level mediation, and cross-level mediation with a contextual effect) with demonstration of the…

  14. Cluster Analysis of Flow Cytometric List Mode Data on a Personal Computer

    NARCIS (Netherlands)

    Bakker Schut, Tom C.; Bakker schut, T.C.; de Grooth, B.G.; Greve, Jan


    A cluster analysis algorithm, dedicated to analysis of flow cytometric data is described. The algorithm is written in Pascal and implemented on an MS-DOS personal computer. It uses k-means, initialized with a large number of seed points, followed by a modified nearest neighbor technique to reduce

  15. Identification of Counterfeit Alcoholic Beverages Using Cluster Analysis in Principal-Component Space (United States)

    Khodasevich, M. A.; Sinitsyn, G. V.; Gres'ko, M. A.; Dolya, V. M.; Rogovaya, M. V.; Kazberuk, A. V.


    A study of 153 brands of commercial vodka products showed that counterfeit samples could be identified by introducing a unified additive at the minimum concentration acceptable for instrumental detection and multivariate analysis of UV-Vis transmission spectra. Counterfeit products were detected with 100% probability by using hierarchical cluster analysis or the C-means method in two-dimensional principal-component space.

  16. Functional Interference Clusters in Cancer Patients With Bone Metastases: A Secondary Analysis of RTOG 9714

    International Nuclear Information System (INIS)

    Chow, Edward; James, Jennifer; Barsevick, Andrea; Hartsell, William; Ratcliffe, Sarah; Scarantino, Charles; Ivker, Robert; Roach, Mack; Suh, John; Petersen, Ivy; Konski, Andre; Demas, William; Bruner, Deborah


    Purpose: To explore the relationships (clusters) among the functional interference items in the Brief Pain Inventory (BPI) in patients with bone metastases. Methods: Patients enrolled in the Radiation Therapy Oncology Group (RTOG) 9714 bone metastases study were eligible. Patients were assessed at baseline and 4, 8, and 12 weeks after randomization for the palliative radiotherapy with the BPI, which consists of seven functional items: general activity, mood, walking ability, normal work, relations with others, sleep, and enjoyment of life. Principal component analysis with varimax rotation was used to determine the clusters between the functional items at baseline and the follow-up. Cronbach's alpha was used to determine the consistency and reliability of each cluster at baseline and follow-up. Results: There were 448 male and 461 female patients, with a median age of 67 years. There were two functional interference clusters at baseline, which accounted for 71% of the total variance. The first cluster (physical interference) included normal work and walking ability, which accounted for 58% of the total variance. The second cluster (psychosocial interference) included relations with others and sleep, which accounted for 13% of the total variance. The Cronbach's alpha statistics were 0.83 and 0.80, respectively. The functional clusters changed at week 12 in responders but persisted through week 12 in nonresponders. Conclusion: Palliative radiotherapy is effective in reducing bone pain. Functional interference component clusters exist in patients treated for bone metastases. These clusters changed over time in this study, possibly attributable to treatment. Further research is needed to examine these effects.
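
    Cronbach's alpha, used above to check the internal consistency of each cluster of items, can be computed from item and total-score variances with the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The sketch below uses hypothetical scores, not RTOG 9714 data, and the function name cronbach_alpha is an assumption made for illustration.

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's alpha for a list of items.

    items: one list of scores per item, aligned so that position r in
    every list belongs to the same respondent.
    """
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # total score per respondent
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))

# Hypothetical interference ratings from five respondents on two items.
walking_ability = [2, 4, 6, 8, 5]
normal_work = [3, 5, 6, 9, 5]
alpha = cronbach_alpha([walking_ability, normal_work])
```

    Two items that always move together give an alpha of 1, while two uncorrelated items give an alpha near 0.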

  17. Analysis of precipitation data in Bangladesh through hierarchical clustering and multidimensional scaling (United States)

    Rahman, Md. Habibur; Matin, M. A.; Salma, Umma


    The precipitation patterns of seventeen locations in Bangladesh from 1961 to 2014 were studied using a cluster analysis and metric multidimensional scaling. In doing so, the current research applies four major hierarchical clustering methods to precipitation in conjunction with different dissimilarity measures and metric multidimensional scaling. A variety of clustering algorithms were used to provide multiple clustering dendrograms for a mixture of distance measures. The dendrogram of pre-monsoon rainfall for the seventeen locations formed five clusters. The pre-monsoon precipitation data for the areas of Srimangal and Sylhet were located in two clusters across the combination of five dissimilarity measures and four hierarchical clustering algorithms. The single linkage algorithm with Euclidean and Manhattan distances, the average linkage algorithm with the Minkowski distance, and Ward's linkage algorithm provided similar results with regard to monsoon precipitation. The results of the post-monsoon and winter precipitation data are shown in different types of dendrograms with disparate combinations of sub-clusters. The schematic geometrical representations of the precipitation data using metric multidimensional scaling showed that the post-monsoon rainfall of Cox's Bazar was located far from those of the other locations. The results of a box-and-whisker plot, different clustering techniques, and metric multidimensional scaling indicated that the precipitation behaviour of Srimangal and Sylhet during the pre-monsoon season, Cox's Bazar and Sylhet during the monsoon season, Maijdi Court and Cox's Bazar during the post-monsoon season, and Cox's Bazar and Khulna during the winter differed from those at other locations in Bangladesh.

  18. Comparison of population-averaged and cluster-specific models for the analysis of cluster randomized trials with missing binary outcomes: a simulation study

    Directory of Open Access Journals (Sweden)

    Ma Jinhui


    Background: The objective of this simulation study is to compare the accuracy and efficiency of population-averaged (i.e. generalized estimating equations (GEE)) and cluster-specific (i.e. random-effects logistic regression (RELR)) models for analyzing data from cluster randomized trials (CRTs) with missing binary responses. Methods: In this simulation study, clustered responses were generated from a beta-binomial distribution. The number of clusters per trial arm, the number of subjects per cluster, intra-cluster correlation coefficient, and the percentage of missing data were allowed to vary. Under the assumption of covariate dependent missingness, missing outcomes were handled by complete case analysis, standard multiple imputation (MI) and within-cluster MI strategies. Data were analyzed using GEE and RELR. Performance of the methods was assessed using standardized bias, empirical standard error, root mean squared error (RMSE), and coverage probability. Results: GEE performs well on all four measures, provided the downward bias of the standard error (when the number of clusters per arm is small) is adjusted appropriately, under the following scenarios: complete case analysis for CRTs with a small amount of missing data; standard MI for CRTs with variance inflation factor (VIF 50. RELR performs well only when a small amount of data was missing, and complete case analysis was applied. Conclusion: GEE performs well as long as appropriate missing data strategies are adopted based on the design of CRTs and the percentage of missing data. In contrast, RELR does not perform well when either standard or within-cluster MI strategy is applied prior to the analysis.
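
    The data-generating step described in the Methods can be sketched as follows, assuming the usual beta-binomial parameterisation in which each cluster draws its own success probability from Beta(a, b) and the intra-cluster correlation equals 1/(a + b + 1). The function names, parameter values, and seed are illustrative assumptions; the missingness mechanism and the GEE/RELR fits are not shown.

```python
import random


def beta_params(prevalence, icc):
    """Beta(a, b) shape parameters giving mean `prevalence` and
    intra-cluster correlation `icc` (0 < icc < 1)."""
    scale = (1 - icc) / icc  # equals a + b
    return prevalence * scale, (1 - prevalence) * scale


def simulate_arm(n_clusters, cluster_size, prevalence, icc, rng):
    """Return one list of 0/1 outcomes per cluster for one trial arm."""
    a, b = beta_params(prevalence, icc)
    arm = []
    for _ in range(n_clusters):
        p = rng.betavariate(a, b)  # cluster-specific event probability
        arm.append([1 if rng.random() < p else 0
                    for _ in range(cluster_size)])
    return arm


rng = random.Random(2024)  # fixed seed so the sketch is repeatable
arm = simulate_arm(n_clusters=10, cluster_size=20,
                   prevalence=0.3, icc=0.05, rng=rng)
print([sum(cluster) for cluster in arm])  # events per cluster
```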

  19. Sejong Open Cluster Survey (SOS). 0. Target Selection and Data Analysis (United States)

    Sung, Hwankyung; Lim, Beomdu; Bessell, Michael S.; Kim, Jinyoung S.; Hur, Hyeonoh; Chun, Moo-Young; Park, Byeong-Gon


    Star clusters are superb astrophysical laboratories containing cospatial and coeval samples of stars with similar chemical composition. We initiate the Sejong Open cluster Survey (SOS) - a project dedicated to providing homogeneous photometry of a large number of open clusters in the SAAO Johnson-Cousins' UBVI system. To achieve our main goal, we pay much attention to the observation of standard stars in order to reproduce the SAAO standard system. Many of our targets are relatively small sparse clusters that escaped previous observations. As clusters are considered building blocks of the Galactic disk, their physical properties such as the initial mass function, the pattern of mass segregation, etc. give valuable information on the formation and evolution of the Galactic disk. The spatial distribution of young open clusters will be used to revise the local spiral arm structure of the Galaxy. In addition, the homogeneous data can also be used to test stellar evolutionary theory, especially concerning rare massive stars. In this paper we present the target selection criteria, the observational strategy for accurate photometry, and the adopted calibrations for data analysis such as color-color relations, zero-age main sequence relations, Sp - M_V relations, Sp - T_{eff} relations, Sp - color relations, and T_{eff} - BC relations. Finally we provide some data analysis such as the determination of the reddening law, the membership selection criteria, and distance determination.

  20. Approximate fuzzy C-means (AFCM) cluster analysis of medical magnetic resonance image (MRI) data

    International Nuclear Information System (INIS)

    DelaPaz, R.L.; Chang, P.J.; Bernstein, R.; Dave, J.V.


    The authors describe the application of an approximate fuzzy C-means (AFCM) clustering algorithm as a data dimension reduction approach to medical magnetic resonance images (MRI). Image data consisted of one T1-weighted, two T2-weighted, and one T2*-weighted (magnetic susceptibility) image for each cranial study and a matrix of 10 images generated from 10 combinations of TE and TR for each body lymphoma study. All images were obtained with a 1.5 Tesla imaging system (GE Signa). Analyses were performed on over 100 MR image sets with a variety of pathologies. The cluster analysis was operated in an unsupervised mode and computational overhead was minimized by utilizing a table look-up approach without adversely affecting accuracy. Image data were first segmented into 2 coarse clusters, each of which was then subdivided into 16 fine clusters. The final tissue classifications were presented as color-coded anatomically-mapped images and as two and three dimensional displays of cluster center data in selected feature space (minimum spanning tree). Fuzzy cluster analysis appears to be a clinically useful dimension reduction technique which results in improved diagnostic specificity of medical magnetic resonance images
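
    For orientation, the sketch below implements the standard fuzzy C-means updates on one-dimensional intensities: memberships u_ik = 1 / sum_j (d_ik / d_jk)^(2/(m-1)) and centers as u^m-weighted means. This is plain FCM, not the approximate table look-up variant (AFCM) the abstract describes; the function names, intensities, and starting centers are hypothetical.

```python
def memberships(values, centers, m):
    """Fuzzy membership of each value in each cluster (rows sum to 1)."""
    power = 2.0 / (m - 1.0)
    rows = []
    for x in values:
        dists = [abs(x - c) for c in centers]
        if min(dists) == 0.0:
            # Value sits exactly on a center: assign it fully there.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            rows.append([h / sum(hits) for h in hits])
        else:
            rows.append([1.0 / sum((di / dj) ** power for dj in dists)
                         for di in dists])
    return rows


def fuzzy_c_means(values, centers, m=2.0, iters=50):
    """Alternate membership and center updates for a fixed number of steps."""
    centers = list(centers)
    for _ in range(iters):
        u = memberships(values, centers, m)
        for i in range(len(centers)):
            w = [row[i] ** m for row in u]
            centers[i] = sum(wk * x for wk, x in zip(w, values)) / sum(w)
    return centers, memberships(values, centers, m)


# Hypothetical pixel intensities from two tissue types.
intensities = [10, 12, 11, 50, 52, 51]
centers, u = fuzzy_c_means(intensities, [0.0, 100.0])
print([round(c, 2) for c in centers])
```

    Unlike hard clustering, each pixel keeps a graded membership in every cluster, which is what allows the coarse-then-fine subdivision described in the abstract to work on ambiguous voxels.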

  1. Fault detection of flywheel system based on clustering and principal component analysis

    Directory of Open Access Journals (Sweden)

    Wang Rixin


    Considering the nonlinear, multifunctional properties of a double flywheel with closed-loop control, a two-step method combining clustering and principal component analysis is proposed to detect the two faults in the multifunctional flywheels. In the first step of the proposed algorithm, clustering is used as feature recognition to check the instructions of the “integrated power and attitude control” system, such as attitude control, energy storage or energy discharge. These commands ask the flywheel system to work in different operation modes. Therefore, the relationship of parameters in different operations can define the cluster structure of the training data. Ordering points to identify the clustering structure (OPTICS) can automatically identify these clusters from the reachability plot. The K-means algorithm can divide the training data into the corresponding operations according to the reachability plot. Finally, the last step of the proposed model defines the relationship of parameters in each operation through the principal component analysis (PCA) method. Compared with the PCA model, the proposed approach is capable of identifying new clusters and learning the new behavior of incoming data. The simulation results show that it can effectively detect the faults in the multifunctional flywheel system.

  2. A Model-Based Cluster Analysis of Maternal Emotion Regulation and Relations to Parenting Behavior. (United States)

    Shaffer, Anne; Whitehead, Monica; Davis, Molly; Morelen, Diana; Suveg, Cynthia


    In a diverse community sample of mothers (N = 108) and their preschool-aged children (M age  = 3.50 years), this study conducted person-oriented analyses of maternal emotion regulation (ER) based on a multimethod assessment incorporating physiological, observational, and self-report indicators. A model-based cluster analysis was applied to five indicators of maternal ER: maternal self-report, observed negative affect in a parent-child interaction, baseline respiratory sinus arrhythmia (RSA), and RSA suppression across two laboratory tasks. Model-based cluster analyses revealed four maternal ER profiles, including a group of mothers with average ER functioning, characterized by socioeconomic advantage and more positive parenting behavior. A dysregulated cluster demonstrated the greatest challenges with parenting and dyadic interactions. Two clusters of intermediate dysregulation were also identified. Implications for assessment and applications to parenting interventions are discussed. © 2017 Family Process Institute.

  3. Grey Wolf Optimizer Based on Powell Local Optimization Method for Clustering Analysis

    Directory of Open Access Journals (Sweden)

    Sen Zhang


    One recently proposed heuristic evolutionary algorithm is the grey wolf optimizer (GWO), inspired by the leadership hierarchy and hunting mechanism of grey wolves in nature. This paper presents an extended GWO algorithm based on the Powell local optimization method, which we call PGWO. The PGWO algorithm significantly improves on the original GWO in solving complex optimization problems. Clustering is a popular data analysis and data mining technique. Hence, PGWO could be applied to solving clustering problems. In this study, the PGWO algorithm is first tested on seven benchmark functions. Second, it is used for data clustering on nine data sets. Compared to other state-of-the-art evolutionary algorithms, the results of benchmark and data clustering demonstrate the superior performance of the PGWO algorithm.


    Energy Technology Data Exchange (ETDEWEB)

    Colucci, Janet E.; Bernstein, Rebecca A.; McWilliam, Andrew [The Observatories of the Carnegie Institution for Science, 813 Santa Barbara St., Pasadena, CA 91101 (United States)


    We present abundances of globular clusters (GCs) in the Milky Way and Fornax from integrated-light (IL) spectra. Our goal is to evaluate the consistency of the IL analysis relative to standard abundance analysis for individual stars in those same clusters. This sample includes an updated analysis of seven clusters from our previous publications and results for five new clusters that expand the metallicity range over which our technique has been tested. We find that the [Fe/H] measured from IL spectra agrees to ∼0.1 dex for GCs with metallicities as high as [Fe/H] = −0.3, but the abundances measured for more metal-rich clusters may be underestimated. In addition we systematically evaluate the accuracy of abundance ratios, [X/Fe], for Na I, Mg I, Al I, Si I, Ca I, Ti I, Ti II, Sc II, V I, Cr I, Mn I, Co I, Ni I, Cu I, Y II, Zr I, Ba II, La II, Nd II, and Eu II. The elements for which the IL analysis gives results that are most similar to analysis of individual stellar spectra are Fe I, Ca I, Si I, Ni I, and Ba II. The elements that show the greatest differences include Mg I and Zr I. Some elements show good agreement only over a limited range in metallicity. More stellar abundance data in these clusters would enable more complete evaluation of the IL results for other important elements.

  5. Improving estimation of kinetic parameters in dynamic force spectroscopy using cluster analysis (United States)

    Yen, Chi-Fu; Sivasankar, Sanjeevi


    Dynamic Force Spectroscopy (DFS) is a widely used technique to characterize the dissociation kinetics and interaction energy landscape of receptor-ligand complexes with single-molecule resolution. In an Atomic Force Microscope (AFM)-based DFS experiment, receptor-ligand complexes, sandwiched between an AFM tip and substrate, are ruptured at different stress rates by varying the speed at which the AFM tip and substrate are pulled away from each other. The rupture events are grouped according to their pulling speeds, and the mean force and loading rate of each group are calculated. These data are subsequently fit to established models, and energy landscape parameters such as the intrinsic off-rate (koff) and the width of the potential energy barrier (xβ) are extracted. However, due to large uncertainties in determining mean forces and loading rates of the groups, errors in the estimated koff and xβ can be substantial. Here, we demonstrate that the accuracy of fitted parameters in a DFS experiment can be dramatically improved by sorting rupture events into groups using cluster analysis instead of sorting them according to their pulling speeds. We test different clustering algorithms including Gaussian mixture, logistic regression, and K-means clustering, under conditions that closely mimic DFS experiments. Using Monte Carlo simulations, we benchmark the performance of these clustering algorithms over a wide range of koff and xβ, under different levels of thermal noise, and as a function of both the number of unbinding events and the number of pulling speeds. Our results demonstrate that cluster analysis, particularly K-means clustering, is very effective in improving the accuracy of parameter estimation, particularly when the unbinding events are limited in number and not well separated into distinct groups. Cluster analysis is easy to implement, and our performance benchmarks serve as a guide in choosing an appropriate method for DFS data analysis.
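
    The grouping step can be sketched with a small K-means routine that sorts rupture events by their measured values rather than by nominal pulling speed, then reports each group's mean loading rate and force. The event tuples (log10 loading rate, rupture force in pN), the starting centers, and the absence of feature scaling are illustrative assumptions, not the authors' settings.

```python
def kmeans(points, centers, iters=100):
    """Lloyd's algorithm with caller-supplied starting centers.

    Returns the final centers and the points assigned to each center.
    """
    centers = [tuple(c) for c in centers]
    groups = [[] for _ in centers]
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(
                range(len(centers)),
                key=lambda i: sum((a - b) ** 2
                                  for a, b in zip(p, centers[i])),
            )
            groups[nearest].append(p)
        # Recompute each center as the mean of its group; an empty
        # group keeps its previous center.
        new_centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers, groups


# Hypothetical rupture events: (log10 loading rate, rupture force in pN).
events = [(2.0, 40.0), (2.1, 42.0), (1.9, 38.0),
          (4.0, 80.0), (4.1, 82.0), (3.9, 78.0)]
centers, groups = kmeans(events, [(2.0, 40.0), (4.0, 80.0)])
for (rate, force), g in zip(centers, groups):
    print(f"n={len(g)} mean log10 rate={rate:.2f} mean force={force:.1f} pN")
```

    The per-group means are the quantities that would then be fitted to a kinetic model to estimate koff and xβ; that fit is not shown here.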

  6. Importance of Viral Sequence Length and Number of Variable and Informative Sites in Analysis of HIV Clustering. (United States)

    Novitsky, Vlad; Moyo, Sikhulile; Lei, Quanhong; DeGruttola, Victor; Essex, M


    To improve the methodology of HIV cluster analysis, we addressed how analysis of HIV clustering is associated with parameters that can affect the outcome of viral clustering. The extent of HIV clustering and tree certainty was compared between 401 HIV-1C near full-length genome sequences and subgenomic regions retrieved from the LANL HIV Database. Sliding window analysis was based on 99 windows of 1,000 bp and 45 windows of 2,000 bp. Potential associations between the extent of HIV clustering and sequence length and the number of variable and informative sites were evaluated. The near full-length genome HIV sequences showed the highest extent of HIV clustering and the highest tree certainty. At the bootstrap threshold of 0.80 in maximum likelihood (ML) analysis, 58.9% of near full-length HIV-1C sequences but only 15.5% of partial pol sequences (ViroSeq) were found in clusters. Among HIV-1 structural genes, pol showed the highest extent of clustering (38.9% at a bootstrap threshold of 0.80), although it was significantly lower than in the near full-length genome sequences. The extent of HIV clustering was significantly higher for sliding windows of 2,000 bp than 1,000 bp. We found a strong association between the sequence length and proportion of HIV sequences in clusters, and a moderate association between the number of variable and informative sites and the proportion of HIV sequences in clusters. In HIV cluster analysis, the extent of detectable HIV clustering is directly associated with the length of viral sequences used, as well as the number of variable and informative sites. Near full-length genome sequences could provide the most informative HIV cluster analysis. Selected subgenomic regions with a high extent of HIV clustering and high tree certainty could also be considered as a second choice.

  7. Isotope enrichment

    International Nuclear Information System (INIS)

    Garbuny, M.


    The invention discloses a method for deriving, from a starting material including an element having a plurality of isotopes, derived material enriched in one isotope of the element. The starting material is deposited on a substrate at less than a critical submonatomic surface density, typically less than 10^16 atoms per square centimeter. The deposit is then selectively irradiated by a laser (maser or electronic oscillator) beam with monochromatic coherent radiation resonant with the one isotope, causing the material including the one isotope to escape from the substrate. The escaping enriched material is then collected. Where the element has two isotopes, one of which is to be collected, the deposit may be irradiated with radiation resonant with the other isotope, and the residual material enriched in the one isotope may be evaporated from the substrate and collected.

  8. Application of clustering analysis in the prediction of photovoltaic power generation based on neural network (United States)

    Cheng, K.; Guo, L. M.; Wang, Y. K.; Zafar, M. T.


    To select effective samples from the large volume of PV power generation data accumulated over the years and improve the accuracy of the PV power generation forecasting model, this paper studies the application of clustering analysis in this field and establishes a forecasting model based on a neural network. Based on three weather types (sunny, cloudy and rainy days), the research screens samples of historical data by clustering analysis. After screening, it establishes BP neural network prediction models using the screened data as training data. It then compares the six photovoltaic power generation prediction models before and after the data screening. Results show that a prediction model combining clustering analysis with BP neural networks is an effective way to improve the precision of photovoltaic power generation forecasting.

  9. Parallelization and scheduling of data intensive particle physics analysis jobs on clusters of PCs

    CERN Document Server

    Ponce, S


    Summary form only given. Scheduling policies are proposed for parallelizing data intensive particle physics analysis applications on computer clusters. Particle physics analysis jobs require the analysis of tens of thousands of particle collision events, each event requiring typically 200ms processing time and 600KB of data. Many jobs are launched concurrently by a large number of physicists. At a first view, particle physics jobs seem to be easy to parallelize, since particle collision events can be processed independently one from another. However, since large amounts of data need to be accessed, the real challenge resides in making an efficient use of the underlying computing resources. We propose several job parallelization and scheduling policies aiming at reducing job processing times and at increasing the sustainable load of a cluster server. Since particle collision events are usually reused by several jobs, cache based job splitting strategies considerably increase cluster utilization and reduce job ...

  10. Starch and protein analysis of wheat bread enriched with phenolics-rich sprouted wheat flour. (United States)

    Świeca, Michał; Dziki, Dariusz; Gawlik-Dziki, Urszula


    Wheat flour in the bread formula was replaced with sprouted wheat flour (SF) characterized by enhanced nutraceutical properties, at 5%, 10%, 15% and 20% levels. The addition of SF slightly increased the total protein content; however, it decreased their digestibility. Some qualitative and quantitative changes in the electrophoretic pattern of proteins were also observed; especially, in the bands corresponding with 27kDa and 15-17kDa proteins. These results were also confirmed by SE-HPLC technique, where a significant increase in the content of proteins and peptides (molecular masses breads with 20% of SF. Bread enriched with sprouted wheat flour had more resistant starch, but less total starch, compared to control bread. The highest in vitro starch digestibility was determined for the control bread. The studied bread with lowered nutritional value but increased nutritional quality can be used for special groups of consumers (obese, diabetic). Copyright © 2017 Elsevier Ltd. All rights reserved.

  11. Mitochondrial capture enriches mito-DNA 100 fold, enabling PCR-free mitogenomics biodiversity analysis

    DEFF Research Database (Denmark)

    Liu, Shanlin; Wang, Xin; Xie, Lin


    Biodiversity analyses based on next-generation sequencing (NGS) platforms have developed by leaps and bounds in recent years. A PCR-free strategy, which can alleviate taxonomic bias, was considered as a promising approach to delivering reliable species compositions of targeted environments...... data is highly demanding on computing resources. Here, we present a mitogenome enrichment pipeline via a gene capture chip that was designed by virtue of the mitogenome sequences of the 1000 Insect Transcriptome Evolution project (1KITE, A mock sample containing 49 species was used...... in abundance. However, the frequencies of input taxa were largely maintained after capture (R2 = 0.81). We suggest that our mitogenome capture approach coupled with PCR-free shotgun sequencing could provide ecological researchers an efficient NGS method to deliver reliable biodiversity assessment....

  12. The Ford Nuclear Reactor demonstration project for the evaluation and analysis of low enrichment fuel

    International Nuclear Information System (INIS)

    Kerr, W.; King, J.S.; Lee, J.C.; Martin, W.R.; Wehe, D.K.


    The whole-core LEU fuel demonstration project at the University of Michigan was begun in 1979 as part of the Reduced Enrichment Research and Test Reactor (RERTR) Program at Argonne National Laboratory. An LEU fuel design was selected which would produce minimum perturbations in the neutronic, operations, and safety characteristics of the 2-MW Ford Nuclear Reactor (FNR). Initial criticality with a full LEU core on December 8, 1981, was followed by low- and full-power testing of the fresh LEU core, transitional operation with mixed HEU-LEU configurations, and establishment of full LEU equilibrium core operation. The transition from the HEU to the LEU configurations was achieved with negligible impact on experimental utilization and safe operation of the reactor. 78 refs., 74 figs., 84 tabs

  13. Neutronic analysis of a fuel element with variations in fuel enrichment and burnable poison

    Energy Technology Data Exchange (ETDEWEB)

    Faria, Rochkhudson B. de; Martins, Felipe; Velasquez, Carlos E.; Cardoso, Fabiano; Fortini, Angela; Pereira, Claubia [Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, MG (Brazil). Departamento de Engenharia Nuclear]


    In this work, the goal was to evaluate the neutronic behavior during fuel burnup while changing the amount of burnable poison and the fuel enrichment. For these analyses, a 17 x 17 PWR fuel element was used, simulated using the 238-group cross-section library collapsed from ENDF/B-VII.0 and the TRITON module of the SCALE 6.0 code system. The results confirmed the effective action of the burnable poison in the criticality control, especially at Beginning Of Cycle (BOC), and in the burnup kinetics, because at the end of the fuel cycle there was a minimal residual amount of neutron absorbers (¹⁵⁵Gd and ¹⁵⁷Gd), as expected. At the end of the cycle, the fuel element was still critical in all simulated situations, indicating the possibility of extending the fuel burn. (author)

  14. Calculation and analysis for a series of enriched uranium bare sphere critical assemblies

    International Nuclear Information System (INIS)

    Yang Shunhai


    The imported reactor fuel assembly program system MARIA was adapted to the CYBER 825 computer at the China Institute of Atomic Energy and used extensively for a series of enriched uranium bare sphere critical assemblies. The MARIA auxiliary resonance-modification program MA was designed to take account of the effects of resonance fission and absorption on the calculated results. With it, the multigroup constants in the library attached to the MARIA program are revised based on the U.S. Evaluated Nuclear Data File ENDF/B-IV, and the related nuclear data files are replaced. The reactor geometry buckling and multiplication factor are then given in the output tapes. The accuracy of the calculated results is comparable with that of the Monte Carlo and Sn methods, and the agreement with experimental results is within 1%. (5 refs., 4 figs., 3 tabs.)


    Directory of Open Access Journals (Sweden)

    Gregorius Satia Budhi


    The basketball world has grown rapidly over time, as shown by the many competitions and games held all over the world. As a result, there are many basketball players with different playing characteristics. A coach or scout is required to look for great players to build a solid team. This application helps a coach or scout with analysis and decision making. It uses the Self Organizing Maps (SOM) algorithm for cluster analysis. Real NBA player data is used for the competitive learning (training) process, and real player data from Indonesian or Petra Christian University basketball players is used for the testing process. The NBA player data is prepared through a cleaning process and then transformed into a form that can be processed by the SOM algorithm. After that, the data is clustered with the SOM algorithm. The resulting clusters are displayed in a form that is easy to view and analyze, and the result can be saved to a text file. Using the output of this application, namely the clusters of NBA players, the user can see the statistics of each cluster. With these cluster statistics, a coach or scout can predict the statistics and the position of a test player who falls in the same cluster. This information can support the coach or scout in making a decision. Abstract in Bahasa Indonesia (translated): The world of basketball has developed rapidly over time. This is marked by the emergence of many kinds of competitions and matches, both worldwide and domestic. As a result, more and more talented players are emerging with different playing characteristics. A coach or talent scout is required to look carefully at meeting the team's needs in order to form a solid team. This application helps the analysis and decision-making process for coaches and talent scouts. This application

  16. Enrichment of trace cadmium by soybean protein for the analysis by atomic absorption method

    International Nuclear Information System (INIS)

    Musha, Soichiro; Takahashi, Yoshihisa.


    A method for enriching ppb-level cadmium in water was investigated, using the coagulation of soybean protein on addition of acids and its complex-forming character with heavy metal ions. After adding fixed amounts of soybean milk and 2% sodium diethyldithiocarbamate (DDTC) aqueous solution and a suitable amount of delta-gluconic lactone (delta-GL) to a sample solution, the mixture was heated to boiling in order to coagulate the protein. The coagulum (soybean curd) was separated from the suspension by centrifugation and burned to ashes with a low temperature plasma asher. Then the cadmium enriched in it was determined by atomic absorption spectrometry. The effects on cadmium recovery of various factors, such as the pH of the sample solution, the amounts of soybean milk and the collection additives, and the concentration of NaCl in the sample solution, were examined systematically. The best recovery was obtained under the following conditions: to a certain amount of sample solution were added 30 ml of 6.34% soybean milk and 5 ml of 2% DDTC solution, and its pH was adjusted to 5.50--5.80 by adding a suitable amount of delta-GL (0.10 g/ml, (0.40--0.80) ml). NaCl in the sample solution tended to decrease the recovery, especially at a concentration of around 10% NaCl. Under the optimum conditions, the recovery of cadmium was about 98%. The proposed method was applied to the determination of cadmium at the ppb level in sample solutions such as water, 3% NaCl solution and artificial sea water. This method was also applied to the determination of cadmium in common and industrial salts. (auth.)

  17. Exploratory Cluster Analysis to Identify Patterns of Chronic Kidney Disease in the 500 Cities Project. (United States)

    Liu, Shelley H; Li, Yan; Liu, Bian


    Chronic kidney disease is a leading cause of death in the United States. We used cluster analysis to explore patterns of chronic kidney disease in 500 of the largest US cities. After adjusting for socio-demographic characteristics, we found that unhealthy behaviors, prevention measures, and health outcomes related to chronic kidney disease differ between cities in Utah and those in the rest of the United States. Cluster analysis can be useful for identifying geographic regions that may have important policy implications for preventing chronic kidney disease.

  18. Person mobility in the design and analysis of cluster-randomized cohort prevention trials. (United States)

    Vuchinich, Sam; Flay, Brian R; Aber, Lawrence; Bickman, Leonard


    Person mobility is an inescapable fact of life for most cluster-randomized (e.g., schools, hospitals, clinic, cities, state) cohort prevention trials. Mobility rates are an important substantive consideration in estimating the effects of an intervention. In cluster-randomized trials, mobility rates are often correlated with ethnicity, poverty and other variables associated with disparity. This raises the possibility that estimated intervention effects may generalize to only the least mobile segments of a population and, thus, create a threat to external validity. Such mobility can also create threats to the internal validity of conclusions from randomized trials. Researchers must decide how to deal with persons who leave study clusters during a trial (dropouts), persons and clusters that do not comply with an assigned intervention, and persons who enter clusters during a trial (late entrants), in addition to the persons who remain for the duration of a trial (stayers). Statistical techniques alone cannot solve the key issues of internal and external validity raised by the phenomenon of person mobility. This commentary presents a systematic, Campbellian-type analysis of person mobility in cluster-randomized cohort prevention trials. It describes four approaches for dealing with dropouts, late entrants and stayers with respect to data collection, analysis and generalizability. The questions at issue are: 1) From whom should data be collected at each wave of data collection? 2) Which cases should be included in the analyses of an intervention effect? and 3) To what populations can trial results be generalized? The conclusions lead to recommendations for the design and analysis of future cluster-randomized cohort prevention trials.

  19. Study on Adaptive Parameter Determination of Cluster Analysis in Urban Management Cases (United States)

    Fu, J. Y.; Jing, C. F.; Du, M. Y.; Fu, Y. L.; Dai, P. P.


    Fine-grained city management is an important way to realize the smart city. Data mining that applies spatial clustering analysis to urban management cases can be used to evaluate the deployment of urban public facilities, support policy decisions, and provide technical support for fine-grained city management. To address the problem that the density-based DBSCAN algorithm cannot determine its parameters adaptively, this paper proposes an optimized method of adaptive parameter determination based on spatial analysis. First, Ripley's K function is analyzed for the data set to adaptively determine the global parameter Eps, by setting the maximum aggregation scale as the range of data clustering. Then, using a K-D tree, the highest-frequency K value of every point object within the range of Eps is calculated and set as the clustering density, to adaptively determine the global parameter MinPts. The R language was used to optimize the above process and accomplish precise clustering of typical urban management cases. Experimental results based on typical urban management cases in the XiCheng district of Beijing show that the new DBSCAN clustering algorithm presented in this paper takes full account of the spatial and statistical characteristics of data with obvious clustering features, and has better applicability and high quality. The results of the study are helpful not only for the formulation of urban management policies and the allocation of urban management supervisors in XiCheng District of Beijing, but also for other cities and related fields.
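
    One way to read the Ripley's K step is sketched below: estimate K(r) over candidate radii, convert it to Besag's L(r) - r, and take the radius where that value peaks as the maximum aggregation scale, which would then serve as the clustering range (Eps). The estimator here has no edge correction, and the function names, point coordinates, study area, and candidate radii are all hypothetical; the authors' R implementation may differ.

```python
from math import hypot, pi, sqrt


def ripley_l_minus_r(points, r, area):
    """Besag's L(r) - r from a naive Ripley's K estimate.

    K(r) = area / (n * (n - 1)) * (ordered pairs closer than r).
    Positive values suggest clustering at scale r.
    """
    n = len(points)
    pairs = sum(
        1
        for i in range(n)
        for j in range(n)
        if i != j and hypot(points[i][0] - points[j][0],
                            points[i][1] - points[j][1]) <= r
    )
    k = area * pairs / (n * (n - 1))
    return sqrt(k / pi) - r


def aggregation_scale(points, radii, area):
    """Radius at which L(r) - r is largest among the candidates."""
    return max(radii, key=lambda r: ripley_l_minus_r(points, r, area))


# Hypothetical case locations: two tight clumps in a 10 x 10 study area.
cases = [(1.0, 1.0), (1.1, 1.0), (1.0, 1.1), (1.1, 1.1),
         (8.0, 8.0), (8.1, 8.0), (8.0, 8.1), (8.1, 8.1)]
eps = aggregation_scale(cases, radii=[0.2, 1.0, 5.0], area=100.0)
print("suggested Eps:", eps)
```

    The pairwise loop is quadratic in the number of points, so a spatial index such as the K-D tree mentioned in the abstract would be needed for realistic case volumes.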


    Directory of Open Access Journals (Sweden)

    J. Y. Fu


    Fine-grained city management is an important way to realize the smart city. Data mining that applies spatial clustering analysis to urban management cases can be used to evaluate the deployment of urban public facilities, support policy decisions, and provide technical support for fine-grained city management. To address the problem that the density-based DBSCAN algorithm cannot determine its parameters adaptively, this paper proposes an optimized method of adaptive parameter determination based on spatial analysis. First, Ripley's K function is analyzed for the data set to adaptively determine the global parameter Eps, by setting the maximum aggregation scale as the range of data clustering. Then, using a K-D tree, the highest-frequency K value of every point object within the range of Eps is calculated and set as the clustering density, to adaptively determine the global parameter MinPts. The R language was used to optimize the above process and accomplish precise clustering of typical urban management cases. Experimental results based on typical urban management cases in the XiCheng district of Beijing show that the new DBSCAN clustering algorithm presented in this paper takes full account of the spatial and statistical characteristics of data with obvious clustering features, and has better applicability and high quality. The results of the study are helpful not only for the formulation of urban management policies and the allocation of urban management supervisors in XiCheng District of Beijing, but also for other cities and related fields.

  1. Crouch gait patterns defined using k-means cluster analysis are related to underlying clinical pathology. (United States)

    Rozumalski, Adam; Schwartz, Michael H


    In this study a gait classification method was developed and applied to subjects with cerebral palsy who walk with excessive knee flexion at initial contact. Sagittal plane gait data, simplified using the gait features method, were used as input to a k-means cluster analysis to determine homogeneous groups. Several clinical domains were explored to determine whether the clusters are related to underlying pathology. These domains included age, joint range-of-motion, strength, selective motor control, and spasticity. Principal component analysis was used to determine one overall score for each of the multi-joint domains (strength, selective motor control, and spasticity). The study shows that there are five clusters among children with excessive knee flexion at initial contact. These clusters were labeled, in order of increasing gait pathology: (1) mild crouch with mild equinus, (2) moderate crouch, (3) moderate crouch with anterior pelvic tilt, (4) moderate crouch with equinus, and (5) severe crouch. Further analysis showed that age, range-of-motion, strength, selective motor control, and spasticity were significantly different between the clusters (p<0.001). The general tendency was for the clinical domains to worsen as gait pathology increased. This new classification tool can be used to define homogeneous groups of subjects in crouch gait, which can help guide treatment decisions and outcomes assessment.
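
    The k-means step named in this abstract can be sketched in a few lines of pure Python. This is a generic version of Lloyd's algorithm under stated assumptions, not the study's pipeline: the gait features and PCA scoring are not reproduced, the starting centroids are passed in explicitly so the run is deterministic, and the function name is hypothetical.

```python
from math import dist

def kmeans(points, centroids, max_iter=100):
    """Lloyd's algorithm: returns (labels, centroids) after convergence or max_iter."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid in place
        if new_centroids == centroids:
            break                          # assignments can no longer change
        centroids = new_centroids
    return labels, centroids
```

    In practice the number of clusters (five in this study) has to be chosen separately, and results can depend on the starting centroids, so several initialisations are usually compared.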

  2. Clustering analysis of water distribution systems: identifying critical components and community impacts. (United States)

    Diao, K; Farmani, R; Fu, G; Astaraie-Imani, M; Ward, S; Butler, D


    Large water distribution systems (WDSs) are networks with both topological and behavioural complexity. Thereby, it is usually difficult to identify the key features of the properties of the system, and subsequently all the critical components within the system for a given purpose of design or control. One way is, however, to more explicitly visualize the network structure and interactions between components by dividing a WDS into a number of clusters (subsystems). Accordingly, this paper introduces a clustering strategy that decomposes WDSs into clusters with stronger internal connections than external connections. The detected cluster layout is very similar to the community structure of the served urban area. As WDSs may expand along with urban development in a community-by-community manner, the correspondingly formed distribution clusters may reveal some crucial configurations of WDSs. For verification, the method is applied to identify all the critical links during firefighting for the vulnerability analysis of a real-world WDS. Moreover, both the most critical pipes and clusters are addressed, given the consequences of pipe failure. Compared with the enumeration method, the method used in this study identifies the same group of the most critical components, and provides similar criticality prioritizations of them in a more computationally efficient time.

  3. Analysis of risk factors for cluster behavior of dental implant failures. (United States)

    Chrcanovic, Bruno Ramos; Kisch, Jenö; Albrektsson, Tomas; Wennerberg, Ann


    Some studies indicated that implant failures are commonly concentrated in few patients. To identify and analyze cluster behavior of dental implant failures among subjects of a retrospective study. This retrospective study included patients receiving at least three implants only. Patients presenting at least three implant failures were classified as presenting a cluster behavior. Univariate and multivariate logistic regression models and generalized estimating equations analysis evaluated the effect of explanatory variables on the cluster behavior. There were 1406 patients with three or more implants (8337 implants, 592 failures). Sixty-seven (4.77%) patients presented cluster behavior, with 56.8% of all implant failures. The intake of antidepressants and bruxism were identified as potential negative factors exerting a statistically significant influence on a cluster behavior at the patient-level. The negative factors at the implant-level were turned implants, short implants, poor bone quality, age of the patient, the intake of medicaments to reduce the acid gastric production, smoking, and bruxism. A cluster pattern among patients with implant failure is highly probable. Factors of interest as predictors for implant failures could be a number of systemic and local factors, although a direct causal relationship cannot be ascertained. © 2017 Wiley Periodicals, Inc.

  4. A fully Bayesian latent variable model for integrative clustering analysis of multi-type omics data. (United States)

    Mo, Qianxing; Shen, Ronglai; Guo, Cui; Vannucci, Marina; Chan, Keith S; Hilsenbeck, Susan G


    Identification of clinically relevant tumor subtypes and omics signatures is an important task in cancer translational research for precision medicine. Large-scale genomic profiling studies such as The Cancer Genome Atlas (TCGA) Research Network have generated vast amounts of genomic, transcriptomic, epigenomic, and proteomic data. While these studies have provided great resources for researchers to discover clinically relevant tumor subtypes and driver molecular alterations, there are few computationally efficient methods and tools for integrative clustering analysis of these multi-type omics data. Therefore, the aim of this article is to develop a fully Bayesian latent variable method (called iClusterBayes) that can jointly model omics data of continuous and discrete data types for identification of tumor subtypes and relevant omics features. Specifically, the proposed method uses a few latent variables to capture the inherent structure of multiple omics data sets to achieve joint dimension reduction. As a result, the tumor samples can be clustered in the latent variable space and relevant omics features that drive the sample clustering are identified through Bayesian variable selection. This method significantly improves on the existing integrative clustering method iClusterPlus in terms of statistical inference and computational speed. By analyzing TCGA and simulated data sets, we demonstrate the excellent performance of the proposed method in revealing clinically meaningful tumor subtypes and driver omics features. © The Author 2017. Published by Oxford University Press. All rights reserved.

  5. Epidemiological analysis of Salmonella clusters identified by whole genome sequencing, England and Wales 2014. (United States)

    Waldram, Alison; Dolan, Gayle; Ashton, Philip M; Jenkins, Claire; Dallman, Timothy J


    The unprecedented level of bacterial strain discrimination provided by whole genome sequencing (WGS) presents new challenges with respect to the utility and interpretation of the data. Whole genome sequences from 1445 isolates of Salmonella belonging to the most commonly identified serotypes in England and Wales isolated between April and August 2014 were analysed. Single linkage single nucleotide polymorphism thresholds at the 10, 5 and 0 level were explored for evidence of epidemiological links between clustered cases. Analysis of the WGS data organised 566 of the 1445 isolates into 32 clusters of five or more. A statistically significant epidemiological link was identified for 17 clusters. The clusters were associated with foreign travel (n = 8), consumption of Chinese takeaways (n = 4), chicken eaten at home (n = 2), and one each of the following; eating out, contact with another case in the home and contact with reptiles. In the same time frame, one cluster was detected using traditional outbreak detection methods. WGS can be used for the highly specific and highly sensitive detection of biologically related isolates when epidemiological links are obscured. Improvements in the collection of detailed, standardised exposure information would enhance cluster investigations. Copyright © 2017 Elsevier Ltd. All rights reserved.
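
    Single-linkage clustering at a SNP threshold, as described in this abstract, amounts to linking any two isolates whose pairwise distance is at or below the threshold and taking the connected groups. The sketch below shows that idea with a union-find structure; the distance dictionary, isolate indices, and function name are hypothetical and this is not the pipeline used in the study.

```python
def single_linkage_clusters(distances, n, threshold):
    """Group isolates 0..n-1; two isolates share a cluster if a chain of
    pairwise SNP distances <= threshold connects them."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for (a, b), d in distances.items():
        if d <= threshold:
            parent[find(a)] = find(b)      # merge the two groups

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

    Because linkage is transitive, two isolates that differ by more than the threshold can still fall in the same cluster when an intermediate isolate bridges them, which is why the choice among the 10, 5 and 0 SNP levels matters.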

  6. The Assessment of Hydrogen Energy Systems for Fuel Cell Vehicles Using Principal Component Analysis and Cluster Analysis

    DEFF Research Database (Denmark)

    Ren, Jingzheng; Tan, Shiyu; Dong, Lichun


    and analysis of the hydrogen systems is meaningful for decision makers to select the best scenario. principal component analysis (PCA) has been used to evaluate the integrated performance of different hydrogen energy systems and select the best scenario, and hierarchical cluster analysis (CA) has been used...... for transportation of hydrogen, hydrogen gas tank for the storage of hydrogen at refueling stations, and gaseous hydrogen as power energy for fuel cell vehicles has been recognized as the best scenario. Also, the clustering results calculated by CA are consistent with those determined by PCA, denoting...

  7. Network-based functional enrichment

    Directory of Open Access Journals (Sweden)

    Poirel Christopher L


    Background: Many methods have been developed to infer and reason about molecular interaction networks. These approaches often yield networks with hundreds or thousands of nodes and up to an order of magnitude more edges. It is often desirable to summarize the biological information in such networks. A very common approach is to use gene function enrichment analysis for this task. A major drawback of this method is that it ignores information about the edges in the network being analyzed, i.e., it treats the network simply as a set of genes. In this paper, we introduce a novel method for functional enrichment that explicitly takes network interactions into account. Results: Our approach naturally generalizes Fisher's exact test, a gene set-based technique. Given a function of interest, we compute the subgraph of the network induced by genes annotated to this function. We use the sequence of sizes of the connected components of this sub-network to estimate its connectivity. We estimate the statistical significance of the connectivity empirically by a permutation test. We present three applications of our method: (i) determine which functions are enriched in a given network, (ii) given a network and an interesting sub-network of genes within that network, determine which functions are enriched in the sub-network, and (iii) given two networks, determine the functions for which the connectivity improves when we merge the second network into the first. Through these applications, we show that our approach is a natural alternative to network clustering algorithms. Conclusions: We presented a novel approach to functional enrichment that takes into account the pairwise relationships among genes annotated by a particular function. Each of the three applications discovers highly relevant functions. We used our methods to study biological data from three different organisms. Our results demonstrate the wide applicability of our methods. Our algorithms are
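
    The steps in this abstract (induced subgraph, connected-component sizes, permutation test) can be sketched as follows. Two parts are assumptions, since the abstract does not specify them: the connectivity statistic (here, the fraction of node pairs sharing a component) and the null model (random gene sets of the same size). All function names are hypothetical.

```python
import random

def component_sizes(edges, nodes):
    """Sizes of the connected components of the subgraph induced by `nodes`."""
    nodes = set(nodes)
    adj = {v: set() for v in nodes}
    for a, b in edges:
        if a in nodes and b in nodes:
            adj[a].add(b)
            adj[b].add(a)
    seen, sizes = set(), []
    for start in nodes:
        if start in seen:
            continue
        stack, size = [start], 0
        seen.add(start)
        while stack:                      # depth-first walk of one component
            v = stack.pop()
            size += 1
            for w in adj[v] - seen:
                seen.add(w)
                stack.append(w)
        sizes.append(size)
    return sorted(sizes, reverse=True)

def connectivity(sizes):
    """Assumed statistic: fraction of node pairs that share a component."""
    n = sum(sizes)
    if n < 2:
        return 0.0
    return sum(s * (s - 1) for s in sizes) / (n * (n - 1))

def permutation_p_value(edges, all_genes, annotated, trials=1000, seed=0):
    """Empirical p-value: how often a random gene set of the same size is
    at least as connected as the annotated set (assumed null model)."""
    rng = random.Random(seed)
    observed = connectivity(component_sizes(edges, annotated))
    genes = sorted(all_genes)
    hits = 0
    for _ in range(trials):
        sample = rng.sample(genes, len(annotated))
        if connectivity(component_sizes(edges, sample)) >= observed:
            hits += 1
    return (hits + 1) / (trials + 1)      # add-one correction avoids p = 0
```

    A small p-value would suggest that genes annotated to the function are more tightly connected in the network than a random gene set of the same size.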

  8. Cluster Analysis on Longitudinal Data of Patients with Adult-Onset Asthma. (United States)

    Ilmarinen, Pinja; Tuomisto, Leena E; Niemelä, Onni; Tommola, Minna; Haanpää, Jussi; Kankaanranta, Hannu

    Previous cluster analyses on asthma are based on cross-sectional data. To identify phenotypes of adult-onset asthma by using data from baseline (diagnostic) and 12-year follow-up visits. The Seinäjoki Adult Asthma Study is a 12-year follow-up study of patients with new-onset adult asthma. K-means cluster analysis was performed by using variables from baseline and follow-up visits on 171 patients to identify phenotypes. Five clusters were identified. Patients in cluster 1 (n = 38) were predominantly nonatopic males with moderate smoking history at baseline. At follow-up, 40% of these patients had developed persistent obstruction but the number of patients with uncontrolled asthma (5%) and rhinitis (10%) was the lowest. Cluster 2 (n = 19) was characterized by older men with heavy smoking history, poor lung function, and persistent obstruction at baseline. At follow-up, these patients were mostly uncontrolled (84%) despite daily use of inhaled corticosteroid (ICS) with add-on therapy. Cluster 3 (n = 50) consisted mostly of nonsmoking females with good lung function at diagnosis/follow-up and well-controlled/partially controlled asthma at follow-up. Cluster 4 (n = 25) had obese and symptomatic patients at baseline/follow-up. At follow-up, these patients had several comorbidities (40% psychiatric disease) and were treated daily with ICS and add-on therapy. Patients in cluster 5 (n = 39) were mostly atopic and had the earliest onset of asthma, the highest blood eosinophils, and FEV1 reversibility at diagnosis. At follow-up, these patients used the lowest ICS dose but 56% were well controlled. Results can be used to predict outcomes of patients with adult-onset asthma and to aid in development of personalized therapy (NCT02733016). Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.

  9. Equilibrium sorptive enrichment on poly(dimethylsiloxane) particles for trace analysis of volatile compounds in gaseous samples

    NARCIS (Netherlands)

    Baltussen, H.A.; David, F.; Sandra, P.J.F.; Janssen, J.G.M.; Cramers, C.A.M.G.


    A novel approach for sample enrichment, namely, equilibrium sorptive enrichment (ESE), is presented. A packed bed of sorption (or partitioning) material is used to enrich volatiles from gaseous samples. Normally, air sampling is stopped before breakthrough occurs, but this approach is not very

  10. Field of Study Choice: Using Conjoint Analysis and Clustering (United States)

    Shtudiner, Ze'ev; Zwilling, Moti; Kantor, Jeffrey


    Purpose: The purpose of this paper is to measure students' preferences regarding various attributes that affect their decision process while choosing a higher education area of study. Design/Methodology/Approach: The paper exhibits two different models which shed light on the perceived value of each examined area of study: conjoint analysis and…

  11. Analysis of brood sex ratios: implications of offspring clustering

    Czech Academy of Sciences Publication Activity Database

    Krackow, S.; Tkadlec, Emil

    Vol. 50, No. 4 (2001), pp. 293-301 ISSN 0340-5443 R&D Projects: GA ČR GA524/01/1316 Institutional research plan: CEZ:AV0Z6093917 Keywords: generalized linear mixed models * random coefficients * multilevel analysis Subject RIV: EG - Zoology Impact factor: 2.353, year: 2001

  12. Vector Nonlinear Time-Series Analysis of Gamma-Ray Burst Datasets on Heterogeneous Clusters

    Directory of Open Access Journals (Sweden)

    Ioana Banicescu


    The simultaneous analysis of a number of related datasets using a single statistical model is an important problem in statistical computing. A parameterized statistical model is to be fitted on multiple datasets and tested for goodness of fit within a fixed analytical framework. Definitive conclusions are hopefully achieved by analyzing the datasets together. This paper proposes a strategy for the efficient execution of this type of analysis on heterogeneous clusters. Based on partitioning processors into groups for efficient communications and a dynamic loop scheduling approach for load balancing, the strategy addresses the variability of the computational loads of the datasets, as well as the unpredictable irregularities of the cluster environment. Results from preliminary tests of using this strategy to fit gamma-ray burst time profiles with vector functional coefficient autoregressive models on 64 processors of a general purpose Linux cluster demonstrate the effectiveness of the strategy.

  13. Mental State Talk Structure in Children’s Narratives: A Cluster Analysis

    Directory of Open Access Journals (Sweden)

    Giuliana Pinto


    This study analysed children's Theory of Mind (ToM) as assessed by mental state talk in oral narratives. We hypothesized that the children's mental state talk in narratives has an underlying structure, with specific terms organized in clusters. Ninety-eight children attending the last year of kindergarten were asked to tell a story twice, at the beginning and at the end of the school year. Mental state talk was analysed by identifying terms and expressions referring to perceptual, physiological, emotional, willingness, cognitive, moral, and sociorelational states. The cluster analysis showed that children's mental state talk is organized in two main clusters: perceptual states and affective states. Results from the study confirm the feasibility of narratives as a means of investigating mental state talk and offer a more fine-grained analysis of mental state talk structure.

  14. Selenium-Enriched Foods Are More Effective at Increasing Glutathione Peroxidase (GPx) Activity Compared with Selenomethionine: A Meta-Analysis (United States)

    Bermingham, Emma N.; Hesketh, John E.; Sinclair, Bruce R.; Koolaard, John P.; Roy, Nicole C.


    Selenium may play a beneficial role in multi-factorial illnesses with genetic and environmental linkages via epigenetic regulation in part via glutathione peroxidase (GPx) activity. A meta-analysis was undertaken to quantify the effects of dietary selenium supplementation on the activity of overall GPx activity in different tissues and animal species and to compare the effectiveness of different forms of dietary selenium. GPx activity response was affected by both the dose and form of selenium (p selenium supplementation on GPx activity (p selenium supply include red blood cells, kidney and muscle. The meta-analysis identified that for animal species selenium-enriched foods were more effective than selenomethionine at increasing GPx activity. PMID:25268836

  15. Evaluation of Portland cement from X-ray diffraction associated with cluster analysis

    International Nuclear Information System (INIS)

    Gobbo, Luciano de Andrade; Montanheiro, Tarcisio Jose; Montanheiro, Filipe; Sant'Agostino, Lilia Mascarenhas


    The Brazilian cement industry produced 64 million tons of cement in 2012, with noteworthy contribution of CP-II (slag), CP-III (blast furnace) and CP-IV (pozzolanic) cements. The industrial pole comprises about 80 factories that utilize raw materials of different origins and chemical compositions that require enhanced analytical technologies to optimize production in order to gain space in the growing consumer market in Brazil. This paper assesses the sensitivity of mineralogical analysis by X-ray diffraction associated with cluster analysis to distinguish different kinds of cements with different additions. This technique can be applied, for example, in the prospection of different types of limestone (calcitic, dolomitic and siliceous) as well as in the qualification of different clinkers. The cluster analysis does not require any specific knowledge of the mineralogical composition of the diffractograms to be clustered; rather, it is based on their similarity. The materials tested for addition have different origins: fly ashes from different power stations from South Brazil and slag from different steel plants in the Southeast. Cement with different additions of limestone and white Portland cement were also used. The Rietveld method of qualitative and quantitative analysis was used for measuring the results generated by the cluster analysis technique. (author)

  16. Identifying influential individuals on intensive care units: using cluster analysis to explore culture. (United States)

    Fong, Allan; Clark, Lindsey; Cheng, Tianyi; Franklin, Ella; Fernandez, Nicole; Ratwani, Raj; Parker, Sarah Henrickson


    The objective of this paper is to identify attribute patterns of influential individuals in intensive care units using unsupervised cluster analysis. Despite the acknowledgement that culture of an organisation is critical to improving patient safety, specific methods to shift culture have not been explicitly identified. A social network analysis survey was conducted and an unsupervised cluster analysis was used. A total of 100 surveys were gathered. Unsupervised cluster analysis was used to group individuals with similar dimensions highlighting three general genres of influencers: well-rounded, knowledge and relational. Culture is created locally by individual influencers. Cluster analysis is an effective way to identify common characteristics among members of an intensive care unit team that are noted as highly influential by their peers. To change culture, identifying and then integrating the influencers in intervention development and dissemination may create more sustainable and effective culture change. Additional studies are ongoing to test the effectiveness of utilising these influencers to disseminate patient safety interventions. This study offers an approach that can be helpful in both identifying and understanding influential team members and may be an important aspect of developing methods to change organisational culture. © 2017 John Wiley & Sons Ltd.

  17. Quantitative Secretome Analysis of Activated Jurkat Cells Using Click Chemistry-Based Enrichment of Secreted Glycoproteins. (United States)

    Witzke, Kathrin E; Rosowski, Kristin; Müller, Christian; Ahrens, Maike; Eisenacher, Martin; Megger, Dominik A; Knobloch, Jürgen; Koch, Andrea; Bracht, Thilo; Sitek, Barbara


    Quantitative secretome analyses are a high-performance tool for the discovery of physiological and pathophysiological changes in cellular processes. However, serum supplements in cell culture media limit secretome analyses, but serum depletion often leads to cell starvation and consequently biased results. To overcome these limiting factors, we investigated a model of T cell activation (Jurkat cells) and performed an approach for the selective enrichment of secreted proteins from conditioned medium utilizing metabolic marking of newly synthesized glycoproteins. Marked glycoproteins were labeled via bioorthogonal click chemistry and isolated by affinity purification. We assessed two labeling compounds conjugated with either biotin or desthiobiotin and the respective secretome fractions. 356 proteins were quantified using the biotin probe and 463 using desthiobiotin. 59 proteins were found differentially abundant (adjusted p-value ≤0.05, absolute fold change ≥1.5) between inactive and activated T cells using the biotin method and 86 using the desthiobiotin approach, with 31 mutual proteins cross-verified by independent experiments. Moreover, we analyzed the cellular proteome of the same model to demonstrate the benefit of secretome analyses and provide comprehensive data sets of both. 336 proteins (61.3%) were quantified exclusively in the secretome. Data are available via ProteomeXchange with identifier PXD004280.

  18. Extending the input–output energy balance methodology in agriculture through cluster analysis

    International Nuclear Information System (INIS)

    Bojacá, Carlos Ricardo; Casilimas, Héctor Albeiro; Gil, Rodrigo; Schrevens, Eddie


    The input–output balance methodology has been applied to characterize the energy balance of agricultural systems. This study proposes extending this methodology with multivariate analysis to reveal particular patterns in the energy use of a system. The objective was to demonstrate the usefulness of multivariate exploratory techniques for analyzing the variability found in a farming system and for establishing efficiency categories that can be used to improve the energy balance of the system. To this end, an input–output analysis was applied to the major greenhouse tomato production area in Colombia. Individual energy profiles were built and the k-means clustering method was applied to the production factors. On average, the production system in the study zone consumes 141.8 GJ ha⁻¹ to produce 96.4 GJ ha⁻¹, resulting in an energy efficiency of 0.68. With the k-means clustering analysis, three clusters of farmers were identified with energy efficiencies of 0.54, 0.67 and 0.78. The most energy-efficient cluster grouped 56.3% of the farmers. It is possible to optimize the production system by improving the management practices of those with the lowest energy use efficiencies. Multivariate analysis techniques proved to be a complementary pathway to improving the energy efficiency of a system. Highlights: ► An input–output energy balance was estimated for greenhouse tomatoes in Colombia. ► We used the k-means clustering method to classify growers based on their energy use. ► Three clusters of growers were found with energy efficiencies of 0.54, 0.67 and 0.78. ► Overall system optimization is possible by improving the energy use of the less efficient growers.
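
    The reported average efficiency follows directly from the two energy figures in the abstract, taking efficiency as the ratio of energy output to energy input. The short check below uses only those two numbers; the function name is hypothetical.

```python
def energy_efficiency(output_gj_per_ha, input_gj_per_ha):
    """Energy efficiency as the output-to-input ratio (dimensionless)."""
    return output_gj_per_ha / input_gj_per_ha

# Figures from the abstract: 96.4 GJ/ha produced for 141.8 GJ/ha consumed.
average = energy_efficiency(96.4, 141.8)
print(round(average, 2))  # 0.68
```

    A value below 1 means the system consumes more energy than it delivers in product, which is the case for all three grower clusters reported (0.54, 0.67 and 0.78).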

  19. Accommodating error analysis in comparison and clustering of molecular fingerprints.


    Salamon, H.; Segal, M. R.; Ponce de Leon, A.; Small, P. M.


    Molecular epidemiologic studies of infectious diseases rely on pathogen genotype comparisons, which usually yield patterns comprising sets of DNA fragments (DNA fingerprints). We use a highly developed genotyping system, IS6110-based restriction fragment length polymorphism analysis of Mycobacterium tuberculosis, to develop a computational method that automates comparison of l