WorldWideScience

Sample records for platforms cluster analysis

  1. Application of microarray analysis on computer cluster and cloud platforms.

    Science.gov (United States)

    Bernau, C; Boulesteix, A-L; Knaus, J

    2013-01-01

    Analysis of recent high-dimensional biological data tends to be computationally intensive as many common approaches such as resampling or permutation tests require the basic statistical analysis to be repeated many times. A crucial advantage of these methods is that they can be easily parallelized due to the computational independence of the resampling or permutation iterations, which has induced many statistics departments to establish their own computer clusters. An alternative is to rent computing resources in the cloud, e.g. at Amazon Web Services. In this article we analyze whether a selection of statistical projects, recently implemented at our department, can be efficiently realized on these cloud resources. Moreover, we illustrate an opportunity to combine computer cluster and cloud resources. In order to compare the efficiency of computer cluster and cloud implementations and their respective parallelizations we use microarray analysis procedures and compare their runtimes on the different platforms. Amazon Web Services provide various instance types which meet the particular needs of the different statistical projects we analyzed in this paper. Moreover, the network capacity is sufficient and the parallelization is comparable in efficiency to standard computer cluster implementations. Our results suggest that many statistical projects can be efficiently realized on cloud resources. It is important to mention, however, that workflows can change substantially as a result of a shift from computer cluster to cloud computing.

  2. Genome cluster database. A sequence family analysis platform for Arabidopsis and rice.

    Science.gov (United States)

    Horan, Kevin; Lauricha, Josh; Bailey-Serres, Julia; Raikhel, Natasha; Girke, Thomas

    2005-05-01

    The genome-wide protein sequences from Arabidopsis (Arabidopsis thaliana) and rice (Oryza sativa) spp. japonica were clustered into families using sequence similarity and domain-based clustering. The two fundamentally different methods resulted in separate cluster sets with complementary properties to compensate the limitations for accurate family analysis. Functional names for the identified families were assigned with an efficient computational approach that uses the description of the most common molecular function gene ontology node within each cluster. Subsequently, multiple alignments and phylogenetic trees were calculated for the assembled families. All clustering results and their underlying sequences were organized in the Web-accessible Genome Cluster Database (http://bioinfo.ucr.edu/projects/GCD) with rich interactive and user-friendly sequence family mining tools to facilitate the analysis of any given family of interest for the plant science community. An automated clustering pipeline ensures current information for future updates in the annotations of the two genomes and clustering improvements. The analysis allowed the first systematic identification of family and singlet proteins present in both organisms as well as those restricted to one of them. In addition, the established Web resources for mining these data provide a road map for future studies of the composition and structure of protein families between the two species.

  3. WebGimm: An integrated web-based platform for cluster analysis, functional analysis, and interactive visualization of results.

    Science.gov (United States)

    Joshi, Vineet K; Freudenberg, Johannes M; Hu, Zhen; Medvedovic, Mario

    2011-01-17

    Cluster analysis methods have been extensively researched, but the adoption of new methods is often hindered by technical barriers in their implementation and use. WebGimm is a free cluster analysis web-service, and an open source general purpose clustering web-server infrastructure designed to facilitate easy deployment of integrated cluster analysis servers based on clustering and functional annotation algorithms implemented in R. Integrated functional analyses and interactive browsing of both, clustering structure and functional annotations provides a complete analytical environment for cluster analysis and interpretation of results. The Java Web Start client-based interface is modeled after the familiar cluster/treeview packages making its use intuitive to a wide array of biomedical researchers. For biomedical researchers, WebGimm provides an avenue to access state of the art clustering procedures. For Bioinformatics methods developers, WebGimm offers a convenient avenue to deploy their newly developed clustering methods. WebGimm server, software and manuals can be freely accessed at http://ClusterAnalysis.org/.

  4. An integrated biomedical knowledge extraction and analysis platform: using federated search and document clustering technology.

    Science.gov (United States)

    Taylor, Donald P

    2007-01-01

    High content screening (HCS) requires time-consuming and often complex iterative information retrieval and assessment approaches to optimally conduct drug discovery programs and biomedical research. Pre- and post-HCS experimentation both require the retrieval of information from public as well as proprietary literature in addition to structured information assets such as compound libraries and projects databases. Unfortunately, this information is typically scattered across a plethora of proprietary bioinformatics tools and databases and public domain sources. Consequently, single search requests must be presented to each information repository, forcing the results to be manually integrated for a meaningful result set. Furthermore, these bioinformatics tools and data repositories are becoming increasingly complex to use; typically they fail to allow for more natural query interfaces. Vivisimo has developed an enterprise software platform to bridge disparate silos of information. The platform automatically categorizes search results into descriptive folders without the use of taxonomies to drive the categorization. A new approach to information retrieval for HCS experimentation is proposed.

  5. Clustering analysis

    International Nuclear Information System (INIS)

    Romli

    1997-01-01

    Cluster analysis is the name of group of multivariate techniques whose principal purpose is to distinguish similar entities from the characteristics they process.To study this analysis, there are several algorithms that can be used. Therefore, this topic focuses to discuss the algorithms, such as, similarity measures, and hierarchical clustering which includes single linkage, complete linkage and average linkage method. also, non-hierarchical clustering method, which is popular name K -mean method ' will be discussed. Finally, this paper will be described the advantages and disadvantages of every methods

  6. Cluster analysis

    CERN Document Server

    Everitt, Brian S; Leese, Morven; Stahl, Daniel

    2011-01-01

    Cluster analysis comprises a range of methods for classifying multivariate data into subgroups. By organizing multivariate data into such subgroups, clustering can help reveal the characteristics of any structure or patterns present. These techniques have proven useful in a wide range of areas such as medicine, psychology, market research and bioinformatics.This fifth edition of the highly successful Cluster Analysis includes coverage of the latest developments in the field and a new chapter dealing with finite mixture models for structured data.Real life examples are used throughout to demons

  7. Cluster analysis

    OpenAIRE

    Mucha, Hans-Joachim; Sofyan, Hizir

    2000-01-01

    As an explorative technique, duster analysis provides a description or a reduction in the dimension of the data. It classifies a set of observations into two or more mutually exclusive unknown groups based on combinations of many variables. Its aim is to construct groups in such a way that the profiles of objects in the same groups are relatively homogenous whereas the profiles of objects in different groups are relatively heterogeneous. Clustering is distinct from classification techniques, ...

  8. A novel model for Time-Series Data Clustering Based on piecewise SVD and BIRCH for Stock Data Analysis on Hadoop Platform

    Directory of Open Access Journals (Sweden)

    Ibgtc Bowala

    2017-06-01

    Full Text Available With the rapid growth of financial markets, analyzers are paying more attention on predictions. Stock data are time series data, with huge amounts. Feasible solution for handling the increasing amount of data is to use a cluster for parallel processing, and Hadoop parallel computing platform is a typical representative. There are various statistical models for forecasting time series data, but accurate clusters are a pre-requirement. Clustering analysis for time series data is one of the main methods for mining time series data for many other analysis processes. However, general clustering algorithms cannot perform clustering for time series data because series data has a special structure and a high dimensionality has highly co-related values due to high noise level. A novel model for time series clustering is presented using BIRCH, based on piecewise SVD, leading to a novel dimension reduction approach. Highly co-related features are handled using SVD with a novel approach for dimensionality reduction in order to keep co-related behavior optimal and then use BIRCH for clustering. The algorithm is a novel model that can handle massive time series data. Finally, this new model is successfully applied to real stock time series data of Yahoo finance with satisfactory results.

  9. Dielectric spectroscopy platform to measure MCF10A epithelial cell aggregation as a model for spheroidal cell cluster analysis.

    Science.gov (United States)

    Heileman, K L; Tabrizian, M

    2017-05-02

    3-Dimensional cell cultures are more representative of the native environment than traditional cell cultures on flat substrates. As a result, 3-dimensional cell cultures have emerged as a very valuable model environment to study tumorigenesis, organogenesis and tissue regeneration. Many of these models encompass the formation of cell aggregates, which mimic the architecture of tumor and organ tissue. Dielectric impedance spectroscopy is a non-invasive, label free and real time technique, overcoming the drawbacks of established techniques to monitor cell aggregates. Here we introduce a platform to monitor cell aggregation in a 3-dimensional extracellular matrix using dielectric spectroscopy. The MCF10A breast epithelial cell line serves as a model for cell aggregation. The platform maintains sterile conditions during the multi-day assay while allowing continuous dielectric spectroscopy measurements. The platform geometry optimizes dielectric measurements by concentrating cells within the electrode sensing region. The cells show a characteristic dielectric response to aggregation which corroborates with finite element analysis computer simulations. By fitting the experimental dielectric spectra to the Cole-Cole equation, we demonstrated that the dispersion intensity Δε and the characteristic frequency f c are related to cell aggregate growth. In addition, microscopy can be performed directly on the platform providing information about cell position, density and morphology. This platform could yield many applications for studying the electrophysiological activity of cell aggregates.

  10. Shot loading platform analysis

    International Nuclear Information System (INIS)

    Norman, B.F.

    1994-01-01

    This document provides the wind/seismic analysis and evaluation for the shot loading platform. Hand calculations were used for the analysis. AISC and UBC load factors were used in this evaluation. The results show that the actual loads are under the allowable loads and all requirements are met

  11. Cluster analysis for applications

    CERN Document Server

    Anderberg, Michael R

    1973-01-01

    Cluster Analysis for Applications deals with methods and various applications of cluster analysis. Topics covered range from variables and scales to measures of association among variables and among data units. Conceptual problems in cluster analysis are discussed, along with hierarchical and non-hierarchical clustering methods. The necessary elements of data analysis, statistics, cluster analysis, and computer implementation are integrated vertically to cover the complete path from raw data to a finished analysis.Comprised of 10 chapters, this book begins with an introduction to the subject o

  12. Marketing research cluster analysis

    Directory of Open Access Journals (Sweden)

    Marić Nebojša

    2002-01-01

    Full Text Available One area of applications of cluster analysis in marketing is identification of groups of cities and towns with similar demographic profiles. This paper considers main aspects of cluster analysis by an example of clustering 12 cities with the use of Minitab software.

  13. Marketing research cluster analysis

    OpenAIRE

    Marić Nebojša

    2002-01-01

    One area of applications of cluster analysis in marketing is identification of groups of cities and towns with similar demographic profiles. This paper considers main aspects of cluster analysis by an example of clustering 12 cities with the use of Minitab software.

  14. a Web-Based Interactive Platform for Co-Clustering Spatio-Temporal Data

    Science.gov (United States)

    Wu, X.; Poorthuis, A.; Zurita-Milla, R.; Kraak, M.-J.

    2017-09-01

    Since current studies on clustering analysis mainly focus on exploring spatial or temporal patterns separately, a co-clustering algorithm is utilized in this study to enable the concurrent analysis of spatio-temporal patterns. To allow users to adopt and adapt the algorithm for their own analysis, it is integrated within the server side of an interactive web-based platform. The client side of the platform, running within any modern browser, is a graphical user interface (GUI) with multiple linked visualizations that facilitates the understanding, exploration and interpretation of the raw dataset and co-clustering results. Users can also upload their own datasets and adjust clustering parameters within the platform. To illustrate the use of this platform, an annual temperature dataset from 28 weather stations over 20 years in the Netherlands is used. After the dataset is loaded, it is visualized in a set of linked visualizations: a geographical map, a timeline and a heatmap. This aids the user in understanding the nature of their dataset and the appropriate selection of co-clustering parameters. Once the dataset is processed by the co-clustering algorithm, the results are visualized in the small multiples, a heatmap and a timeline to provide various views for better understanding and also further interpretation. Since the visualization and analysis are integrated in a seamless platform, the user can explore different sets of co-clustering parameters and instantly view the results in order to do iterative, exploratory data analysis. As such, this interactive web-based platform allows users to analyze spatio-temporal data using the co-clustering method and also helps the understanding of the results using multiple linked visualizations.

  15. 2009 Analysis Platform Review Report

    Energy Technology Data Exchange (ETDEWEB)

    Ferrell, John [Office of Energy Efficiency and Renewable Energy (EERE), Washington, DC (United States

    2009-12-01

    This document summarizes the recommendations and evaluations provided by an independent external panel of experts at the U.S. Department of Energy Biomass Program’s Analysis platform review meeting, held on February 18, 2009, at the Marriott Residence Inn, National Harbor, Maryland.

  16. Comprehensive cluster analysis with Transitivity Clustering.

    Science.gov (United States)

    Wittkop, Tobias; Emig, Dorothea; Truss, Anke; Albrecht, Mario; Böcker, Sebastian; Baumbach, Jan

    2011-03-01

    Transitivity Clustering is a method for the partitioning of biological data into groups of similar objects, such as genes, for instance. It provides integrated access to various functions addressing each step of a typical cluster analysis. To facilitate this, Transitivity Clustering is accessible online and offers three user-friendly interfaces: a powerful stand-alone version, a web interface, and a collection of Cytoscape plug-ins. In this paper, we describe three major workflows: (i) protein (super)family detection with Cytoscape, (ii) protein homology detection with incomplete gold standards and (iii) clustering of gene expression data. This protocol guides the user through the most important features of Transitivity Clustering and takes ∼1 h to complete.

  17. [Cluster analysis in biomedical researches].

    Science.gov (United States)

    Akopov, A S; Moskovtsev, A A; Dolenko, S A; Savina, G D

    2013-01-01

    Cluster analysis is one of the most popular methods for the analysis of multi-parameter data. The cluster analysis reveals the internal structure of the data, group the separate observations on the degree of their similarity. The review provides a definition of the basic concepts of cluster analysis, and discusses the most popular clustering algorithms: k-means, hierarchical algorithms, Kohonen networks algorithms. Examples are the use of these algorithms in biomedical research.

  18. Extracting Synthetic Multi-Cluster Platform Configurations from Grid'5000 for Driving Simulation Experiments

    OpenAIRE

    Suter , Frédéric; Casanova , Henri

    2007-01-01

    This report presents a collection of synthetic but realistic distributed computing platform configurations. These configurations are intended for simulation experiments in the study of parallel applications on multi-cluster platforms.

  19. Integrative cluster analysis in bioinformatics

    CERN Document Server

    Abu-Jamous, Basel; Nandi, Asoke K

    2015-01-01

    Clustering techniques are increasingly being put to use in the analysis of high-throughput biological datasets. Novel computational techniques to analyse high throughput data in the form of sequences, gene and protein expressions, pathways, and images are becoming vital for understanding diseases and future drug discovery. This book details the complete pathway of cluster analysis, from the basics of molecular biology to the generation of biological knowledge. The book also presents the latest clustering methods and clustering validation, thereby offering the reader a comprehensive review o

  20. Microneedle Platforms for Cell Analysis

    KAUST Repository

    Kavaldzhiev, Mincho

    2017-01-01

    to the development of micro-needle platforms that offer customized fabrication and new capabilities for enhanced cell analyses. The highest degree of geometrical flexibility is achieved with 3D printed micro-needles, which enable optimizing the topographical stress

  1. MetaABC--an integrated metagenomics platform for data adjustment, binning and clustering.

    Science.gov (United States)

    Su, Chien-Hao; Hsu, Ming-Tsung; Wang, Tse-Yi; Chiang, Sufeng; Cheng, Jen-Hao; Weng, Francis C; Kao, Cheng-Yan; Wang, Daryi; Tsai, Huai-Kuang

    2011-08-15

    MetaABC is a metagenomic platform that integrates several binning tools coupled with methods for removing artifacts, analyzing unassigned reads and controlling sampling biases. It allows users to arrive at a better interpretation via series of distinct combinations of analysis tools. After execution, MetaABC provides outputs in various visual formats such as tables, pie and bar charts as well as clustering result diagrams. MetaABC source code and documentation are available at http://bits2.iis.sinica.edu.tw/MetaABC/ CONTACT: dywang@gate.sinica.edu.tw; hktsai@iis.sinica.edu.tw Supplementary data are available at Bioinformatics online.

  2. Cluster analysis of track structure

    International Nuclear Information System (INIS)

    Michalik, V.

    1991-01-01

    One of the possibilities of classifying track structures is application of conventional partition techniques of analysis of multidimensional data to the track structure. Using these cluster algorithms this paper attempts to find characteristics of radiation reflecting the spatial distribution of ionizations in the primary particle track. An absolute frequency distribution of clusters of ionizations giving the mean number of clusters produced by radiation per unit of deposited energy can serve as this characteristic. General computation techniques used as well as methods of calculations of distributions of clusters for different radiations are discussed. 8 refs.; 5 figs

  3. Microneedle Platforms for Cell Analysis

    KAUST Repository

    Kavaldzhiev, Mincho

    2017-11-01

    Micro-needle platforms are the core components of many recent drug delivery and gene-editing techniques, which allow for intracellular access, controlled cell membrane stress or mechanical trapping of the nucleus. This dissertation work is devoted to the development of micro-needle platforms that offer customized fabrication and new capabilities for enhanced cell analyses. The highest degree of geometrical flexibility is achieved with 3D printed micro-needles, which enable optimizing the topographical stress environment for cells and cell populations of any size. A fabrication process for 3D-printed micro-needles has been developed as well as a metal coating technique based on standard sputter deposition. This extends the functionalities of the platforms by electrical as well as magnetic features. The micro-needles have been tested on human colon cancer cells (HCT116), showing a high degree of biocompatibility of the platform. Moreover, the capabilities of the 3D-printed micro-needles have been explored for drug delivery via the well-established electroporation technique, by coating the micro-needles with gold. Antibodies and fluorescent dyes have been delivered to HCT116 cells and human embryonic kidney cells with a very high transfection rate up to 90%. In addition, the 3D-printed electroporation platform enables delivery of molecules to suspended cells or adherent cells, with or without electroporation buffer solution, and at ultra-low voltages of 2V. In order to provide a micro-needle platform that exploits existing methods for mass fabrication a custom designed template-based process has been developed. It has been used for the production of gold, iron, nickel and poly-pyrrole micro-needles on silicon and glass substrates. A novel delivery method is introduced that activates the micro-needles by electromagnetic induction, which enables to wirelessly gain intracellular access. The method has been successfully tested on HCT116 cells in culture, where a time

  4. 2011 Biomass Program Platform Peer Review: Analysis

    Energy Technology Data Exchange (ETDEWEB)

    Haq, Zia [Office of Energy Efficiency and Renewable Energy (EERE), Washington, DC (United States)

    2012-02-01

    This document summarizes the recommendations and evaluations provided by an independent external panel of experts at the 2011 U.S. Department of Energy Biomass Program’s Analysis Platform Review meeting.

  5. Parallel single-cell analysis microfluidic platform

    NARCIS (Netherlands)

    van den Brink, Floris Teunis Gerardus; Gool, Elmar; Frimat, Jean-Philippe; Bomer, Johan G.; van den Berg, Albert; le Gac, Severine

    2011-01-01

    We report a PDMS microfluidic platform for parallel single-cell analysis (PaSCAl) as a powerful tool to decipher the heterogeneity found in cell populations. Cells are trapped individually in dedicated pockets, and thereafter, a number of invasive or non-invasive analysis schemes are performed.

  6. Cysteine as a ligand platform in the biosynthesis of the FeFe hydrogenase H cluster.

    Science.gov (United States)

    Suess, Daniel L M; Bürstel, Ingmar; De La Paz, Liliana; Kuchenreuther, Jon M; Pham, Cindy C; Cramer, Stephen P; Swartz, James R; Britt, R David

    2015-09-15

    Hydrogenases catalyze the redox interconversion of protons and H2, an important reaction for a number of metabolic processes and for solar fuel production. In FeFe hydrogenases, catalysis occurs at the H cluster, a metallocofactor comprising a [4Fe-4S]H subcluster coupled to a [2Fe]H subcluster bound by CO, CN(-), and azadithiolate ligands. The [2Fe]H subcluster is assembled by the maturases HydE, HydF, and HydG. HydG is a member of the radical S-adenosyl-L-methionine family of enzymes that transforms Fe and L-tyrosine into an [Fe(CO)2(CN)] synthon that is incorporated into the H cluster. Although it is thought that the site of synthon formation in HydG is the "dangler" Fe of a [5Fe] cluster, many mechanistic aspects of this chemistry remain unresolved including the full ligand set of the synthon, how the dangler Fe initially binds to HydG, and how the synthon is released at the end of the reaction. To address these questions, we herein show that L-cysteine (Cys) binds the auxiliary [4Fe-4S] cluster of HydG and further chelates the dangler Fe. We also demonstrate that a [4Fe-4S]aux[CN] species is generated during HydG catalysis, a process that entails the loss of Cys and the [Fe(CO)2(CN)] fragment; on this basis, we suggest that Cys likely completes the coordination sphere of the synthon. Thus, through spectroscopic analysis of HydG before and after the synthon is formed, we conclude that Cys serves as the ligand platform on which the synthon is built and plays a role in both Fe(2+) binding and synthon release.

  7. Embedded-Based Graphics Processing Unit Cluster Platform for Multiple Sequence Alignments

    Directory of Open Access Journals (Sweden)

    Jyh-Da Wei

    2017-08-01

    Full Text Available High-end graphics processing units (GPUs, such as NVIDIA Tesla/Fermi/Kepler series cards with thousands of cores per chip, are widely applied to high-performance computing fields in a decade. These desktop GPU cards should be installed in personal computers/servers with desktop CPUs, and the cost and power consumption of constructing a GPU cluster platform are very high. In recent years, NVIDIA releases an embedded board, called Jetson Tegra K1 (TK1, which contains 4 ARM Cortex-A15 CPUs and 192 Compute Unified Device Architecture cores (belong to Kepler GPUs. Jetson Tegra K1 has several advantages, such as the low cost, low power consumption, and high applicability, and it has been applied into several specific applications. In our previous work, a bioinformatics platform with a single TK1 (STK platform was constructed, and this previous work is also used to prove that the Web and mobile services can be implemented in the STK platform with a good cost-performance ratio by comparing a STK platform with the desktop CPU and GPU. In this work, an embedded-based GPU cluster platform will be constructed with multiple TK1s (MTK platform. Complex system installation and setup are necessary procedures at first. Then, 2 job assignment modes are designed for the MTK platform to provide services for users. Finally, ClustalW v2.0.11 and ClustalWtk will be ported to the MTK platform. The experimental results showed that the speedup ratios achieved 5.5 and 4.8 times for ClustalW v2.0.11 and ClustalWtk, respectively, by comparing 6 TK1s with a single TK1. The MTK platform is proven to be useful for multiple sequence alignments.

  8. Embedded-Based Graphics Processing Unit Cluster Platform for Multiple Sequence Alignments.

    Science.gov (United States)

    Wei, Jyh-Da; Cheng, Hui-Jun; Lin, Chun-Yuan; Ye, Jin; Yeh, Kuan-Yu

    2017-01-01

    High-end graphics processing units (GPUs), such as NVIDIA Tesla/Fermi/Kepler series cards with thousands of cores per chip, are widely applied to high-performance computing fields in a decade. These desktop GPU cards should be installed in personal computers/servers with desktop CPUs, and the cost and power consumption of constructing a GPU cluster platform are very high. In recent years, NVIDIA releases an embedded board, called Jetson Tegra K1 (TK1), which contains 4 ARM Cortex-A15 CPUs and 192 Compute Unified Device Architecture cores (belong to Kepler GPUs). Jetson Tegra K1 has several advantages, such as the low cost, low power consumption, and high applicability, and it has been applied into several specific applications. In our previous work, a bioinformatics platform with a single TK1 (STK platform) was constructed, and this previous work is also used to prove that the Web and mobile services can be implemented in the STK platform with a good cost-performance ratio by comparing a STK platform with the desktop CPU and GPU. In this work, an embedded-based GPU cluster platform will be constructed with multiple TK1s (MTK platform). Complex system installation and setup are necessary procedures at first. Then, 2 job assignment modes are designed for the MTK platform to provide services for users. Finally, ClustalW v2.0.11 and ClustalWtk will be ported to the MTK platform. The experimental results showed that the speedup ratios achieved 5.5 and 4.8 times for ClustalW v2.0.11 and ClustalWtk, respectively, by comparing 6 TK1s with a single TK1. The MTK platform is proven to be useful for multiple sequence alignments.

  9. ClusterCAD: a computational platform for type I modular polyketide synthase design

    DEFF Research Database (Denmark)

    Eng, Clara H.; Backman, Tyler W. H.; Bailey, Constance B.

    2018-01-01

    barrier to the design of active variants, and identifying strategies to reliably construct functional PKS chimeras remains an active area of research. In this work, we formalize a paradigm for the design of PKS chimeras and introduce ClusterCAD as a computational platform to streamline and simplify...

  10. Cluster analysis for portfolio optimization

    OpenAIRE

    Vincenzo Tola; Fabrizio Lillo; Mauro Gallegati; Rosario N. Mantegna

    2005-01-01

    We consider the problem of the statistical uncertainty of the correlation matrix in the optimization of a financial portfolio. We show that the use of clustering algorithms can improve the reliability of the portfolio in terms of the ratio between predicted and realized risk. Bootstrap analysis indicates that this improvement is obtained in a wide range of the parameters N (number of assets) and T (investment horizon). The predicted and realized risk level and the relative portfolio compositi...

  11. MycoCAP - Mycobacterium Comparative Analysis Platform.

    Science.gov (United States)

    Choo, Siew Woh; Ang, Mia Yang; Dutta, Avirup; Tan, Shi Yang; Siow, Cheuk Chuen; Heydari, Hamed; Mutha, Naresh V R; Wee, Wei Yee; Wong, Guat Jah

    2015-12-15

    Mycobacterium spp. are renowned for being the causative agent of diseases like leprosy, Buruli ulcer and tuberculosis in human beings. With more and more mycobacterial genomes being sequenced, any knowledge generated from comparative genomic analysis would provide better insights into the biology, evolution, phylogeny and pathogenicity of this genus, thus helping in better management of diseases caused by Mycobacterium spp.With this motivation, we constructed MycoCAP, a new comparative analysis platform dedicated to the important genus Mycobacterium. This platform currently provides information of 2108 genome sequences of at least 55 Mycobacterium spp. A number of intuitive web-based tools have been integrated in MycoCAP particularly for comparative analysis including the PGC tool for comparison between two genomes, PathoProT for comparing the virulence genes among the Mycobacterium strains and the SuperClassification tool for the phylogenic classification of the Mycobacterium strains and a specialized classification system for strains of Mycobacterium abscessus. We hope the broad range of functions and easy-to-use tools provided in MycoCAP makes it an invaluable analysis platform to speed up the research discovery on mycobacteria for researchers. Database URL: http://mycobacterium.um.edu.my.

  12. Novel Biochip Platform for Nucleic Acid Analysis

    Directory of Open Access Journals (Sweden)

    Juan J. Diaz-Mochon

    2012-06-01

    Full Text Available This manuscript describes the use of a novel biochip platform for the rapid analysis/identification of nucleic acids, including DNA and microRNAs, with very high specificity. This approach combines a unique dynamic chemistry approach for nucleic acid testing and analysis developed by DestiNA Genomics with the STMicroelectronics In-Check platform, which comprises two microfluidic optimized and independent PCR reaction chambers, and a sequential microarray area for nucleic acid capture and identification by fluorescence. With its compact bench-top “footprint” requiring only a single technician to operate, the biochip system promises to transform and expand routine clinical diagnostic testing and screening for genetic diseases, cancers, drug toxicology and heart disease, as well as employment in the emerging companion diagnostics market.

  13. Cost (non)-recovery by platform technology facilities in the Bio21 Cluster.

    Science.gov (United States)

    Gibbs, Gerard; Clark, Stella; Quinn, Julieanne; Gleeson, Mary Joy

    2010-04-01

    Platform technologies (PT) are techniques or tools that enable a range of scientific investigations and are critical to today's advanced technology research environment. Once installed, they require specialized staff for their operations, who in turn, provide expertise to researchers in designing appropriate experiments. Through this pipeline, research outputs are raised to the benefit of the researcher and the host institution. Platform facilities provide access to instrumentation and expertise for a wide range of users beyond the host institution, including other academic and industry users. To maximize the return on these substantial public investments, this wider access needs to be supported. The question of support and the mechanisms through which this occurs need to be established based on a greater understanding of how PT facilities operate. This investigation was aimed at understanding if and how platform facilities across the Bio21 Cluster meet operating costs. Our investigation found: 74% of platforms surveyed do not recover 100% of direct operating costs and are heavily subsidized by their home institution, which has a vested interest in maintaining the technology platform; platform managers play a major role in establishing the costs and pricing of the facility, normally in a collaborative process with a management committee or institutional accountant; and most facilities have a three-tier pricing structure recognizing internal academic, external academic, and commercial clients.

  14. Arc4nix: A cross-platform geospatial analytical library for cluster and cloud computing

    Science.gov (United States)

    Tang, Jingyin; Matyas, Corene J.

    2018-02-01

    Big Data in geospatial technology is a grand challenge for processing capacity. The ability to use a GIS for geospatial analysis on Cloud Computing and High Performance Computing (HPC) clusters has emerged as a new approach to provide feasible solutions. However, users lack the ability to migrate existing research tools to a Cloud Computing or HPC-based environment because of the incompatibility of the market-dominating ArcGIS software stack and Linux operating system. This manuscript details a cross-platform geospatial library "arc4nix" to bridge this gap. Arc4nix provides an application programming interface compatible with ArcGIS and its Python library "arcpy". Arc4nix uses a decoupled client-server architecture that permits geospatial analytical functions to run on the remote server and other functions to run on the native Python environment. It uses functional programming and meta-programming language to dynamically construct Python codes containing actual geospatial calculations, send them to a server and retrieve results. Arc4nix allows users to employ their arcpy-based script in a Cloud Computing and HPC environment with minimal or no modification. It also supports parallelizing tasks using multiple CPU cores and nodes for large-scale analyses. A case study of geospatial processing of a numerical weather model's output shows that arcpy scales linearly in a distributed environment. Arc4nix is open-source software.

  15. DATA ANALYSIS BY SQL-MAPREDUCE PLATFORM

    Directory of Open Access Journals (Sweden)

    A. A. A. Dergachev

    2014-01-01

    Full Text Available The paper deals with the problems related to the usage of relational database management system (RDBMS, mainly in the analysis of large data content, including data analysis based on web services in the Internet. A solution of these problems can be represented as a web-oriented distributed system of the data analysis with the processor of service requests as an executive kernel. The functions of such system are similar to the functions of relational DBMS, only with the usage of web services. The processor of service requests is responsible for planning of data analysis web services calls and their execution. The efficiency of such web-oriented system depends on the efficiency of web services calls plan and their program implementation where the basic element is the facilities of analyzed data storage – relational DBMS. The main attention is given to extension of functionality of relational DBMS for the analysis of large data content, in particular, the perspective estimation of web services data analysis implementation on the basis of SQL/MapReduce platform. With a view of obtaining this result, analytical task was chosen as an application-oriented part, typical for data analysis in various social networks and web portals, based on data analysis of users’ attendance. In the practical part of this research the algorithm for planning of web services calls was implemented for application-oriented task solution. SQL/MapReduce platform efficiency is confirmed by experimental results that show the opportunity of effective application for data analysis web services.

  16. VR-Cluster: Dynamic Migration for Resource Fragmentation Problem in Virtual Router Platform

    Directory of Open Access Journals (Sweden)

    Xianming Gao

    2016-01-01

    Full Text Available Network virtualization technology is regarded as one of gradual schemes to network architecture evolution. With the development of network functions virtualization, operators make lots of effort to achieve router virtualization by using general servers. In order to ensure high performance, virtual router platform usually adopts a cluster of general servers, which can be also regarded as a special cloud computing environment. However, due to frequent creation and deletion of router instances, it may generate lots of resource fragmentation to prevent platform from establishing new router instances. In order to solve “resource fragmentation problem,” we firstly propose VR-Cluster, which introduces two extra function planes including switching plane and resource management plane. Switching plane is mainly used to support seamless migration of router instances without packet loss; resource management plane can dynamically move router instances from one server to another server by using VR-mapping algorithms. Besides, three VR-mapping algorithms including first-fit mapping algorithm, best-fit mapping algorithm, and worst-fit mapping algorithm are proposed based on VR-Cluster. At last, we establish VR-Cluster protosystem by using general X86 servers, evaluate its migration time, and further analyze advantages and disadvantages of our proposed VR-mapping algorithms to solve resource fragmentation problem.

  17. Cluster analysis in phenotyping a Portuguese population.

    Science.gov (United States)

    Loureiro, C C; Sa-Couto, P; Todo-Bom, A; Bousquet, J

    2015-09-03

    Unbiased cluster analysis using clinical parameters has identified asthma phenotypes. Adding inflammatory biomarkers to this analysis provided a better insight into the disease mechanisms. This approach has not yet been applied to asthmatic Portuguese patients. To identify phenotypes of asthma using cluster analysis in a Portuguese asthmatic population treated in secondary medical care. Consecutive patients with asthma were recruited from the outpatient clinic. Patients were optimally treated according to GINA guidelines and enrolled in the study. Procedures were performed according to a standard evaluation of asthma. Phenotypes were identified by cluster analysis using Ward's clustering method. Of the 72 patients enrolled, 57 had full data and were included for cluster analysis. Distribution was set in 5 clusters described as follows: cluster (C) 1, early onset mild allergic asthma; C2, moderate allergic asthma, with long evolution, female prevalence and mixed inflammation; C3, allergic brittle asthma in young females with early disease onset and no evidence of inflammation; C4, severe asthma in obese females with late disease onset, highly symptomatic despite low Th2 inflammation; C5, severe asthma with chronic airflow obstruction, late disease onset and eosinophilic inflammation. In our study population, the identified clusters were mainly coincident with other larger-scale cluster analysis. Variables such as age at disease onset, obesity, lung function, FeNO (Th2 biomarker) and disease severity were important for cluster distinction. Copyright © 2015. Published by Elsevier España, S.L.U.

  18. Scientific data analysis on data-parallel platforms.

    Energy Technology Data Exchange (ETDEWEB)

    Ulmer, Craig D.; Bayer, Gregory W.; Choe, Yung Ryn; Roe, Diana C.

    2010-09-01

    As scientific computing users migrate to petaflop platforms that promise to generate multi-terabyte datasets, there is a growing need in the community to be able to embed sophisticated analysis algorithms in the computing platforms' storage systems. Data Warehouse Appliances (DWAs) are attractive for this work, due to their ability to store and process massive datasets efficiently. While DWAs have been utilized effectively in data-mining and informatics applications, they remain largely unproven in scientific workloads. In this paper we present our experiences in adapting two mesh analysis algorithms to function on five different DWA architectures: two Netezza database appliances, an XtremeData dbX database, a LexisNexis DAS, and multiple Hadoop MapReduce clusters. The main contribution of this work is insight into the differences between these DWAs from a user's perspective. In addition, we present performance measurements for ten DWA systems to help understand the impact of different architectural trade-offs in these systems.

  19. 3D Viewer Platform of Cloud Clustering Management System: Google Map 3D

    Science.gov (United States)

    Choi, Sung-Ja; Lee, Gang-Soo

    The new management system of framework for cloud envrionemnt is needed by the platfrom of convergence according to computing environments of changes. A ISV and small business model is hard to adapt management system of platform which is offered from super business. This article suggest the clustering management system of cloud computing envirionments for ISV and a man of enterprise in small business model. It applies the 3D viewer adapt from map3D & earth of google. It is called 3DV_CCMS as expand the CCMS[1].

  20. Hierarchical Aligned Cluster Analysis for Temporal Clustering of Human Motion.

    Science.gov (United States)

    Zhou, Feng; De la Torre, Fernando; Hodgins, Jessica K

    2013-03-01

    Temporal segmentation of human motion into plausible motion primitives is central to understanding and building computational models of human motion. Several issues contribute to the challenge of discovering motion primitives: the exponential nature of all possible movement combinations, the variability in the temporal scale of human actions, and the complexity of representing articulated motion. We pose the problem of learning motion primitives as one of temporal clustering, and derive an unsupervised hierarchical bottom-up framework called hierarchical aligned cluster analysis (HACA). HACA finds a partition of a given multidimensional time series into m disjoint segments such that each segment belongs to one of k clusters. HACA combines kernel k-means with the generalized dynamic time alignment kernel to cluster time series data. Moreover, it provides a natural framework to find a low-dimensional embedding for time series. HACA is efficiently optimized with a coordinate descent strategy and dynamic programming. Experimental results on motion capture and video data demonstrate the effectiveness of HACA for segmenting complex motions and as a visualization tool. We also compare the performance of HACA to state-of-the-art algorithms for temporal clustering on data of a honey bee dance. The HACA code is available online.

  1. Comprehensive studies of hydrogeochemical processes and quality status of groundwater with tools of cluster, grouping analysis, and fuzzy set method using GIS platform: a case study of Dalcheon in Ulsan City, Korea.

    Science.gov (United States)

    Venkatramanan, S; Chung, S Y; Rajesh, R; Lee, S Y; Ramkumar, T; Prasanna, M V

    2015-08-01

    This research aimed at developing comprehensive assessments of physicochemical quality of groundwater for drinking and irrigation purposes at Dalcheon in Ulsan City, Korea. The mean concentration of major ions represented as follows: Ca (94.3 mg/L) > Mg (41.7 mg/L) > Na (19.2 mg/L) > K (3.2 mg/L) for cations and SO4 (351 mg/L) > HCO3 (169 mg/L) > Cl (19 mg/L) for anions. Thematic maps for physicochemical parameters of groundwater were prepared, classified, weighted, and integrated in GIS method with fuzzy logic. The maps exhibited that suitable zone of drinking and irrigation purpose occupied in SE, NE, and NW sectors. The undesirable zone of drinking purpose was observed in SW and central parts and that of irrigation was in the western part of the study area. This was influenced by improperly treated effluents from an abandoned iron ore mine, irrigation, and domestic fields. By grouping analysis, groundwater types were classified into Ca(HCO3)2, (Ca,Mg)Cl2, and CaCl2, and CaHCO3 was the most predominant type. Grouping analysis also showed three types of irrigation water such as C1S1, C1S2, and C1S3. C1S3 type of high salinity to low sodium hazard was the most dominant in the study area. Equilibrium processes elucidated the groundwater samples were in the saturated to undersaturated condition with respect to aragonite, calcite, dolomite, and gypsum due to precipitation and deposition processes. Cluster analysis suggested that high contents of SO4 and HCO3 with low Cl was related with water-rock interactions and along with mining impact. This study showed that the effluents discharged from mining waste was the main sources of groundwater quality deterioration.

  2. Robust cluster analysis and variable selection

    CERN Document Server

    Ritter, Gunter

    2014-01-01

    Clustering remains a vibrant area of research in statistics. Although there are many books on this topic, there are relatively few that are well founded in the theoretical aspects. In Robust Cluster Analysis and Variable Selection, Gunter Ritter presents an overview of the theory and applications of probabilistic clustering and variable selection, synthesizing the key research results of the last 50 years. The author focuses on the robust clustering methods he found to be the most useful on simulated data and real-time applications. The book provides clear guidance for the varying needs of bot

  3. Exact WKB analysis and cluster algebras

    International Nuclear Information System (INIS)

    Iwaki, Kohei; Nakanishi, Tomoki

    2014-01-01

    We develop the mutation theory in the exact WKB analysis using the framework of cluster algebras. Under a continuous deformation of the potential of the Schrödinger equation on a compact Riemann surface, the Stokes graph may change the topology. We call this phenomenon the mutation of Stokes graphs. Along the mutation of Stokes graphs, the Voros symbols, which are monodromy data of the equation, also mutate due to the Stokes phenomenon. We show that the Voros symbols mutate as variables of a cluster algebra with surface realization. As an application, we obtain the identities of Stokes automorphisms associated with periods of cluster algebras. The paper also includes an extensive introduction of the exact WKB analysis and the surface realization of cluster algebras for nonexperts. This article is part of a special issue of Journal of Physics A: Mathematical and Theoretical devoted to ‘Cluster algebras in mathematical physics’. (paper)

  4. Analysis of offshore platforms lifting with fixed pile structure type (fixed platform) based on ASD89

    Science.gov (United States)

    Sugianto, Agus; Indriani, Andi Marini

    2017-11-01

    Platform construction GTS (Gathering Testing Sattelite) is offshore construction platform with fix pile structure type/fixed platform functioning to support the mining of petroleum exploitation. After construction fabrication process platform was moved to barges, then shipped to the installation site. Moving process is generally done by pull or push based on construction design determined when planning. But at the time of lifting equipment/cranes available in the work area then the moving process can be done by lifting so that moving activity can be implemented more quickly of work. This analysis moving process of GTS platform in a different way that is generally done to GTS platform types by lifting using problem is construction reinforcement required, so the construction can be moved by lifting with analyzing and checking structure working stress that occurs due to construction moving process by lifting AISC code standard and analysis using the SAP2000 structure analysis program. The analysis result showed that existing condition cannot be moved by lifting because stress ratio is above maximum allowable value that is 0.950 (AISC-ASD89). Overstress occurs on the member 295 and 324 with stress ratio value 0.97 and 0.95 so that it is required structural reinforcement. Box plate aplication at both members so that it produces stress ratio values 0.78 at the member 295 and stress ratio of 0.77 at the member 324. These results indicate that the construction have qualified structural reinforcement for being moved by lifting.

  5. Architectural analysis for wirelessly powered computing platforms

    NARCIS (Netherlands)

    Kapoor, A.; Pineda de Gyvez, J.

    2013-01-01

    We present a design framework for wirelessly powered generic computing platforms that takes into account various system parameters in response to a time-varying energy source. These parameters are the charging profile of the energy source, computing speed (fclk), digital supply voltage (VDD), energy

  6. An integrated platform for biomolecule interaction analysis

    Science.gov (United States)

    Jan, Chia-Ming; Tsai, Pei-I.; Chou, Shin-Ting; Lee, Shu-Sheng; Lee, Chih-Kung

    2013-02-01

    We developed a new metrology platform which can detect real-time changes in both a phase-interrogation mode and intensity mode of a SPR (surface plasmon resonance). We integrated a SPR and ellipsometer to a biosensor chip platform to create a new biomolecular interaction measurement mechanism. We adopted a conductive ITO (indium-tinoxide) film to the bio-sensor platform chip to expand the dynamic range and improve measurement accuracy. The thickness of the conductive film and the suitable voltage constants were found to enhance performance. A circularly polarized ellipsometry configuration was incorporated into the newly developed platform to measure the label-free interactions of recombinant human C-reactive protein (CRP) with immobilized biomolecule target monoclonal human CRP antibody at various concentrations. CRP was chosen as it is a cardiovascular risk biomarker and is an acute phase reactant as well as a specific prognostic indicator for inflammation. We found that the sensitivity of a phaseinterrogation SPR is predominantly dependent on the optimization of the sample incidence angle. The effect of the ITO layer effective index under DC and AC effects as well as an optimal modulation were experimentally performed and discussed. Our experimental results showed that the modulated dynamic range for phase detection was 10E-2 RIU based on a current effect and 10E-4 RIU based on a potential effect of which a 0.55 (°/RIU) measurement was found by angular-interrogation. The performance of our newly developed metrology platform was characterized to have a higher sensitivity and less dynamic range when compared to a traditional full-field measurement system.

  7. Study on integrated design and analysis platform of NPP

    International Nuclear Information System (INIS)

    Lu Dongsen; Gao Zuying; Zhou Zhiwei

    2001-01-01

    Many calculation software have been developed to nuclear system's design and safety analysis, such as structure design software, fuel design and manage software, thermal hydraulic analysis software, severe accident simulation software, etc. This study integrates those software to a platform, develops visual modeling tool for Retran, NGFM90. And in this platform, a distribution calculation method is also provided for couple calculation between different software. The study will improve the design and analysis of NPP

  8. Cluster analysis of obesity and asthma phenotypes.

    Directory of Open Access Journals (Sweden)

    E Rand Sutherland

    Full Text Available Asthma is a heterogeneous disease with variability among patients in characteristics such as lung function, symptoms and control, body weight, markers of inflammation, and responsiveness to glucocorticoids (GC. Cluster analysis of well-characterized cohorts can advance understanding of disease subgroups in asthma and point to unsuspected disease mechanisms. We utilized an hypothesis-free cluster analytical approach to define the contribution of obesity and related variables to asthma phenotype.In a cohort of clinical trial participants (n = 250, minimum-variance hierarchical clustering was used to identify clinical and inflammatory biomarkers important in determining disease cluster membership in mild and moderate persistent asthmatics. In a subset of participants, GC sensitivity was assessed via expression of GC receptor alpha (GCRα and induction of MAP kinase phosphatase-1 (MKP-1 expression by dexamethasone. Four asthma clusters were identified, with body mass index (BMI, kg/m(2 and severity of asthma symptoms (AEQ score the most significant determinants of cluster membership (F = 57.1, p<0.0001 and F = 44.8, p<0.0001, respectively. Two clusters were composed of predominantly obese individuals; these two obese asthma clusters differed from one another with regard to age of asthma onset, measures of asthma symptoms (AEQ and control (ACQ, exhaled nitric oxide concentration (F(ENO and airway hyperresponsiveness (methacholine PC(20 but were similar with regard to measures of lung function (FEV(1 (% and FEV(1/FVC, airway eosinophilia, IgE, leptin, adiponectin and C-reactive protein (hsCRP. Members of obese clusters demonstrated evidence of reduced expression of GCRα, a finding which was correlated with a reduced induction of MKP-1 expression by dexamethasoneObesity is an important determinant of asthma phenotype in adults. There is heterogeneity in expression of clinical and inflammatory biomarkers of asthma across obese individuals

  9. Are clusters of dietary patterns and cluster membership stable over time? Results of a longitudinal cluster analysis study.

    Science.gov (United States)

    Walthouwer, Michel Jean Louis; Oenema, Anke; Soetens, Katja; Lechner, Lilian; de Vries, Hein

    2014-11-01

    Developing nutrition education interventions based on clusters of dietary patterns can only be done adequately when it is clear if distinctive clusters of dietary patterns can be derived and reproduced over time, if cluster membership is stable, and if it is predictable which type of people belong to a certain cluster. Hence, this study aimed to: (1) identify clusters of dietary patterns among Dutch adults, (2) test the reproducibility of these clusters and stability of cluster membership over time, and (3) identify sociodemographic predictors of cluster membership and cluster transition. This study had a longitudinal design with online measurements at baseline (N=483) and 6 months follow-up (N=379). Dietary intake was assessed with a validated food frequency questionnaire. A hierarchical cluster analysis was performed, followed by a K-means cluster analysis. Multinomial logistic regression analyses were conducted to identify the sociodemographic predictors of cluster membership and cluster transition. At baseline and follow-up, a comparable three-cluster solution was derived, distinguishing a healthy, moderately healthy, and unhealthy dietary pattern. Male and lower educated participants were significantly more likely to have a less healthy dietary pattern. Further, 251 (66.2%) participants remained in the same cluster, 45 (11.9%) participants changed to an unhealthier cluster, and 83 (21.9%) participants shifted to a healthier cluster. Men and people living alone were significantly more likely to shift toward a less healthy dietary pattern. Distinctive clusters of dietary patterns can be derived. Yet, cluster membership is unstable and only few sociodemographic factors were associated with cluster membership and cluster transition. These findings imply that clusters based on dietary intake may not be suitable as a basis for nutrition education interventions. Copyright © 2014 Elsevier Ltd. All rights reserved.

  10. Mining the archives: a cross-platform analysis of gene ...

    Science.gov (United States)

    Formalin-fixed paraffin-embedded (FFPE) tissue samples represent a potentially invaluable resource for genomic research into the molecular basis of disease. However, use of FFPE samples in gene expression studies has been limited by technical challenges resulting from degradation of nucleic acids. Here we evaluated gene expression profiles derived from fresh-frozen (FRO) and FFPE mouse liver tissues using two DNA microarray protocols and two whole transcriptome sequencing (RNA-seq) library preparation methodologies. The ribo-depletion protocol outperformed the other three methods by having the highest correlations of differentially expressed genes (DEGs) and best overlap of pathways between FRO and FFPE groups. We next tested the effect of sample time in formalin (18 hours or 3 weeks) on gene expression profiles. Hierarchical clustering of the datasets indicated that test article treatment, and not preservation method, was the main driver of gene expression profiles. Meta- and pathway analyses indicated that biological responses were generally consistent for 18-hour and 3-week FFPE samples compared to FRO samples. However, clear erosion of signal intensity with time in formalin was evident, and DEG numbers differed by platform and preservation method. Lastly, we investigated the effect of age in FFPE block on genomic profiles. RNA-seq analysis of 8-, 19-, and 26-year-old control blocks using the ribo-depletion protocol resulted in comparable quality metrics, inc

  11. Platforms.

    Science.gov (United States)

    Josko, Deborah

    2014-01-01

    The advent of DNA sequencing technologies and the various applications that can be performed will have a dramatic effect on medicine and healthcare in the near future. There are several DNA sequencing platforms available on the market for research and clinical use. Based on the medical laboratory scientist or researcher's needs and taking into consideration laboratory space and budget, one can chose which platform will be beneficial to their institution and their patient population. Although some of the instrument costs seem high, diagnosing a patient quickly and accurately will save hospitals money with fewer hospital stays and targeted treatment based on an individual's genetic make-up. By determining the type of disease an individual has, based on the mutations present or having the ability to prescribe the appropriate antimicrobials based on the knowledge of the organism's resistance patterns, the clinician will be better able to treat and diagnose a patient which ultimately will improve patient outcomes and prognosis.

  12. Modal shapes optimization and feasibility analysis of NFAL platform

    Directory of Open Access Journals (Sweden)

    Bin WEI

    2017-08-01

    Full Text Available In order to avoid friction and scratching between the conveyor and the precision components when conveying object, an compact non-contact acoustic levitation prototype is designed, and the feasibility is theoretically and experimentally verified. The symmetry model is established through kinetic analysis with ANSYS. The modal and the coupled field computation at the central point of the transfer platform are simulated. The simulation results show that pure flexural or mixed flexural wave shapes appear with different wave numbers on the platform. Sweep frequency test is conducted on the compact platform prototype. The levitation experimental results confirm the feasibility of the ultrasound transfer process, the levitation frequency range and the mode of vibration. The theoretical and experimental results show that the optimal design of the modal and the carrying capacity of the driving platform is necessary according to different conditions. The research results provide a reference for the design of the mode and bandwidth of the ultrasonic levitation platform.

  13. Factor Analysis for Clustered Observations.

    Science.gov (United States)

    Longford, N. T.; Muthen, B. O.

    1992-01-01

    A two-level model for factor analysis is defined, and formulas for a scoring algorithm for this model are derived. A simple noniterative method based on decomposition of total sums of the squares and cross-products is discussed and illustrated with simulated data and data from the Second International Mathematics Study. (SLD)

  14. Analysis of scalability of high-performance 3D image processing platform for virtual colonoscopy.

    Science.gov (United States)

    Yoshida, Hiroyuki; Wu, Yin; Cai, Wenli

    2014-03-19

    One of the key challenges in three-dimensional (3D) medical imaging is to enable the fast turn-around time, which is often required for interactive or real-time response. This inevitably requires not only high computational power but also high memory bandwidth due to the massive amount of data that need to be processed. For this purpose, we previously developed a software platform for high-performance 3D medical image processing, called HPC 3D-MIP platform, which employs increasingly available and affordable commodity computing systems such as the multicore, cluster, and cloud computing systems. To achieve scalable high-performance computing, the platform employed size-adaptive, distributable block volumes as a core data structure for efficient parallelization of a wide range of 3D-MIP algorithms, supported task scheduling for efficient load distribution and balancing, and consisted of a layered parallel software libraries that allow image processing applications to share the common functionalities. We evaluated the performance of the HPC 3D-MIP platform by applying it to computationally intensive processes in virtual colonoscopy. Experimental results showed a 12-fold performance improvement on a workstation with 12-core CPUs over the original sequential implementation of the processes, indicating the efficiency of the platform. Analysis of performance scalability based on the Amdahl's law for symmetric multicore chips showed the potential of a high performance scalability of the HPC 3D-MIP platform when a larger number of cores is available.

  15. Cluster analysis for determining distribution center location

    Science.gov (United States)

    Lestari Widaningrum, Dyah; Andika, Aditya; Murphiyanto, Richard Dimas Julian

    2017-12-01

    Determination of distribution facilities is highly important to survive in the high level of competition in today’s business world. Companies can operate multiple distribution centers to mitigate supply chain risk. Thus, new problems arise, namely how many and where the facilities should be provided. This study examines a fast-food restaurant brand, which located in the Greater Jakarta. This brand is included in the category of top 5 fast food restaurant chain based on retail sales. There were three stages in this study, compiling spatial data, cluster analysis, and network analysis. Cluster analysis results are used to consider the location of the additional distribution center. Network analysis results show a more efficient process referring to a shorter distance to the distribution process.

  16. Trading strategies in the overnight money market: Correlations and clustering on the e-MID trading platform

    Science.gov (United States)

    Fricke, Daniel

    2012-12-01

    We analyze the correlations in patterns of trading for members of the Italian interbank trading platform e-MID. The trading strategy of a particular member institution is defined as the sequence of (intra-) daily net trading volumes within a certain semester. Based on this definition, we show that there are significant and persistent bilateral correlations between institutions’ trading strategies. In most semesters we find two clusters, with positively (negatively) correlated trading strategies within (between) clusters. We show that the two clusters mostly contain continuous net buyers and net sellers of money, respectively, and that cluster memberships of individual banks are highly persistent. Additionally, we highlight some problems related to our definition of trading strategies. Our findings add further evidence on the fact that preferential lending relationships on the micro-level lead to community structure on the macro-level.

  17. MASPECTRAS: a platform for management and analysis of proteomics LC-MS/MS data

    Directory of Open Access Journals (Sweden)

    Rader Robert

    2007-06-01

    Full Text Available Abstract Background The advancements of proteomics technologies have led to a rapid increase in the number, size and rate at which datasets are generated. Managing and extracting valuable information from such datasets requires the use of data management platforms and computational approaches. Results We have developed the MAss SPECTRometry Analysis System (MASPECTRAS, a platform for management and analysis of proteomics LC-MS/MS data. MASPECTRAS is based on the Proteome Experimental Data Repository (PEDRo relational database schema and follows the guidelines of the Proteomics Standards Initiative (PSI. Analysis modules include: 1 import and parsing of the results from the search engines SEQUEST, Mascot, Spectrum Mill, X! Tandem, and OMSSA; 2 peptide validation, 3 clustering of proteins based on Markov Clustering and multiple alignments; and 4 quantification using the Automated Statistical Analysis of Protein Abundance Ratios algorithm (ASAPRatio. The system provides customizable data retrieval and visualization tools, as well as export to PRoteomics IDEntifications public repository (PRIDE. MASPECTRAS is freely available at http://genome.tugraz.at/maspectras Conclusion Given the unique features and the flexibility due to the use of standard software technology, our platform represents significant advance and could be of great interest to the proteomics community.

  18. Changing cluster composition in cluster randomised controlled trials: design and analysis considerations

    Science.gov (United States)

    2014-01-01

    Background There are many methodological challenges in the conduct and analysis of cluster randomised controlled trials, but one that has received little attention is that of post-randomisation changes to cluster composition. To illustrate this, we focus on the issue of cluster merging, considering the impact on the design, analysis and interpretation of trial outcomes. Methods We explored the effects of merging clusters on study power using standard methods of power calculation. We assessed the potential impacts on study findings of both homogeneous cluster merges (involving clusters randomised to the same arm of a trial) and heterogeneous merges (involving clusters randomised to different arms of a trial) by simulation. To determine the impact on bias and precision of treatment effect estimates, we applied standard methods of analysis to different populations under analysis. Results Cluster merging produced a systematic reduction in study power. This effect depended on the number of merges and was most pronounced when variability in cluster size was at its greatest. Simulations demonstrate that the impact on analysis was minimal when cluster merges were homogeneous, with impact on study power being balanced by a change in observed intracluster correlation coefficient (ICC). We found a decrease in study power when cluster merges were heterogeneous, and the estimate of treatment effect was attenuated. Conclusions Examples of cluster merges found in previously published reports of cluster randomised trials were typically homogeneous rather than heterogeneous. Simulations demonstrated that trial findings in such cases would be unbiased. However, simulations also showed that any heterogeneous cluster merges would introduce bias that would be hard to quantify, as well as having negative impacts on the precision of estimates obtained. Further methodological development is warranted to better determine how to analyse such trials appropriately. Interim recommendations

  19. Semi-supervised consensus clustering for gene expression data analysis

    OpenAIRE

    Wang, Yunli; Pan, Youlian

    2014-01-01

    Background Simple clustering methods such as hierarchical clustering and k-means are widely used for gene expression data analysis; but they are unable to deal with noise and high dimensionality associated with the microarray gene expression data. Consensus clustering appears to improve the robustness and quality of clustering results. Incorporating prior knowledge in clustering process (semi-supervised clustering) has been shown to improve the consistency between the data partitioning and do...

  20. A software platform for the analysis of dermatology images

    Science.gov (United States)

    Vlassi, Maria; Mavraganis, Vlasios; Asvestas, Panteleimon

    2017-11-01

    The purpose of this paper is to present a software platform developed in Python programming environment that can be used for the processing and analysis of dermatology images. The platform provides the capability for reading a file that contains a dermatology image. The platform supports image formats such as Windows bitmaps, JPEG, JPEG2000, portable network graphics, TIFF. Furthermore, it provides suitable tools for selecting, either manually or automatically, a region of interest (ROI) on the image. The automated selection of a ROI includes filtering for smoothing the image and thresholding. The proposed software platform has a friendly and clear graphical user interface and could be a useful second-opinion tool to a dermatologist. Furthermore, it could be used to classify images including from other anatomical parts such as breast or lung, after proper re-training of the classification algorithms.

  1. Fluidics platform and method for sample preparation and analysis

    Science.gov (United States)

    Benner, W. Henry; Dzenitis, John M.; Bennet, William J.; Baker, Brian R.

    2014-08-19

    Herein provided are fluidics platform and method for sample preparation and analysis. The fluidics platform is capable of analyzing DNA from blood samples using amplification assays such as polymerase-chain-reaction assays and loop-mediated-isothermal-amplification assays. The fluidics platform can also be used for other types of assays and analyzes. In some embodiments, a sample in a sealed tube can be inserted directly. The following isolation, detection, and analyzes can be performed without a user's intervention. The disclosed platform may also comprises a sample preparation system with a magnetic actuator, a heater, and an air-drying mechanism, and fluid manipulation processes for extraction, washing, elution, assay assembly, assay detection, and cleaning after reactions and between samples.

  2. ARCHITECTURE OF A SENTIMENT ANALYSIS PLATFORM

    Directory of Open Access Journals (Sweden)

    CRISTIAN BUCUR

    2015-06-01

    Full Text Available A new domain of research evolved in the last decade, called sentiment analysis that tries to extract knowledge from opinionated text documents. The article presents an overview of the domain and present an architecture of a system that could perform sentiment analysis processes. Based on previous researches are presented two methods for performing classification and the results obtained.

  3. Numeric computation and statistical data analysis on the Java platform

    CERN Document Server

    Chekanov, Sergei V

    2016-01-01

    Numerical computation, knowledge discovery and statistical data analysis integrated with powerful 2D and 3D graphics for visualization are the key topics of this book. The Python code examples powered by the Java platform can easily be transformed to other programming languages, such as Java, Groovy, Ruby and BeanShell. This book equips the reader with a computational platform which, unlike other statistical programs, is not limited by a single programming language. The author focuses on practical programming aspects and covers a broad range of topics, from basic introduction to the Python language on the Java platform (Jython), to descriptive statistics, symbolic calculations, neural networks, non-linear regression analysis and many other data-mining topics. He discusses how to find regularities in real-world data, how to classify data, and how to process data for knowledge discoveries. The code snippets are so short that they easily fit into single pages. Numeric Computation and Statistical Data Analysis ...

  4. MANNER OF STOCKS SORTING USING CLUSTER ANALYSIS METHODS

    Directory of Open Access Journals (Sweden)

    Jana Halčinová

    2014-06-01

    Full Text Available The aim of the present article is to show the possibility of using the methods of cluster analysis in classification of stocks of finished products. Cluster analysis creates groups (clusters of finished products according to similarity in demand i.e. customer requirements for each product. Manner stocks sorting of finished products by clusters is described a practical example. The resultants clusters are incorporated into the draft layout of the distribution warehouse.

  5. Integrative Data Analysis of Multi-Platform Cancer Data with a Multimodal Deep Learning Approach.

    Science.gov (United States)

    Liang, Muxuan; Li, Zhizhong; Chen, Ting; Zeng, Jianyang

    2015-01-01

    Identification of cancer subtypes plays an important role in revealing useful insights into disease pathogenesis and advancing personalized therapy. The recent development of high-throughput sequencing technologies has enabled the rapid collection of multi-platform genomic data (e.g., gene expression, miRNA expression, and DNA methylation) for the same set of tumor samples. Although numerous integrative clustering approaches have been developed to analyze cancer data, few of them are particularly designed to exploit both deep intrinsic statistical properties of each input modality and complex cross-modality correlations among multi-platform input data. In this paper, we propose a new machine learning model, called multimodal deep belief network (DBN), to cluster cancer patients from multi-platform observation data. In our integrative clustering framework, relationships among inherent features of each single modality are first encoded into multiple layers of hidden variables, and then a joint latent model is employed to fuse common features derived from multiple input modalities. A practical learning algorithm, called contrastive divergence (CD), is applied to infer the parameters of our multimodal DBN model in an unsupervised manner. Tests on two available cancer datasets show that our integrative data analysis approach can effectively extract a unified representation of latent features to capture both intra- and cross-modality correlations, and identify meaningful disease subtypes from multi-platform cancer data. In addition, our approach can identify key genes and miRNAs that may play distinct roles in the pathogenesis of different cancer subtypes. Among those key miRNAs, we found that the expression level of miR-29a is highly correlated with survival time in ovarian cancer patients. These results indicate that our multimodal DBN based data analysis approach may have practical applications in cancer pathogenesis studies and provide useful guidelines for

  6. A Comparative Analysis of MOOC (Massive Open Online Course Platforms

    Directory of Open Access Journals (Sweden)

    Maria CONACHE

    2016-01-01

    Full Text Available The MOOC platforms have known a considerable development in recent years due to the enlargement of online space and the shifting between traditional to virtual activities. These plat-forms made it possible for people almost everywhere to take online academic courses offered by top universities via open access to web and with unlimited participation. Thus, it came naturally to us to address the question what makes them so successful? The purpose of this paper is to report comparatively MOOC platforms in terms of features, based on the user’s implication and demands. First, we chose four relevant lifelong learning platforms and then we structured three main categories for the platforms' qualification, depending on which we built our theory regarding the comparison between them. Our analysis consists of three sets of criteria: business model, course design and popularity among online users. Starting from this perspective, we built a range of representative factors for which we highlighted the major aspects for each plat-form in our comparative research

  7. Advanced analysis of forest fire clustering

    Science.gov (United States)

    Kanevski, Mikhail; Pereira, Mario; Golay, Jean

    2017-04-01

    Analysis of point pattern clustering is an important topic in spatial statistics and for many applications: biodiversity, epidemiology, natural hazards, geomarketing, etc. There are several fundamental approaches used to quantify spatial data clustering using topological, statistical and fractal measures. In the present research, the recently introduced multi-point Morisita index (mMI) is applied to study the spatial clustering of forest fires in Portugal. The data set consists of more than 30000 fire events covering the time period from 1975 to 2013. The distribution of forest fires is very complex and highly variable in space. mMI is a multi-point extension of the classical two-point Morisita index. In essence, mMI is estimated by covering the region under study by a grid and by computing how many times more likely it is that m points selected at random will be from the same grid cell than it would be in the case of a complete random Poisson process. By changing the number of grid cells (size of the grid cells), mMI characterizes the scaling properties of spatial clustering. From mMI, the data intrinsic dimension (fractal dimension) of the point distribution can be estimated as well. In this study, the mMI of forest fires is compared with the mMI of random patterns (RPs) generated within the validity domain defined as the forest area of Portugal. It turns out that the forest fires are highly clustered inside the validity domain in comparison with the RPs. Moreover, they demonstrate different scaling properties at different spatial scales. The results obtained from the mMI analysis are also compared with those of fractal measures of clustering - box counting and sand box counting approaches. REFERENCES Golay J., Kanevski M., Vega Orozco C., Leuenberger M., 2014: The multipoint Morisita index for the analysis of spatial patterns. Physica A, 406, 191-202. Golay J., Kanevski M. 2015: A new estimator of intrinsic dimension based on the multipoint Morisita index

  8. Cluster Analysis in Rapeseed (Brassica Napus L.)

    International Nuclear Information System (INIS)

    Mahasi, J.M

    2002-01-01

    With widening edible deficit, Kenya has become increasingly dependent on imported edible oils. Many oilseed crops (e.g. sunflower, soya beans, rapeseed/mustard, sesame, groundnuts etc) can be grown in Kenya. But oilseed rape is preferred because it very high yielding (1.5 tons-4.0 tons/ha) with oil content of 42-46%. Other uses include fitting in various cropping systems as; relay/inter crops, rotational crops, trap crops and fodder. It is soft seeded hence oil extraction is relatively easy. The meal is high in protein and very useful in livestock supplementation. Rapeseed can be straight combined using adjusted wheat combines. The priority is to expand domestic oilseed production, hence the need to introduce improved rapeseed germplasm from other countries. The success of any crop improvement programme depends on the extent of genetic diversity in the material. Hence, it is essential to understand the adaptation of introduced genotypes and the similarities if any among them. Evaluation trials were carried out on 17 rapeseed genotypes (nine Canadian origin and eight of European origin) grown at 4 locations namely Endebess, Njoro, Timau and Mau Narok in three years (1992, 1993 and 1994). Results for 1993 were discarded due to severe drought. An analysis of variance was carried out only on seed yields and the treatments were found to be significantly different. Cluster analysis was then carried out on mean seed yields and based on this analysis; only one major group exists within the material. In 1992, varieties 2,3,8 and 9 didn't fall in the same cluster as the rest. Variety 8 was the only one not classified with the rest of the Canadian varieties. Three European varieties (2,3 and 9) were however not classified with the others. In 1994, varieties 10 and 6 didn't fall in the major cluster. Of these two, variety 10 is of Canadian origin. Varieties were more similar in 1994 than 1992 due to favorable weather. It is evident that, genotypes from different geographical

  9. Modeling and analysis of offshore jacket platform

    Digital Repository Service at National Institute of Oceanography (India)

    Mohan, P.; Sidhaarth, K.R.A.; SanilKumar, V.

    analysis is also able to replicate the details of observed behavior of the designed structure. Based on the axial tension/compression and the bending stresses obtained through simulation, the tubular members are designed and the code unity check is done...

  10. Cross-platform validation and analysis environment for particle physics

    Science.gov (United States)

    Chekanov, S. V.; Pogrebnyak, I.; Wilbern, D.

    2017-11-01

    A multi-platform validation and analysis framework for public Monte Carlo simulation for high-energy particle collisions is discussed. The front-end of this framework uses the Python programming language, while the back-end is written in Java, which provides a multi-platform environment that can be run from a web browser and can easily be deployed at the grid sites. The analysis package includes all major software tools used in high-energy physics, such as Lorentz vectors, jet algorithms, histogram packages, graphic canvases, and tools for providing data access. This multi-platform software suite, designed to minimize OS-specific maintenance and deployment time, is used for online validation of Monte Carlo event samples through a web interface.

  11. Tweets clustering using latent semantic analysis

    Science.gov (United States)

    Rasidi, Norsuhaili Mahamed; Bakar, Sakhinah Abu; Razak, Fatimah Abdul

    2017-04-01

    Social media are becoming overloaded with information due to the increasing number of information feeds. Unlike other social media, Twitter users are allowed to broadcast a short message called as `tweet". In this study, we extract tweets related to MH370 for certain of time. In this paper, we present overview of our approach for tweets clustering to analyze the users' responses toward tragedy of MH370. The tweets were clustered based on the frequency of terms obtained from the classification process. The method we used for the text classification is Latent Semantic Analysis. As a result, there are two types of tweets that response to MH370 tragedy which is emotional and non-emotional. We show some of our initial results to demonstrate the effectiveness of our approach.

  12. CytoCluster: A Cytoscape Plugin for Cluster Analysis and Visualization of Biological Networks.

    Science.gov (United States)

    Li, Min; Li, Dongyan; Tang, Yu; Wu, Fangxiang; Wang, Jianxin

    2017-08-31

    Nowadays, cluster analysis of biological networks has become one of the most important approaches to identifying functional modules as well as predicting protein complexes and network biomarkers. Furthermore, the visualization of clustering results is crucial to display the structure of biological networks. Here we present CytoCluster, a cytoscape plugin integrating six clustering algorithms, HC-PIN (Hierarchical Clustering algorithm in Protein Interaction Networks), OH-PIN (identifying Overlapping and Hierarchical modules in Protein Interaction Networks), IPCA (Identifying Protein Complex Algorithm), ClusterONE (Clustering with Overlapping Neighborhood Expansion), DCU (Detecting Complexes based on Uncertain graph model), IPC-MCE (Identifying Protein Complexes based on Maximal Complex Extension), and BinGO (the Biological networks Gene Ontology) function. Users can select different clustering algorithms according to their requirements. The main function of these six clustering algorithms is to detect protein complexes or functional modules. In addition, BinGO is used to determine which Gene Ontology (GO) categories are statistically overrepresented in a set of genes or a subgraph of a biological network. CytoCluster can be easily expanded, so that more clustering algorithms and functions can be added to this plugin. Since it was created in July 2013, CytoCluster has been downloaded more than 9700 times in the Cytoscape App store and has already been applied to the analysis of different biological networks. CytoCluster is available from http://apps.cytoscape.org/apps/cytocluster.

  13. MEGA X: Molecular Evolutionary Genetics Analysis across Computing Platforms.

    Science.gov (United States)

    Kumar, Sudhir; Stecher, Glen; Li, Michael; Knyaz, Christina; Tamura, Koichiro

    2018-06-01

    The Molecular Evolutionary Genetics Analysis (Mega) software implements many analytical methods and tools for phylogenomics and phylomedicine. Here, we report a transformation of Mega to enable cross-platform use on Microsoft Windows and Linux operating systems. Mega X does not require virtualization or emulation software and provides a uniform user experience across platforms. Mega X has additionally been upgraded to use multiple computing cores for many molecular evolutionary analyses. Mega X is available in two interfaces (graphical and command line) and can be downloaded from www.megasoftware.net free of charge.

  14. Multisource Images Analysis Using Collaborative Clustering

    Directory of Open Access Journals (Sweden)

    Pierre Gançarski

    2008-04-01

    Full Text Available The development of very high-resolution (VHR satellite imagery has produced a huge amount of data. The multiplication of satellites which embed different types of sensors provides a lot of heterogeneous images. Consequently, the image analyst has often many different images available, representing the same area of the Earth surface. These images can be from different dates, produced by different sensors, or even at different resolutions. The lack of machine learning tools using all these representations in an overall process constraints to a sequential analysis of these various images. In order to use all the information available simultaneously, we propose a framework where different algorithms can use different views of the scene. Each one works on a different remotely sensed image and, thus, produces different and useful information. These algorithms work together in a collaborative way through an automatic and mutual refinement of their results, so that all the results have almost the same number of clusters, which are statistically similar. Finally, a unique result is produced, representing a consensus among the information obtained by each clustering method on its own image. The unified result and the complementarity of the single results (i.e., the agreement between the clustering methods as well as the disagreement lead to a better understanding of the scene. The experiments carried out on multispectral remote sensing images have shown that this method is efficient to extract relevant information and to improve the scene understanding.

  15. The design and verification of probabilistic safety analysis platform NFRisk

    International Nuclear Information System (INIS)

    Hu Wenjun; Song Wei; Ren Lixia; Qian Hongtao

    2010-01-01

    To increase the technical ability in Probabilistic Safety Analysis (PSA) field in China,it is necessary and important to study and develop indigenous professional PSA platform. Following such principle as 'from structure simplification to modulization to production of cut sets to minimum of cut sets', the algorithms, including simplification algorithm, modulization algorithm, the algorithm of conversion from fault tree to binary decision diagram (BDD), the solving algorithm of cut sets, the minimum algorithm of cut sets, and so on, were designed and developed independently; the design of data management and operation platform was completed all alone; the verification and validation of NFRisk platform based on 3 typical fault trees was finished on our own. (authors)

  16. AZTLAN platform: Mexican platform for analysis and design of nuclear reactors

    International Nuclear Information System (INIS)

    Gomez T, A. M.; Puente E, F.; Del Valle G, E.; Francois L, J. L.; Martin del Campo M, C.; Espinosa P, G.

    2014-10-01

    The Aztlan platform Project is a national initiative led by the Instituto Nacional de Investigaciones Nucleares (ININ) which brings together the main public houses of higher studies in Mexico, such as: Instituto Politecnico Nacional, Universidad Nacional Autonoma de Mexico and Universidad Autonoma Metropolitana in an effort to take a significant step toward the calculation autonomy and analysis that seeks to place Mexico in the medium term in a competitive international level on software issues for analysis of nuclear reactors. This project aims to modernize, improve and integrate the neutron, thermal-hydraulic and thermo-mechanical codes, developed in Mexican institutions, within an integrated platform, developed and maintained by Mexican experts to benefit from the same institutions. This project is financed by the mixed fund SENER-CONACYT of Energy Sustain ability, and aims to strengthen substantially to research institutions, such as educational institutions contributing to the formation of highly qualified human resources in the area of analysis and design of nuclear reactors. As innovative part the project includes the creation of a user group, made up of members of the project institutions as well as the Comision Nacional de Seguridad Nuclear y Salvaguardias, Central Nucleoelectrica de Laguna Verde (CNLV), Secretaria de Energia (Mexico) and Karlsruhe Institute of Technology (Germany) among others. This user group will be responsible for using the software and provide feedback to the development equipment in order that progress meets the needs of the regulator and industry; in this case the CNLV. Finally, in order to bridge the gap between similar developments globally, they will make use of the latest super computing technology to speed up calculation times. This work intends to present to national nuclear community the project, so a description of the proposed methodology is given, as well as the goals and objectives to be pursued for the development of the

  17. ANALYSIS OF FACTORS INFLUENCING TRAVEL CONSUMER SATISFACTION AS REVEALED BY ONLINE COMMUNICATION PLATFORMS

    Directory of Open Access Journals (Sweden)

    Olimpia I. BAN

    2016-08-01

    Full Text Available The objective of the present empirical study is to determine the factors influencing the tourism consumer satisfaction, as it results from the evaluations posted on virtual platforms. The communication platform chosen as study is the Romanian website Amfostacolo.ro. In this case, the travel consumer satisfaction is expressed by the score of the ratings posted on the virtual platform Amfostacolo.ro and the decision to recommend or not the unit / destination. Considering the peculiarities of the communication platform studied, the elements influencing the score indicating satisfaction there can be identified as components of tourism supply and the characteristics of the reviewer. Data processing has been carried out with ordinary least squares (OLS, structural equation modeling (confirmatory factor analysis, path analysis, cluster analisys and polytomous logistic regression. The results broadly confirm the hypotheses, namely that: the type of stay and the age of the reviewer influence the satisfaction of the consumer more than the destination and number of stars of the accommodation, the age group of the reviewer influences the destination yet it is uncertain about the influence of the variables related to the holiday (the type of stay and the number of stars of the accommodation, the meal service influences more than other attributes the consumer satisfaction and the recommendation of the reviewer is influenced by the characteristics related to his person and the holiday consumed.

  18. Micro and nano-platforms for biological cell analysis

    DEFF Research Database (Denmark)

    Svendsen, Winnie Edith; Castillo, Jaime; Moresco, Jacob Lange

    2011-01-01

    In this paper some technological platforms developed for biological cell analysis will be presented and compared to existing systems. In brief, we present a novel micro cell culture chamber based on diffusion feeding of cells, into which cells can be introduced and extracted after culturing using...... from the cells, while passive modifications involve the presence of a peptide nanotube based scaffold for the cell culturing that mimics the in vivo environment. Two applications involving fluorescent in situ hybridization (FISH) analysis and cancer cell sorting are presented, as examples of further...... analysis that can be done after cell culturing. A platform able to automate the entire process from cell culturing to cell analysis by means of simple plug and play of various self-contained, individually fabricated modules is finally described....

  19. Constructing storyboards based on hierarchical clustering analysis

    Science.gov (United States)

    Hasebe, Satoshi; Sami, Mustafa M.; Muramatsu, Shogo; Kikuchi, Hisakazu

    2005-07-01

    There are growing needs for quick preview of video contents for the purpose of improving accessibility of video archives as well as reducing network traffics. In this paper, a storyboard that contains a user-specified number of keyframes is produced from a given video sequence. It is based on hierarchical cluster analysis of feature vectors that are derived from wavelet coefficients of video frames. Consistent use of extracted feature vectors is the key to avoid a repetition of computationally-intensive parsing of the same video sequence. Experimental results suggest that a significant reduction in computational time is gained by this strategy.

  20. ASAP: An Extensible Platform for State Space Analysis

    DEFF Research Database (Denmark)

    Westergaard, Michael; Evangelista, Sami; Kristensen, Lars Michael

    2009-01-01

    The ASCoVeCo State space Analysis Platform (ASAP) is a tool for performing explicit state space analysis of coloured Petri nets (CPNs) and other formalisms. ASAP supports a wide range of state space reduction techniques and is intended to be easy to extend and to use, making it a suitable tool fo...... for students, researchers, and industrial users that would like to analyze protocols and/or experiment with different algorithms. This paper presents ASAP from these two perspectives....

  1. Analysis of motion of the three wheeled mobile platform

    Directory of Open Access Journals (Sweden)

    Jaskot Anna

    2018-01-01

    Full Text Available The work is dedicated to the designing motion of the three wheeled mobile platform under the unsteady conditions. In this paper the results of the analysis based on the dynamics model of the three wheeled mobile robot, with two rear wheels and one front wheel has been included The prototype has been developed by the author's construction assumptions that is useful to realize the motion of the platform in a various configurations of wheel drives, including control of the active forces and the direction of their settings while driving. Friction forces, in longitudinal and in the transverse directions, are considered in the proposed model. Relation between friction and active forces are also included. The motion parameters of the mobile platform has been determined by adopting classical approach of mechanics. The formulated initial problem of platform motion has been solved numerically using the Runge-Kutta method of the fourth order. Results of motion analysis with motion parameters values are determined and sample results are presented.

  2. Cluster Analysis of Maize Inbred Lines

    Directory of Open Access Journals (Sweden)

    Jiban Shrestha

    2016-12-01

    Full Text Available The determination of diversity among inbred lines is important for heterosis breeding. Sixty maize inbred lines were evaluated for their eight agro morphological traits during winter season of 2011 to analyze their genetic diversity. Clustering was done by average linkage method. The inbred lines were grouped into six clusters. Inbred lines grouped into Clusters II had taller plants with maximum number of leaves. The cluster III was characterized with shorter plants with minimum number of leaves. The inbred lines categorized into cluster V had early flowering whereas the group into cluster VI had late flowering time. The inbred lines grouped into the cluster III were characterized by higher value of anthesis silking interval (ASI and those of cluster VI had lower value of ASI. These results showed that the inbred lines having widely divergent clusters can be utilized in hybrid breeding programme.

  3. IoT Platforms: Analysis for Building Projects

    Directory of Open Access Journals (Sweden)

    Rusu Liviu DUMITRU

    2017-01-01

    Full Text Available This paper presents a general survey of IoT platforms in terms of features for IoT project de-velopers. I will briefly summarize the state of knowledge in terms of technology regarding “In-ternet of Things” first steps in developing this technology, history, trends, sensors and micro-controllers used. I have evaluated a number of 5 IoT platforms in terms of the features needed to develop a IoT project. I have listed those components that are most appreciated by IoT pro-ject developers and the results have been highlighted in a comparative analysis of these plat-forms from the point of view of IoT project developers and which are strictly necessary as a de-velopment environment for an IoT project based. I’ve also considered the users' views of such platforms in terms of functionality, advantages, disadvantages and dangers presented by this technology.

  4. PIIKA 2: an expanded, web-based platform for analysis of kinome microarray data.

    Directory of Open Access Journals (Sweden)

    Brett Trost

    Full Text Available Kinome microarrays are comprised of peptides that act as phosphorylation targets for protein kinases. This platform is growing in popularity due to its ability to measure phosphorylation-mediated cellular signaling in a high-throughput manner. While software for analyzing data from DNA microarrays has also been used for kinome arrays, differences between the two technologies and associated biologies previously led us to develop Platform for Intelligent, Integrated Kinome Analysis (PIIKA, a software tool customized for the analysis of data from kinome arrays. Here, we report the development of PIIKA 2, a significantly improved version with new features and improvements in the areas of clustering, statistical analysis, and data visualization. Among other additions to the original PIIKA, PIIKA 2 now allows the user to: evaluate statistically how well groups of samples cluster together; identify sets of peptides that have consistent phosphorylation patterns among groups of samples; perform hierarchical clustering analysis with bootstrapping; view false negative probabilities and positive and negative predictive values for t-tests between pairs of samples; easily assess experimental reproducibility; and visualize the data using volcano plots, scatterplots, and interactive three-dimensional principal component analyses. Also new in PIIKA 2 is a web-based interface, which allows users unfamiliar with command-line tools to easily provide input and download the results. Collectively, the additions and improvements described here enhance both the breadth and depth of analyses available, simplify the user interface, and make the software an even more valuable tool for the analysis of kinome microarray data. Both the web-based and stand-alone versions of PIIKA 2 can be accessed via http://saphire.usask.ca.

  5. PIIKA 2: an expanded, web-based platform for analysis of kinome microarray data.

    Science.gov (United States)

    Trost, Brett; Kindrachuk, Jason; Määttänen, Pekka; Napper, Scott; Kusalik, Anthony

    2013-01-01

    Kinome microarrays are comprised of peptides that act as phosphorylation targets for protein kinases. This platform is growing in popularity due to its ability to measure phosphorylation-mediated cellular signaling in a high-throughput manner. While software for analyzing data from DNA microarrays has also been used for kinome arrays, differences between the two technologies and associated biologies previously led us to develop Platform for Intelligent, Integrated Kinome Analysis (PIIKA), a software tool customized for the analysis of data from kinome arrays. Here, we report the development of PIIKA 2, a significantly improved version with new features and improvements in the areas of clustering, statistical analysis, and data visualization. Among other additions to the original PIIKA, PIIKA 2 now allows the user to: evaluate statistically how well groups of samples cluster together; identify sets of peptides that have consistent phosphorylation patterns among groups of samples; perform hierarchical clustering analysis with bootstrapping; view false negative probabilities and positive and negative predictive values for t-tests between pairs of samples; easily assess experimental reproducibility; and visualize the data using volcano plots, scatterplots, and interactive three-dimensional principal component analyses. Also new in PIIKA 2 is a web-based interface, which allows users unfamiliar with command-line tools to easily provide input and download the results. Collectively, the additions and improvements described here enhance both the breadth and depth of analyses available, simplify the user interface, and make the software an even more valuable tool for the analysis of kinome microarray data. Both the web-based and stand-alone versions of PIIKA 2 can be accessed via http://saphire.usask.ca.

  6. Modeling Transfer of Knowledge in an Online Platform of a Cluster

    OpenAIRE

    Schmidt, Danilo Marcello; Böttcher, Lena; Wilberg, Julian; Kammerl, Daniel; Lindemann, Udo

    2016-01-01

    Dealing with knowledge as a relevant resource and factor for production has become increasingly important in the course of globalization. This work focuses on questions about transferring knowledge when many companies work together in a cluster of enterprises. We developed a model of this transfer based on the theory of clusters from the New Institutional Economics’ point of view and based on existing theories about knowledge and knowledge transfer. This theoretical construct is evaluated and...

  7. Ontology-Based Platform for Conceptual Guided Dataset Analysis

    KAUST Repository

    Rodriguez-Garcia, Miguel Angel

    2016-05-31

    Nowadays organizations should handle a huge amount of both internal and external data from structured, semi-structured, and unstructured sources. This constitutes a major challenge (and also an opportunity) to current Business Intelligence solutions. The complexity and effort required to analyse such plethora of data implies considerable execution times. Besides, the large number of data analysis methods and techniques impede domain experts (laymen from an IT-assisted analytics perspective) to fully exploit their potential, while technology experts lack the business background to get the proper questions. In this work, we present a semantically-boosted platform for assisting layman users in (i) extracting a relevant subdataset from all the data, and (ii) selecting the data analysis technique(s) best suited for scrutinising that subdataset. The outcome is getting better answers in significantly less time. The platform has been evaluated in the music domain with promising results.

  8. Cluster analysis of word frequency dynamics

    Science.gov (United States)

    Maslennikova, Yu S.; Bochkarev, V. V.; Belashova, I. A.

    2015-01-01

    This paper describes the analysis and modelling of word usage frequency time series. During one of previous studies, an assumption was put forward that all word usage frequencies have uniform dynamics approaching the shape of a Gaussian function. This assumption can be checked using the frequency dictionaries of the Google Books Ngram database. This database includes 5.2 million books published between 1500 and 2008. The corpus contains over 500 billion words in American English, British English, French, German, Spanish, Russian, Hebrew, and Chinese. We clustered time series of word usage frequencies using a Kohonen neural network. The similarity between input vectors was estimated using several algorithms. As a result of the neural network training procedure, more than ten different forms of time series were found. They describe the dynamics of word usage frequencies from birth to death of individual words. Different groups of word forms were found to have different dynamics of word usage frequency variations.

  9. Cluster analysis of word frequency dynamics

    International Nuclear Information System (INIS)

    Maslennikova, Yu S; Bochkarev, V V; Belashova, I A

    2015-01-01

    This paper describes the analysis and modelling of word usage frequency time series. During one of previous studies, an assumption was put forward that all word usage frequencies have uniform dynamics approaching the shape of a Gaussian function. This assumption can be checked using the frequency dictionaries of the Google Books Ngram database. This database includes 5.2 million books published between 1500 and 2008. The corpus contains over 500 billion words in American English, British English, French, German, Spanish, Russian, Hebrew, and Chinese. We clustered time series of word usage frequencies using a Kohonen neural network. The similarity between input vectors was estimated using several algorithms. As a result of the neural network training procedure, more than ten different forms of time series were found. They describe the dynamics of word usage frequencies from birth to death of individual words. Different groups of word forms were found to have different dynamics of word usage frequency variations

  10. A Novel Cloud-Based Platform for Implementation of Oblivious Power Routing for Clusters of Microgrids

    DEFF Research Database (Denmark)

    Broojeni, Kianoosh; Amini, M. Hadi; Nejadpak, Arash

    2016-01-01

    is verified by MATLAB simulation. We also present a comprehensive cloud-based platform for further implementation of the proposed algorithm on the OPAL-RT real-time digital simulation system. The communication paths between the microgrids and the cloud environment can be emulated by OMNeT++....

  11. p-tert-Butylcalix[8]arene: an extremely versatile platform for cluster formation.

    Science.gov (United States)

    Taylor, Stephanie M; Sanz, Sergio; McIntosh, Ruaraidh D; Beavers, Christine M; Teat, Simon J; Brechin, Euan K; Dalgarno, Scott J

    2012-12-07

    p-tert-Butylcalix[4]arene is a bowl-shaped molecule capable of forming a range of polynuclear metal clusters under different experimental conditions. p-tert-Butylcalix[8]arene (TBC[8]) is a significantly more flexible analogue that has previously been shown to form mono- and binuclear lanthanide (Ln) metal complexes. The latter (cluster) motif is commonly observed and involves the calixarene adopting a near double-cone conformation, features of which suggested that it may be exploited as a type of assembly node in the formation of larger polynuclear lanthanide clusters. Variation in the experimental conditions employed for this system provides access to Ln(1), Ln(2), Ln(4), Ln(5), Ln(6), Ln(7) and Ln(8) complexes, with all polymetallic clusters containing the common binuclear lanthanide fragment. Closer inspection of the structures of the polymetallic clusters reveals that all but one (Ln(8)) are in fact based on metal octahedra or the building blocks of octahedra, with the identity and size of the final product dependent upon the basicity of the solution and the deprotonation level of the TBC[8] ligand. This demonstrates both the versatility of the ligand towards incorporation of additional metal centres, and the associated implications for tailoring the magnetic properties of the resulting assemblies in which lanthanide centres may be interchanged. Copyright © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  12. From virtual clustering analysis to self-consistent clustering analysis: a mathematical study

    Science.gov (United States)

    Tang, Shaoqiang; Zhang, Lei; Liu, Wing Kam

    2018-03-01

    In this paper, we propose a new homogenization algorithm, virtual clustering analysis (VCA), as well as provide a mathematical framework for the recently proposed self-consistent clustering analysis (SCA) (Liu et al. in Comput Methods Appl Mech Eng 306:319-341, 2016). In the mathematical theory, we clarify the key assumptions and ideas of VCA and SCA, and derive the continuous and discrete Lippmann-Schwinger equations. Based on a key postulation of "once response similarly, always response similarly", clustering is performed in an offline stage by machine learning techniques (k-means and SOM), and facilitates substantial reduction of computational complexity in an online predictive stage. The clear mathematical setup allows for the first time a convergence study of clustering refinement in one space dimension. Convergence is proved rigorously, and found to be of second order from numerical investigations. Furthermore, we propose to suitably enlarge the domain in VCA, such that the boundary terms may be neglected in the Lippmann-Schwinger equation, by virtue of the Saint-Venant's principle. In contrast, they were not obtained in the original SCA paper, and we discover these terms may well be responsible for the numerical dependency on the choice of reference material property. Since VCA enhances the accuracy by overcoming the modeling error, and reduce the numerical cost by avoiding an outer loop iteration for attaining the material property consistency in SCA, its efficiency is expected even higher than the recently proposed SCA algorithm.

  13. Identification of the chelocardin biosynthetic gene cluster from Amycolatopsis sulphurea: a platform for producing novel tetracycline antibiotics.

    Science.gov (United States)

    Lukežič, Tadeja; Lešnik, Urška; Podgoršek, Ajda; Horvat, Jaka; Polak, Tomaž; Šala, Martin; Jenko, Branko; Raspor, Peter; Herron, Paul R; Hunter, Iain S; Petković, Hrvoje

    2013-12-01

    Tetracyclines (TCs) are medically important antibiotics from the polyketide family of natural products. Chelocardin (CHD), produced by Amycolatopsis sulphurea, is a broad-spectrum tetracyclic antibiotic with potent bacteriolytic activity against a number of Gram-positive and Gram-negative multi-resistant pathogens. CHD has an unknown mode of action that is different from TCs. It has some structural features that define it as 'atypical' and, notably, is active against tetracycline-resistant pathogens. Identification and characterization of the chelocardin biosynthetic gene cluster from A. sulphurea revealed 18 putative open reading frames including a type II polyketide synthase. Compared to typical TCs, the chd cluster contains a number of features that relate to its classification as 'atypical': an additional gene for a putative two-component cyclase/aromatase that may be responsible for the different aromatization pattern, a gene for a putative aminotransferase for C-4 with the opposite stereochemistry to TCs and a gene for a putative C-9 methylase that is a unique feature of this biosynthetic cluster within the TCs. Collectively, these enzymes deliver a molecule with different aromatization of ring C that results in an unusual planar structure of the TC backbone. This is a likely contributor to its different mode of action. In addition CHD biosynthesis is primed with acetate, unlike the TCs, which are primed with malonamate, and offers a biosynthetic engineering platform that represents a unique opportunity for efficient generation of novel tetracyclic backbones using combinatorial biosynthesis.

  14. An analysis of hospital brand mark clusters.

    Science.gov (United States)

    Vollmers, Stacy M; Miller, Darryl W; Kilic, Ozcan

    2010-07-01

    This study analyzed brand mark clusters (i.e., various types of brand marks displayed in combination) used by hospitals in the United States. The brand marks were assessed against several normative criteria for creating brand marks that are memorable and that elicit positive affect. Overall, results show a reasonably high level of adherence to many of these normative criteria. Many of the clusters exhibited pictorial elements that reflected benefits and that were conceptually consistent with the verbal content of the cluster. Also, many clusters featured icons that were balanced and moderately complex. However, only a few contained interactive imagery or taglines communicating benefits.

  15. Smartness and Italian Cities. A Cluster Analysis

    Directory of Open Access Journals (Sweden)

    Flavio Boscacci

    2014-05-01

    Full Text Available Smart cities have been recently recognized as the most pleasing and attractive places to live in; due to this, both scholars and policy-makers pay close attention to this topic. Specifically, urban “smartness” has been identified by plenty of characteristics that can be grouped into six dimensions (Giffinger et al. 2007: smart Economy (competitiveness, smart People (social and human capital, smart Governance (participation, smart Mobility (both ICTs and transport, smart Environment (natural resources, and smart Living (quality of life. According to this analytical framework, in the present paper the relation between urban attractiveness and the “smart” characteristics has been investigated in the 103 Italian NUTS3 province capitals in the year 2011. To this aim, a descriptive statistics has been followed by a regression analysis (OLS, where the dependent variable measuring the urban attractiveness has been proxied by housing market prices. Besides, a Cluster Analysis (CA has been developed in order to find differences and commonalities among the province capitals.The OLS results indicate that living, people and economy are the key drivers for achieving a better urban attractiveness. Environment, instead, keeps on playing a minor role. Besides, the CA groups the province capitals a

  16. High repeatability from 3D experimental platform for quantitative analysis of cellular branch pattern formations.

    Science.gov (United States)

    Hagiwara, Masaya; Nobata, Rina; Kawahara, Tomohiro

    2018-04-24

    Three-dimensional (3D) cell and tissue cultures more closely mimic biological environments than two-dimensional (2D) cultures and are therefore highly desirable in culture experiments. However, 3D cultures often fail to yield repeatable experimental results because of variation in the initial culture conditions, such as cell density and distribution in the extracellular matrix, and therefore reducing such variation is a paramount concern. Here, we present a 3D culture platform that demonstrates highly repeatable experimental results, obtained by controlling the initial cell cluster shape in the gel cube culture device. A micro-mould with the desired shape was fabricated by photolithography or machining, creating a 3D pocket in the extracellular matrix contained in the device. Highly concentrated human bronchial epithelial cells were then injected in the pocket so that the cell cluster shape matched the fabricated mould shape. Subsequently, the cubic device supplied multi-directional scanning, enabling high-resolution capture of the whole tissue structure with only a low-magnification lens. The proposed device significantly improved the repeatability of the developed branch pattern, and multi-directional scanning enabled quantitative analysis of the developed branch pattern formations. A mathematical simulation was also conducted to reveal the mechanisms of branch pattern formation. The proposed platform offers the potential to accelerate any research field that conducts 3D culture experiments, including tissue regeneration and drug development.

  17. Analysis of human plasma metabolites across different liquid chromatography/mass spectrometry platforms: Cross-platform transferable chemical signatures.

    Science.gov (United States)

    Telu, Kelly H; Yan, Xinjian; Wallace, William E; Stein, Stephen E; Simón-Manso, Yamil

    2016-03-15

    The metabolite profiling of a NIST plasma Standard Reference Material (SRM 1950) on different liquid chromatography/mass spectrometry (LC/MS) platforms showed significant differences. Although these findings suggest caution when interpreting metabolomics results, the degree of overlap of both profiles allowed us to use tandem mass spectral libraries of recurrent spectra to evaluate to what extent these results are transferable across platforms and to develop cross-platform chemical signatures. Non-targeted global metabolite profiles of SRM 1950 were obtained on different LC/MS platforms using reversed-phase chromatography and different chromatographic scales (conventional HPLC, UHPLC and nanoLC). The data processing and the metabolite differential analysis were carried out using publically available (XCMS), proprietary (Mass Profiler Professional) and in-house software (NIST pipeline). Repeatability and intermediate precision showed that the non-targeted SRM 1950 profiling was highly reproducible when working on the same platform (relative standard deviation (RSD) HPLC, UHPLC and nanoLC) on the same platform. A substantial degree of overlap (common molecular features) was also found. A procedure to generate consistent chemical signatures using tandem mass spectral libraries of recurrent spectra is proposed. Different platforms rendered significantly different metabolite profiles, but the results were highly reproducible when working within one platform. Tandem mass spectral libraries of recurrent spectra are proposed to evaluate the degree of transferability of chemical signatures generated on different platforms. Chemical signatures based on our procedure are most likely cross-platform transferable. Published in 2016. This article is a U.S. Government work and is in the public domain in the USA.

  18. Emulation Platform for Cyber Analysis of Wireless Communication Network Protocols

    Energy Technology Data Exchange (ETDEWEB)

    Van Leeuwen, Brian P. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Eldridge, John M. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

    2017-11-01

    Wireless networking and mobile communications is increasing around the world and in all sectors of our lives. With increasing use, the density and complexity of the systems increase with more base stations and advanced protocols to enable higher data throughputs. The security of data transported over wireless networks must also evolve with the advances in technologies enabling more capable wireless networks. However, means for analysis of the effectiveness of security approaches and implementations used on wireless networks are lacking. More specifically a capability to analyze the lower-layer protocols (i.e., Link and Physical layers) is a major challenge. An analysis approach that incorporates protocol implementations without the need for RF emissions is necessary. In this research paper several emulation tools and custom extensions that enable an analysis platform to perform cyber security analysis of lower layer wireless networks is presented. A use case of a published exploit in the 802.11 (i.e., WiFi) protocol family is provided to demonstrate the effectiveness of the described emulation platform.

  19. SOMFlow: Guided Exploratory Cluster Analysis with Self-Organizing Maps and Analytic Provenance.

    Science.gov (United States)

    Sacha, Dominik; Kraus, Matthias; Bernard, Jurgen; Behrisch, Michael; Schreck, Tobias; Asano, Yuki; Keim, Daniel A

    2018-01-01

    Clustering is a core building block for data analysis, aiming to extract otherwise hidden structures and relations from raw datasets, such as particular groups that can be effectively related, compared, and interpreted. A plethora of visual-interactive cluster analysis techniques has been proposed to date, however, arriving at useful clusterings often requires several rounds of user interactions to fine-tune the data preprocessing and algorithms. We present a multi-stage Visual Analytics (VA) approach for iterative cluster refinement together with an implementation (SOMFlow) that uses Self-Organizing Maps (SOM) to analyze time series data. It supports exploration by offering the analyst a visual platform to analyze intermediate results, adapt the underlying computations, iteratively partition the data, and to reflect previous analytical activities. The history of previous decisions is explicitly visualized within a flow graph, allowing to compare earlier cluster refinements and to explore relations. We further leverage quality and interestingness measures to guide the analyst in the discovery of useful patterns, relations, and data partitions. We conducted two pair analytics experiments together with a subject matter expert in speech intonation research to demonstrate that the approach is effective for interactive data analysis, supporting enhanced understanding of clustering results as well as the interactive process itself.

  20. Taxonomical analysis of the Cancer cluster of galaxies

    International Nuclear Information System (INIS)

    Perea, J.; Olmo, A. del; Moles, M.

    1986-01-01

    A description is presented of the Cancer cluster of galaxies, based on a taxonomical analysis in (α,delta, Vsub(r)) space. Earlier results by previous authors on the lack of dynamical entity of the cluster are confirmed. The present analysis points out the existence of a binary structure in the most populated region of the complex. (author)

  1. Using Cluster Analysis for Data Mining in Educational Technology Research

    Science.gov (United States)

    Antonenko, Pavlo D.; Toy, Serkan; Niederhauser, Dale S.

    2012-01-01

    Cluster analysis is a group of statistical methods that has great potential for analyzing the vast amounts of web server-log data to understand student learning from hyperlinked information resources. In this methodological paper we provide an introduction to cluster analysis for educational technology researchers and illustrate its use through…

  2. Simultaneous Two-Way Clustering of Multiple Correspondence Analysis

    Science.gov (United States)

    Hwang, Heungsun; Dillon, William R.

    2010-01-01

    A 2-way clustering approach to multiple correspondence analysis is proposed to account for cluster-level heterogeneity of both respondents and variable categories in multivariate categorical data. Specifically, in the proposed method, multiple correspondence analysis is combined with k-means in a unified framework in which "k"-means is…

  3. ZBIT Bioinformatics Toolbox: A Web-Platform for Systems Biology and Expression Data Analysis.

    Science.gov (United States)

    Römer, Michael; Eichner, Johannes; Dräger, Andreas; Wrzodek, Clemens; Wrzodek, Finja; Zell, Andreas

    2016-01-01

    Bioinformatics analysis has become an integral part of research in biology. However, installation and use of scientific software can be difficult and often requires technical expert knowledge. Reasons are dependencies on certain operating systems or required third-party libraries, missing graphical user interfaces and documentation, or nonstandard input and output formats. In order to make bioinformatics software easily accessible to researchers, we here present a web-based platform. The Center for Bioinformatics Tuebingen (ZBIT) Bioinformatics Toolbox provides web-based access to a collection of bioinformatics tools developed for systems biology, protein sequence annotation, and expression data analysis. Currently, the collection encompasses software for conversion and processing of community standards SBML and BioPAX, transcription factor analysis, and analysis of microarray data from transcriptomics and proteomics studies. All tools are hosted on a customized Galaxy instance and run on a dedicated computation cluster. Users only need a web browser and an active internet connection in order to benefit from this service. The web platform is designed to facilitate the usage of the bioinformatics tools for researchers without advanced technical background. Users can combine tools for complex analyses or use predefined, customizable workflows. All results are stored persistently and reproducible. For each tool, we provide documentation, tutorials, and example data to maximize usability. The ZBIT Bioinformatics Toolbox is freely available at https://webservices.cs.uni-tuebingen.de/.

  4. Cluster analysis of activity-time series in motor learning

    DEFF Research Database (Denmark)

    Balslev, Daniela; Nielsen, Finn Å; Futiger, Sally A

    2002-01-01

    Neuroimaging studies of learning focus on brain areas where the activity changes as a function of time. To circumvent the difficult problem of model selection, we used a data-driven analytic tool, cluster analysis, which extracts representative temporal and spatial patterns from the voxel......-time series. The optimal number of clusters was chosen using a cross-validated likelihood method, which highlights the clustering pattern that generalizes best over the subjects. Data were acquired with PET at different time points during practice of a visuomotor task. The results from cluster analysis show...

  5. A high performance image processing platform based on CPU-GPU heterogeneous cluster with parallel image reconstroctions for micro-CT

    International Nuclear Information System (INIS)

    Ding Yu; Qi Yujin; Zhang Xuezhu; Zhao Cuilan

    2011-01-01

    In this paper, we report the development of a high-performance image processing platform, which is based on CPU-GPU heterogeneous cluster. Currently, it consists of a Dell Precision T7500 and HP XW8600 workstations with parallel programming and runtime environment, using the message-passing interface (MPI) and CUDA (Compute Unified Device Architecture). We succeeded in developing parallel image processing techniques for 3D image reconstruction of X-ray micro-CT imaging. The results show that a GPU provides a computing efficiency of about 194 times faster than a single CPU, and the CPU-GPU clusters provides a computing efficiency of about 46 times faster than the CPU clusters. These meet the requirements of rapid 3D image reconstruction and real time image display. In conclusion, the use of CPU-GPU heterogeneous cluster is an effective way to build high-performance image processing platform. (authors)

  6. Fatigue Analysis of a Mono-Tower Platform

    DEFF Research Database (Denmark)

    Kirkegaard, Poul Henning; Sørensen, John Dalsgaard; Brincker, Rune

    In this paper, a fatigue reliability analysis of a Mono-tower platform is presented. The failure mode, fatigue failure in the butt welds, is investigated with two different models. The one with the fatigue strength expressed through SN relations, the other with the fatigue strength expressed thro...... of the natural period, damping ratio, current, stress Spectrum and parameters describing the fatigue strength. Further, soil damping is shown to be significant for the Mono-tower.......In this paper, a fatigue reliability analysis of a Mono-tower platform is presented. The failure mode, fatigue failure in the butt welds, is investigated with two different models. The one with the fatigue strength expressed through SN relations, the other with the fatigue strength expressed...... through linear-elastic fracture mechanics (LEFM). In determining the cumulative fatigue damage, Palmgren-Miner's rule is applied. Element reliability as well as systems reliability is estimated using first-order reliability methods (FORM). The sensitivity of the systems reliability to various parameters...

  7. Fatigue Reliability Analysis of a Mono-Tower Platform

    DEFF Research Database (Denmark)

    Kirkegaard, Poul Henning; Sørensen, John Dalsgaard; Brincker, Rune

    1991-01-01

    In this paper, a fatigue reliability analysis of a Mono-tower platform is presented. The failure mode, fatigue failure in the butt welds, is investigated with two different models. The one with the fatigue strength expressed through SN relations, the other with the fatigue strength expressed thro...... of the natural period, damping ratio, current, stress spectrum and parameters describing the fatigue strength. Further, soil damping is shown to be significant for the Mono-tower.......In this paper, a fatigue reliability analysis of a Mono-tower platform is presented. The failure mode, fatigue failure in the butt welds, is investigated with two different models. The one with the fatigue strength expressed through SN relations, the other with the fatigue strength expressed...... through linear-elastic fracture mechanics (LEFM). In determining the cumulative fatigue damage, Palmgren-Miner's rule is applied. Element reliability, as well as systems reliability, is estimated using first-order reliability methods (FORM). The sensitivity of the systems reliability to various parameters...

  8. Universal platform for quantitative analysis of DNA transposition

    Directory of Open Access Journals (Sweden)

    Pajunen Maria I

    2010-11-01

    Full Text Available Abstract Background Completed genome projects have revealed an astonishing diversity of transposable genetic elements, implying the existence of novel element families yet to be discovered from diverse life forms. Concurrently, several better understood transposon systems have been exploited as efficient tools in molecular biology and genomics applications. Characterization of new mobile elements and improvement of the existing transposition technology platforms warrant easy-to-use assays for the quantitative analysis of DNA transposition. Results Here we developed a universal in vivo platform for the analysis of transposition frequency with class II mobile elements, i.e., DNA transposons. For each particular transposon system, cloning of the transposon ends and the cognate transposase gene, in three consecutive steps, generates a multifunctional plasmid, which drives inducible expression of the transposase gene and includes a mobilisable lacZ-containing reporter transposon. The assay scores transposition events as blue microcolonies, papillae, growing within otherwise whitish Escherichia coli colonies on indicator plates. We developed the assay using phage Mu transposition as a test model and validated the platform using various MuA transposase mutants. For further validation and to illustrate universality, we introduced IS903 transposition system components into the assay. The developed assay is adjustable to a desired level of initial transposition via the control of a plasmid-borne E. coli arabinose promoter. In practice, the transposition frequency is modulated by varying the concentration of arabinose or glucose in the growth medium. We show that variable levels of transpositional activity can be analysed, thus enabling straightforward screens for hyper- or hypoactive transposase mutants, regardless of the original wild-type activity level. Conclusions The established universal papillation assay platform should be widely applicable to a

  9. Two-Way Regularized Fuzzy Clustering of Multiple Correspondence Analysis.

    Science.gov (United States)

    Kim, Sunmee; Choi, Ji Yeh; Hwang, Heungsun

    2017-01-01

    Multiple correspondence analysis (MCA) is a useful tool for investigating the interrelationships among dummy-coded categorical variables. MCA has been combined with clustering methods to examine whether there exist heterogeneous subclusters of a population, which exhibit cluster-level heterogeneity. These combined approaches aim to classify either observations only (one-way clustering of MCA) or both observations and variable categories (two-way clustering of MCA). The latter approach is favored because its solutions are easier to interpret by providing explicitly which subgroup of observations is associated with which subset of variable categories. Nonetheless, the two-way approach has been built on hard classification that assumes observations and/or variable categories to belong to only one cluster. To relax this assumption, we propose two-way fuzzy clustering of MCA. Specifically, we combine MCA with fuzzy k-means simultaneously to classify a subgroup of observations and a subset of variable categories into a common cluster, while allowing both observations and variable categories to belong partially to multiple clusters. Importantly, we adopt regularized fuzzy k-means, thereby enabling us to decide the degree of fuzziness in cluster memberships automatically. We evaluate the performance of the proposed approach through the analysis of simulated and real data, in comparison with existing two-way clustering approaches.

  10. The smart cluster method. Adaptive earthquake cluster identification and analysis in strong seismic regions

    Science.gov (United States)

    Schaefer, Andreas M.; Daniell, James E.; Wenzel, Friedemann

    2017-07-01

    Earthquake clustering is an essential part of almost any statistical analysis of spatial and temporal properties of seismic activity. The nature of earthquake clusters and subsequent declustering of earthquake catalogues plays a crucial role in determining the magnitude-dependent earthquake return period and its respective spatial variation for probabilistic seismic hazard assessment. This study introduces the Smart Cluster Method (SCM), a new methodology to identify earthquake clusters, which uses an adaptive point process for spatio-temporal cluster identification. It utilises the magnitude-dependent spatio-temporal earthquake density to adjust the search properties, subsequently analyses the identified clusters to determine directional variation and adjusts its search space with respect to directional properties. In the case of rapid subsequent ruptures like the 1992 Landers sequence or the 2010-2011 Darfield-Christchurch sequence, a reclassification procedure is applied to disassemble subsequent ruptures using near-field searches, nearest neighbour classification and temporal splitting. The method is capable of identifying and classifying earthquake clusters in space and time. It has been tested and validated using earthquake data from California and New Zealand. A total of more than 1500 clusters have been found in both regions since 1980 with M m i n = 2.0. Utilising the knowledge of cluster classification, the method has been adjusted to provide an earthquake declustering algorithm, which has been compared to existing methods. Its performance is comparable to established methodologies. The analysis of earthquake clustering statistics lead to various new and updated correlation functions, e.g. for ratios between mainshock and strongest aftershock and general aftershock activity metrics.

  11. GProX, a User-Friendly Platform for Bioinformatics Analysis and Visualization of Quantitative Proteomics Data

    DEFF Research Database (Denmark)

    Rigbolt, Kristoffer T G; Vanselow, Jens T; Blagoev, Blagoy

    2011-01-01

    -friendly platform for comprehensive analysis, inspection and visualization of quantitative proteomics data we developed the Graphical Proteomics Data Explorer (GProX)(1). The program requires no special bioinformatics training, as all functions of GProX are accessible within its graphical user-friendly interface...... such as database querying, clustering based on abundance ratios, feature enrichment tests for e.g. GO terms and pathway analysis tools. A number of plotting options for visualization of quantitative proteomics data is available and most analysis functions in GProX create customizable high quality graphical...... displays in both vector and bitmap formats. The generic import requirements allow data originating from essentially all mass spectrometry platforms, quantitation strategies and software to be analyzed in the program. GProX represents a powerful approach to proteomics data analysis providing proteomics...

  12. Allergen Sensitization Pattern by Sex: A Cluster Analysis in Korea.

    Science.gov (United States)

    Ohn, Jungyoon; Paik, Seung Hwan; Doh, Eun Jin; Park, Hyun-Sun; Yoon, Hyun-Sun; Cho, Soyun

    2017-12-01

    Allergens tend to sensitize simultaneously. Etiology of this phenomenon has been suggested to be allergen cross-reactivity or concurrent exposure. However, little is known about specific allergen sensitization patterns. To investigate the allergen sensitization characteristics according to gender. Multiple allergen simultaneous test (MAST) is widely used as a screening tool for detecting allergen sensitization in dermatologic clinics. We retrospectively reviewed the medical records of patients with MAST results between 2008 and 2014 in our Department of Dermatology. A cluster analysis was performed to elucidate the allergen-specific immunoglobulin (Ig)E cluster pattern. The results of MAST (39 allergen-specific IgEs) from 4,360 cases were analyzed. By cluster analysis, 39items were grouped into 8 clusters. Each cluster had characteristic features. When compared with female, the male group tended to be sensitized more frequently to all tested allergens, except for fungus allergens cluster. The cluster and comparative analysis results demonstrate that the allergen sensitization is clustered, manifesting allergen similarity or co-exposure. Only the fungus cluster allergens tend to sensitize female group more frequently than male group.

  13. Cluster analysis of accelerated molecular dynamics simulations: A case study of the decahedron to icosahedron transition in Pt nanoparticles

    Science.gov (United States)

    Huang, Rao; Lo, Li-Ta; Wen, Yuhua; Voter, Arthur F.; Perez, Danny

    2017-10-01

    Modern molecular-dynamics-based techniques are extremely powerful to investigate the dynamical evolution of materials. With the increase in sophistication of the simulation techniques and the ubiquity of massively parallel computing platforms, atomistic simulations now generate very large amounts of data, which have to be carefully analyzed in order to reveal key features of the underlying trajectories, including the nature and characteristics of the relevant reaction pathways. We show that clustering algorithms, such as the Perron Cluster Cluster Analysis, can provide reduced representations that greatly facilitate the interpretation of complex trajectories. To illustrate this point, clustering tools are used to identify the key kinetic steps in complex accelerated molecular dynamics trajectories exhibiting shape fluctuations in Pt nanoclusters. This analysis provides an easily interpretable coarse representation of the reaction pathways in terms of a handful of clusters, in contrast to the raw trajectory that contains thousands of unique states and tens of thousands of transitions.

  14. Advances in the development of the Mexican platform for analysis and design of nuclear reactors: AZTLAN Platform

    International Nuclear Information System (INIS)

    Gomez T, A. M.; Puente E, F.; Del Valle G, E.; Francois L, J. L.; Espinosa P, G.

    2017-09-01

    The AZTLAN platform project: development of a Mexican platform for the analysis and design of nuclear reactors, financed by the SENER-CONACYT Energy Sustain ability Fund, was approved in early 2014 and formally began at the end of that year. It is a national project led by the Instituto Nacional de Investigaciones Nucleares (ININ) and with the collaboration of Instituto Politecnico Nacional (IPN), the Universidad Autonoma Metropolitana (UAM) and Universidad Nacional Autonoma de Mexico (UNAM) as part of the development team and with the participation of the Laguna Verde Nuclear Power Plant, the National Commission of Nuclear Safety and Safeguards, the Ministry of Energy and the Karlsruhe Institute of Technology (Kit, Germany) as part of the user group. The general objective of the project is to modernize, improve and integrate the neutronic, thermo-hydraulic and thermo-mechanical codes, developed in Mexican institutions, in an integrated platform, developed and maintained by Mexican experts for the benefit of Mexican institutions. Two years into the process, important steps have been taken that have consolidated the platform. The main results of these first two years have been presented in different national and international forums. In this congress, some of the most recent results that have been implemented in the platform codes are shown in more detail. The current status of the platform from a more executive view point is summarized in this paper. (Author)

  15. Clustering of users of digital libraries through log file analysis

    Directory of Open Access Journals (Sweden)

    Juan Antonio Martínez-Comeche

    2017-09-01

    Full Text Available This study analyzes how users perform information retrieval tasks when introducing queries to the Hispanic Digital Library. Clusters of users are differentiated based on their distinct information behavior. The study used the log files collected by the server over a year and different possible clustering algorithms are compared. The k-means algorithm is found to be a suitable clustering method for the analysis of large log files from digital libraries. In the case of the Hispanic Digital Library the results show three clusters of users and the characteristic information behavior of each group is described.

  16. Design of a Golf Swing Injury Detection and Evaluation open service platform with Ontology-oriented clustering case-based reasoning mechanism.

    Science.gov (United States)

    Ku, Hao-Hsiang

    2015-01-01

    Nowadays, people can easily use a smartphone to get wanted information and requested services. Hence, this study designs and proposes a Golf Swing Injury Detection and Evaluation open service platform with Ontology-oritened clustering case-based reasoning mechanism, which is called GoSIDE, based on Arduino and Open Service Gateway initative (OSGi). GoSIDE is a three-tier architecture, which is composed of Mobile Users, Application Servers and a Cloud-based Digital Convergence Server. A mobile user is with a smartphone and Kinect sensors to detect the user's Golf swing actions and to interact with iDTV. An application server is with Intelligent Golf Swing Posture Analysis Model (iGoSPAM) to check a user's Golf swing actions and to alter this user when he is with error actions. Cloud-based Digital Convergence Server is with Ontology-oriented Clustering Case-based Reasoning (CBR) for Quality of Experiences (OCC4QoE), which is designed to provide QoE services by QoE-based Ontology strategies, rules and events for this user. Furthermore, GoSIDE will automatically trigger OCC4QoE and deliver popular rules for a new user. Experiment results illustrate that GoSIDE can provide appropriate detections for Golfers. Finally, GoSIDE can be a reference model for researchers and engineers.

  17. Analysis of Network Clustering Algorithms and Cluster Quality Metrics at Scale.

    Science.gov (United States)

    Emmons, Scott; Kobourov, Stephen; Gallant, Mike; Börner, Katy

    2016-01-01

    Notions of community quality underlie the clustering of networks. While studies surrounding network clustering are increasingly common, a precise understanding of the realtionship between different cluster quality metrics is unknown. In this paper, we examine the relationship between stand-alone cluster quality metrics and information recovery metrics through a rigorous analysis of four widely-used network clustering algorithms-Louvain, Infomap, label propagation, and smart local moving. We consider the stand-alone quality metrics of modularity, conductance, and coverage, and we consider the information recovery metrics of adjusted Rand score, normalized mutual information, and a variant of normalized mutual information used in previous work. Our study includes both synthetic graphs and empirical data sets of sizes varying from 1,000 to 1,000,000 nodes. We find significant differences among the results of the different cluster quality metrics. For example, clustering algorithms can return a value of 0.4 out of 1 on modularity but score 0 out of 1 on information recovery. We find conductance, though imperfect, to be the stand-alone quality metric that best indicates performance on the information recovery metrics. Additionally, our study shows that the variant of normalized mutual information used in previous work cannot be assumed to differ only slightly from traditional normalized mutual information. Smart local moving is the overall best performing algorithm in our study, but discrepancies between cluster evaluation metrics prevent us from declaring it an absolutely superior algorithm. Interestingly, Louvain performed better than Infomap in nearly all the tests in our study, contradicting the results of previous work in which Infomap was superior to Louvain. We find that although label propagation performs poorly when clusters are less clearly defined, it scales efficiently and accurately to large graphs with well-defined clusters.

  18. A SURVEY ON DOCUMENT CLUSTERING APPROACH FOR COMPUTER FORENSIC ANALYSIS

    OpenAIRE

    Monika Raghuvanshi*, Rahul Patel

    2016-01-01

    In a forensic analysis, large numbers of files are examined. Much of the information comprises of in unstructured format, so it’s quite difficult task for computer forensic to perform such analysis. That’s why to do the forensic analysis of document within a limited period of time require a special approach such as document clustering. This paper review different document clustering algorithms methodologies for example K-mean, K-medoid, single link, complete link, average link in accorandance...

  19. PIVOT: platform for interactive analysis and visualization of transcriptomics data.

    Science.gov (United States)

    Zhu, Qin; Fisher, Stephen A; Dueck, Hannah; Middleton, Sarah; Khaladkar, Mugdha; Kim, Junhyong

    2018-01-05

    Many R packages have been developed for transcriptome analysis but their use often requires familiarity with R and integrating results of different packages requires scripts to wrangle the datatypes. Furthermore, exploratory data analyses often generate multiple derived datasets such as data subsets or data transformations, which can be difficult to track. Here we present PIVOT, an R-based platform that wraps open source transcriptome analysis packages with a uniform user interface and graphical data management that allows non-programmers to interactively explore transcriptomics data. PIVOT supports more than 40 popular open source packages for transcriptome analysis and provides an extensive set of tools for statistical data manipulations. A graph-based visual interface is used to represent the links between derived datasets, allowing easy tracking of data versions. PIVOT further supports automatic report generation, publication-quality plots, and program/data state saving, such that all analysis can be saved, shared and reproduced. PIVOT will allow researchers with broad background to easily access sophisticated transcriptome analysis tools and interactively explore transcriptome datasets.

  20. MiSeq: A Next Generation Sequencing Platform for Genomic Analysis.

    Science.gov (United States)

    Ravi, Rupesh Kanchi; Walton, Kendra; Khosroheidari, Mahdieh

    2018-01-01

    MiSeq, Illumina's integrated next generation sequencing instrument, uses reversible-terminator sequencing-by-synthesis technology to provide end-to-end sequencing solutions. The MiSeq instrument is one of the smallest benchtop sequencers that can perform onboard cluster generation, amplification, genomic DNA sequencing, and data analysis, including base calling, alignment and variant calling, in a single run. It performs both single- and paired-end runs with adjustable read lengths from 1 × 36 base pairs to 2 × 300 base pairs. A single run can produce output data of up to 15 Gb in as little as 4 h of runtime and can output up to 25 M single reads and 50 M paired-end reads. Thus, MiSeq provides an ideal platform for rapid turnaround time. MiSeq is also a cost-effective tool for various analyses focused on targeted gene sequencing (amplicon sequencing and target enrichment), metagenomics, and gene expression studies. For these reasons, MiSeq has become one of the most widely used next generation sequencing platforms. Here, we provide a protocol to prepare libraries for sequencing using the MiSeq instrument and basic guidelines for analysis of output data from the MiSeq sequencing run.

  1. VAAPA: a web platform for visualization and analysis of alternative polyadenylation.

    Science.gov (United States)

    Guan, Jinting; Fu, Jingyi; Wu, Mingcheng; Chen, Longteng; Ji, Guoli; Quinn Li, Qingshun; Wu, Xiaohui

    2015-02-01

    Polyadenylation [poly(A)] is an essential process during the maturation of most mRNAs in eukaryotes. Alternative polyadenylation (APA) as an important layer of gene expression regulation has been increasingly recognized in various species. Here, a web platform for visualization and analysis of alternative polyadenylation (VAAPA) was developed. This platform can visualize the distribution of poly(A) sites and poly(A) clusters of a gene or a section of a chromosome. It can also highlight genes with switched APA sites among different conditions. VAAPA is an easy-to-use web-based tool that provides functions of poly(A) site query, data uploading, downloading, and APA sites visualization. It was designed in a multi-tier architecture and developed based on Smart GWT (Google Web Toolkit) using Java as the development language. VAAPA will be a valuable addition to the community for the comprehensive study of APA, not only by making the high quality poly(A) site data more accessible, but also by providing users with numerous valuable functions for poly(A) site analysis and visualization. Copyright © 2014 Elsevier Ltd. All rights reserved.

  2. LeARN: a platform for detecting, clustering and annotating non-coding RNAs

    Directory of Open Access Journals (Sweden)

    Schiex Thomas

    2008-01-01

    Full Text Available Abstract Background In the last decade, sequencing projects have led to the development of a number of annotation systems dedicated to the structural and functional annotation of protein-coding genes. These annotation systems manage the annotation of the non-protein coding genes (ncRNAs in a very crude way, allowing neither the edition of the secondary structures nor the clustering of ncRNA genes into families which are crucial for appropriate annotation of these molecules. Results LeARN is a flexible software package which handles the complete process of ncRNA annotation by integrating the layers of automatic detection and human curation. Conclusion This software provides the infrastructure to deal properly with ncRNAs in the framework of any annotation project. It fills the gap between existing prediction software, that detect independent ncRNA occurrences, and public ncRNA repositories, that do not offer the flexibility and interactivity required for annotation projects. The software is freely available from the download section of the website http://bioinfo.genopole-toulouse.prd.fr/LeARN

  3. Merging Galaxy Clusters: Analysis of Simulated Analogs

    Science.gov (United States)

    Nguyen, Jayke; Wittman, David; Cornell, Hunter

    2018-01-01

    The nature of dark matter can be better constrained by observing merging galaxy clusters. However, uncertainty in the viewing angle leads to uncertainty in dynamical quantities such as 3-d velocities, 3-d separations, and time since pericenter. The classic timing argument links these quantities via equations of motion, but neglects effects of nonzero impact parameter (i.e. it assumes velocities are parallel to the separation vector), dynamical friction, substructure, and larger-scale environment. We present a new approach using n-body cosmological simulations that naturally incorporate these effects. By uniformly sampling viewing angles about simulated cluster analogs, we see projected merger parameters in the many possible configurations of a given cluster. We select comparable simulated analogs and evaluate the likelihood of particular merger parameters as a function of viewing angle. We present viewing angle constraints for a sample of observed mergers including the Bullet cluster and El Gordo, and show that the separation vectors are closer to the plane of the sky than previously reported.

  4. Regular and platform switching: bone stress analysis varying implant type.

    Science.gov (United States)

    Gurgel-Juarez, Nália Cecília; de Almeida, Erika Oliveira; Rocha, Eduardo Passos; Freitas, Amílcar Chagas; Anchieta, Rodolfo Bruniera; de Vargas, Luis Carlos Merçon; Kina, Sidney; França, Fabiana Mantovani Gomes

    2012-04-01

    This study aimed to evaluate stress distribution on peri-implant bone simulating the influence of platform switching in external and internal hexagon implants using three-dimensional finite element analysis. Four mathematical models of a central incisor supported by an implant were created: External Regular model (ER) with 5.0 mm × 11.5 mm external hexagon implant and 5.0 mm abutment (0% abutment shifting), Internal Regular model (IR) with 4.5 mm × 11.5 mm internal hexagon implant and 4.5 mm abutment (0% abutment shifting), External Switching model (ES) with 5.0 mm × 11.5 mm external hexagon implant and 4.1 mm abutment (18% abutment shifting), and Internal Switching model (IS) with 4.5 mm × 11.5 mm internal hexagon implant and 3.8 mm abutment (15% abutment shifting). The models were created by SolidWorks software. The numerical analysis was performed using ANSYS Workbench. Oblique forces (100 N) were applied to the palatal surface of the central incisor. The maximum (σ(max)) and minimum (σ(min)) principal stress, equivalent von Mises stress (σ(vM)), and maximum principal elastic strain (ε(max)) values were evaluated for the cortical and trabecular bone. For cortical bone, the highest stress values (σ(max) and σ(vm) ) (MPa) were observed in IR (87.4 and 82.3), followed by IS (83.3 and 72.4), ER (82 and 65.1), and ES (56.7 and 51.6). For ε(max), IR showed the highest stress (5.46e-003), followed by IS (5.23e-003), ER (5.22e-003), and ES (3.67e-003). For the trabecular bone, the highest stress values (σ(max)) (MPa) were observed in ER (12.5), followed by IS (12), ES (11.9), and IR (4.95). For σ(vM), the highest stress values (MPa) were observed in IS (9.65), followed by ER (9.3), ES (8.61), and IR (5.62). For ε(max) , ER showed the highest stress (5.5e-003), followed by ES (5.43e-003), IS (3.75e-003), and IR (3.15e-003). The influence of platform switching was more evident for cortical bone than for trabecular bone, mainly for the external hexagon

  5. Automics: an integrated platform for NMR-based metabonomics spectral processing and data analysis

    Directory of Open Access Journals (Sweden)

    Qu Lijia

    2009-03-01

    Full Text Available Abstract Background Spectral processing and post-experimental data analysis are the major tasks in NMR-based metabonomics studies. While there are commercial and free licensed software tools available to assist these tasks, researchers usually have to use multiple software packages for their studies because software packages generally focus on specific tasks. It would be beneficial to have a highly integrated platform, in which these tasks can be completed within one package. Moreover, with open source architecture, newly proposed algorithms or methods for spectral processing and data analysis can be implemented much more easily and accessed freely by the public. Results In this paper, we report an open source software tool, Automics, which is specifically designed for NMR-based metabonomics studies. Automics is a highly integrated platform that provides functions covering almost all the stages of NMR-based metabonomics studies. Automics provides high throughput automatic modules with most recently proposed algorithms and powerful manual modules for 1D NMR spectral processing. In addition to spectral processing functions, powerful features for data organization, data pre-processing, and data analysis have been implemented. Nine statistical methods can be applied to analyses including: feature selection (Fisher's criterion, data reduction (PCA, LDA, ULDA, unsupervised clustering (K-Mean and supervised regression and classification (PLS/PLS-DA, KNN, SIMCA, SVM. Moreover, Automics has a user-friendly graphical interface for visualizing NMR spectra and data analysis results. The functional ability of Automics is demonstrated with an analysis of a type 2 diabetes metabolic profile. Conclusion Automics facilitates high throughput 1D NMR spectral processing and high dimensional data analysis for NMR-based metabonomics applications. Using Automics, users can complete spectral processing and data analysis within one software package in most cases

  6. Automics: an integrated platform for NMR-based metabonomics spectral processing and data analysis.

    Science.gov (United States)

    Wang, Tao; Shao, Kang; Chu, Qinying; Ren, Yanfei; Mu, Yiming; Qu, Lijia; He, Jie; Jin, Changwen; Xia, Bin

    2009-03-16

    Spectral processing and post-experimental data analysis are the major tasks in NMR-based metabonomics studies. While there are commercial and free licensed software tools available to assist these tasks, researchers usually have to use multiple software packages for their studies because software packages generally focus on specific tasks. It would be beneficial to have a highly integrated platform, in which these tasks can be completed within one package. Moreover, with open source architecture, newly proposed algorithms or methods for spectral processing and data analysis can be implemented much more easily and accessed freely by the public. In this paper, we report an open source software tool, Automics, which is specifically designed for NMR-based metabonomics studies. Automics is a highly integrated platform that provides functions covering almost all the stages of NMR-based metabonomics studies. Automics provides high throughput automatic modules with most recently proposed algorithms and powerful manual modules for 1D NMR spectral processing. In addition to spectral processing functions, powerful features for data organization, data pre-processing, and data analysis have been implemented. Nine statistical methods can be applied to analyses including: feature selection (Fisher's criterion), data reduction (PCA, LDA, ULDA), unsupervised clustering (K-Mean) and supervised regression and classification (PLS/PLS-DA, KNN, SIMCA, SVM). Moreover, Automics has a user-friendly graphical interface for visualizing NMR spectra and data analysis results. The functional ability of Automics is demonstrated with an analysis of a type 2 diabetes metabolic profile. Automics facilitates high throughput 1D NMR spectral processing and high dimensional data analysis for NMR-based metabonomics applications. Using Automics, users can complete spectral processing and data analysis within one software package in most cases. Moreover, with its open source architecture, interested

  7. Platforms for Single-Cell Collection and Analysis

    Directory of Open Access Journals (Sweden)

    Lukas Valihrach

    2018-03-01

    Full Text Available Single-cell analysis has become an established method to study cell heterogeneity and for rare cell characterization. Despite the high cost and technical constraints, applications are increasing every year in all fields of biology. Following the trend, there is a tremendous development of tools for single-cell analysis, especially in the RNA sequencing field. Every improvement increases sensitivity and throughput. Collecting a large amount of data also stimulates the development of new approaches for bioinformatic analysis and interpretation. However, the essential requirement for any analysis is the collection of single cells of high quality. The single-cell isolation must be fast, effective, and gentle to maintain the native expression profiles. Classical methods for single-cell isolation are micromanipulation, microdissection, and fluorescence-activated cell sorting (FACS. In the last decade several new and highly efficient approaches have been developed, which not just supplement but may fully replace the traditional ones. These new techniques are based on microfluidic chips, droplets, micro-well plates, and automatic collection of cells using capillaries, magnets, an electric field, or a punching probe. In this review we summarize the current methods and developments in this field. We discuss the advantages of the different commercially available platforms and their applicability, and also provide remarks on future developments.

  8. Platforms for Single-Cell Collection and Analysis.

    Science.gov (United States)

    Valihrach, Lukas; Androvic, Peter; Kubista, Mikael

    2018-03-11

    Single-cell analysis has become an established method to study cell heterogeneity and for rare cell characterization. Despite the high cost and technical constraints, applications are increasing every year in all fields of biology. Following the trend, there is a tremendous development of tools for single-cell analysis, especially in the RNA sequencing field. Every improvement increases sensitivity and throughput. Collecting a large amount of data also stimulates the development of new approaches for bioinformatic analysis and interpretation. However, the essential requirement for any analysis is the collection of single cells of high quality. The single-cell isolation must be fast, effective, and gentle to maintain the native expression profiles. Classical methods for single-cell isolation are micromanipulation, microdissection, and fluorescence-activated cell sorting (FACS). In the last decade several new and highly efficient approaches have been developed, which not just supplement but may fully replace the traditional ones. These new techniques are based on microfluidic chips, droplets, micro-well plates, and automatic collection of cells using capillaries, magnets, an electric field, or a punching probe. In this review we summarize the current methods and developments in this field. We discuss the advantages of the different commercially available platforms and their applicability, and also provide remarks on future developments.

  9. Geospatial analysis platform: Supporting strategic spatial analysis and planning

    CSIR Research Space (South Africa)

    Naude, A

    2008-11-01

    Full Text Available Whilst there have been rapid advances in satellite imagery and related fine resolution mapping and web-based interfaces (e.g. Google Earth), the development of capabilities for strategic spatial analysis and planning support has lagged behind...

  10. Analysis of Aspects of Innovation in a Brazilian Cluster

    Directory of Open Access Journals (Sweden)

    Adriana Valélia Saraceni

    2012-09-01

    Full Text Available Innovation through clustering has become very important on the increased significance that interaction represents on innovation and learning process concept. This study aims to identify whereas a case analysis on innovation process in a cluster represents on the learning process. Therefore, this study is developed in two stages. First, we used a preliminary case study verifying a cluster innovation analysis and it Innovation Index, for further, exploring a combined body of theory and practice. Further, the second stage is developed by exploring the learning process concept. Both stages allowed us building a theory model for the learning process development in clusters. The main results of the model development come up with a mechanism of improvement implementation on clusters when case studies are applied.

  11. A Flocking Based algorithm for Document Clustering Analysis

    Energy Technology Data Exchange (ETDEWEB)

    Cui, Xiaohui [ORNL; Gao, Jinzhu [ORNL; Potok, Thomas E [ORNL

    2006-01-01

    Social animals or insects in nature often exhibit a form of emergent collective behavior known as flocking. In this paper, we present a novel Flocking based approach for document clustering analysis. Our Flocking clustering algorithm uses stochastic and heuristic principles discovered from observing bird flocks or fish schools. Unlike other partition clustering algorithm such as K-means, the Flocking based algorithm does not require initial partitional seeds. The algorithm generates a clustering of a given set of data through the embedding of the high-dimensional data items on a two-dimensional grid for easy clustering result retrieval and visualization. Inspired by the self-organized behavior of bird flocks, we represent each document object with a flock boid. The simple local rules followed by each flock boid result in the entire document flock generating complex global behaviors, which eventually result in a clustering of the documents. We evaluate the efficiency of our algorithm with both a synthetic dataset and a real document collection that includes 100 news articles collected from the Internet. Our results show that the Flocking clustering algorithm achieves better performance compared to the K- means and the Ant clustering algorithm for real document clustering.

  12. Reproducibility of Cognitive Profiles in Psychosis Using Cluster Analysis.

    Science.gov (United States)

    Lewandowski, Kathryn E; Baker, Justin T; McCarthy, Julie M; Norris, Lesley A; Öngür, Dost

    2018-04-01

    Cognitive dysfunction is a core symptom dimension that cuts across the psychoses. Recent findings support classification of patients along the cognitive dimension using cluster analysis; however, data-derived groupings may be highly determined by sampling characteristics and the measures used to derive the clusters, and so their interpretability must be established. We examined cognitive clusters in a cross-diagnostic sample of patients with psychosis and associations with clinical and functional outcomes. We then compared our findings to a previous report of cognitive clusters in a separate sample using a different cognitive battery. Participants with affective or non-affective psychosis (n=120) and healthy controls (n=31) were administered the MATRICS Consensus Cognitive Battery, and clinical and community functioning assessments. Cluster analyses were performed on cognitive variables, and clusters were compared on demographic, cognitive, and clinical measures. Results were compared to findings from our previous report. A four-cluster solution provided a good fit to the data; profiles included a neuropsychologically normal cluster, a globally impaired cluster, and two clusters of mixed profiles. Cognitive burden was associated with symptom severity and poorer community functioning. The patterns of cognitive performance by cluster were highly consistent with our previous findings. We found evidence of four cognitive subgroups of patients with psychosis, with cognitive profiles that map closely to those produced in our previous work. Clusters were associated with clinical and community variables and a measure of premorbid functioning, suggesting that they reflect meaningful groupings: replicable, and related to clinical presentation and functional outcomes. (JINS, 2018, 24, 382-390).

  13. Network Analysis Tools: from biological networks to clusters and pathways.

    Science.gov (United States)

    Brohée, Sylvain; Faust, Karoline; Lima-Mendez, Gipsi; Vanderstocken, Gilles; van Helden, Jacques

    2008-01-01

    Network Analysis Tools (NeAT) is a suite of computer tools that integrate various algorithms for the analysis of biological networks: comparison between graphs, between clusters, or between graphs and clusters; network randomization; analysis of degree distribution; network-based clustering and path finding. The tools are interconnected to enable a stepwise analysis of the network through a complete analytical workflow. In this protocol, we present a typical case of utilization, where the tasks above are combined to decipher a protein-protein interaction network retrieved from the STRING database. The results returned by NeAT are typically subnetworks, networks enriched with additional information (i.e., clusters or paths) or tables displaying statistics. Typical networks comprising several thousands of nodes and arcs can be analyzed within a few minutes. The complete protocol can be read and executed in approximately 1 h.

  14. RadMAP: The Radiological Multi-sensor Analysis Platform

    International Nuclear Information System (INIS)

    Bandstra, Mark S.; Aucott, Timothy J.; Brubaker, Erik; Chivers, Daniel H.; Cooper, Reynold J.; Curtis, Joseph C.; Davis, John R.; Joshi, Tenzing H.; Kua, John; Meyer, Ross; Negut, Victor; Quinlan, Michael; Quiter, Brian J.; Srinivasan, Shreyas; Zakhor, Avideh; Zhang, Richard; Vetter, Kai

    2016-01-01

    The variability of gamma-ray and neutron background during the operation of a mobile detector system greatly limits the ability of the system to detect weak radiological and nuclear threats. The natural radiation background measured by a mobile detector system is the result of many factors, including the radioactivity of nearby materials, the geometric configuration of those materials and the system, the presence of absorbing materials, and atmospheric conditions. Background variations tend to be highly non-Poissonian, making it difficult to set robust detection thresholds using knowledge of the mean background rate alone. The Radiological Multi-sensor Analysis Platform (RadMAP) system is designed to allow the systematic study of natural radiological background variations and to serve as a development platform for emerging concepts in mobile radiation detection and imaging. To do this, RadMAP has been used to acquire extensive, systematic background measurements and correlated contextual data that can be used to test algorithms and detector modalities at low false alarm rates. By combining gamma-ray and neutron detector systems with data from contextual sensors, the system enables the fusion of data from multiple sensors into novel data products. The data are curated in a common format that allows for rapid querying across all sensors, creating detailed multi-sensor datasets that are used to study correlations between radiological and contextual data, and develop and test novel techniques in mobile detection and imaging. In this paper we will describe the instruments that comprise the RadMAP system, the effort to curate and provide access to multi-sensor data, and some initial results on the fusion of contextual and radiological data.

  15. RadMAP: The Radiological Multi-sensor Analysis Platform

    Energy Technology Data Exchange (ETDEWEB)

    Bandstra, Mark S., E-mail: msbandstra@lbl.gov [Nuclear Science Division, Lawrence Berkeley National Laboratory, Berkeley, CA (United States); Aucott, Timothy J. [Department of Nuclear Engineering, University of California Berkeley, CA (United States); Brubaker, Erik [Sandia National Laboratory, Livermore, CA (United States); Chivers, Daniel H.; Cooper, Reynold J. [Nuclear Science Division, Lawrence Berkeley National Laboratory, Berkeley, CA (United States); Curtis, Joseph C. [Nuclear Science Division, Lawrence Berkeley National Laboratory, Berkeley, CA (United States); Department of Nuclear Engineering, University of California Berkeley, CA (United States); Davis, John R. [Department of Nuclear Engineering, University of California Berkeley, CA (United States); Joshi, Tenzing H.; Kua, John; Meyer, Ross; Negut, Victor; Quinlan, Michael; Quiter, Brian J. [Nuclear Science Division, Lawrence Berkeley National Laboratory, Berkeley, CA (United States); Srinivasan, Shreyas [Department of Nuclear Engineering, University of California Berkeley, CA (United States); Department of Electrical Engineering and Computer Science, University of California Berkeley, CA (United States); Zakhor, Avideh; Zhang, Richard [Department of Electrical Engineering and Computer Science, University of California Berkeley, CA (United States); Vetter, Kai [Nuclear Science Division, Lawrence Berkeley National Laboratory, Berkeley, CA (United States); Department of Nuclear Engineering, University of California Berkeley, CA (United States)

    2016-12-21

    The variability of gamma-ray and neutron background during the operation of a mobile detector system greatly limits the ability of the system to detect weak radiological and nuclear threats. The natural radiation background measured by a mobile detector system is the result of many factors, including the radioactivity of nearby materials, the geometric configuration of those materials and the system, the presence of absorbing materials, and atmospheric conditions. Background variations tend to be highly non-Poissonian, making it difficult to set robust detection thresholds using knowledge of the mean background rate alone. The Radiological Multi-sensor Analysis Platform (RadMAP) system is designed to allow the systematic study of natural radiological background variations and to serve as a development platform for emerging concepts in mobile radiation detection and imaging. To do this, RadMAP has been used to acquire extensive, systematic background measurements and correlated contextual data that can be used to test algorithms and detector modalities at low false alarm rates. By combining gamma-ray and neutron detector systems with data from contextual sensors, the system enables the fusion of data from multiple sensors into novel data products. The data are curated in a common format that allows for rapid querying across all sensors, creating detailed multi-sensor datasets that are used to study correlations between radiological and contextual data, and develop and test novel techniques in mobile detection and imaging. In this paper we will describe the instruments that comprise the RadMAP system, the effort to curate and provide access to multi-sensor data, and some initial results on the fusion of contextual and radiological data.

  16. Avogadro: an advanced semantic chemical editor, visualization, and analysis platform.

    Science.gov (United States)

    Hanwell, Marcus D; Curtis, Donald E; Lonie, David C; Vandermeersch, Tim; Zurek, Eva; Hutchison, Geoffrey R

    2012-08-13

    The Avogadro project has developed an advanced molecule editor and visualizer designed for cross-platform use in computational chemistry, molecular modeling, bioinformatics, materials science, and related areas. It offers flexible, high quality rendering, and a powerful plugin architecture. Typical uses include building molecular structures, formatting input files, and analyzing output of a wide variety of computational chemistry packages. By using the CML file format as its native document type, Avogadro seeks to enhance the semantic accessibility of chemical data types. The work presented here details the Avogadro library, which is a framework providing a code library and application programming interface (API) with three-dimensional visualization capabilities; and has direct applications to research and education in the fields of chemistry, physics, materials science, and biology. The Avogadro application provides a rich graphical interface using dynamically loaded plugins through the library itself. The application and library can each be extended by implementing a plugin module in C++ or Python to explore different visualization techniques, build/manipulate molecular structures, and interact with other programs. We describe some example extensions, one which uses a genetic algorithm to find stable crystal structures, and one which interfaces with the PackMol program to create packed, solvated structures for molecular dynamics simulations. The 1.0 release series of Avogadro is the main focus of the results discussed here. Avogadro offers a semantic chemical builder and platform for visualization and analysis. For users, it offers an easy-to-use builder, integrated support for downloading from common databases such as PubChem and the Protein Data Bank, extracting chemical data from a wide variety of formats, including computational chemistry output, and native, semantic support for the CML file format. For developers, it can be easily extended via a powerful

  17. Cluster analysis of typhoid cases in Kota Bharu, Kelantan, Malaysia

    Directory of Open Access Journals (Sweden)

    Nazarudin Safian

    2008-09-01

    Full Text Available Typhoid fever is still a major public health problem globally as well as in Malaysia. This study was done to identify the spatial epidemiology of typhoid fever in the Kota Bharu District of Malaysia as a first step to developing more advanced analysis of the whole country. The main characteristic of the epidemiological pattern that interested us was whether typhoid cases occurred in clusters or whether they were evenly distributed throughout the area. We also wanted to know at what spatial distances they were clustered. All confirmed typhoid cases that were reported to the Kota Bharu District Health Department from the year 2001 to June of 2005 were taken as the samples. From the home address of the cases, the location of the house was traced and a coordinate was taken using handheld GPS devices. Spatial statistical analysis was done to determine the distribution of typhoid cases, whether clustered, random or dispersed. The spatial statistical analysis was done using CrimeStat III software to determine whether typhoid cases occur in clusters, and later on to determine at what distances it clustered. From 736 cases involved in the study there was significant clustering for cases occurring in the years 2001, 2002, 2003 and 2005. There was no significant clustering in year 2004. Typhoid clustering also occurred strongly for distances up to 6 km. This study shows that typhoid cases occur in clusters, and this method could be applicable to describe spatial epidemiology for a specific area. (Med J Indones 2008; 17: 175-82Keywords: typhoid, clustering, spatial epidemiology, GIS

  18. Effects of Group Size and Lack of Sphericity on the Recovery of Clusters in K-Means Cluster Analysis

    Science.gov (United States)

    de Craen, Saskia; Commandeur, Jacques J. F.; Frank, Laurence E.; Heiser, Willem J.

    2006-01-01

    K-means cluster analysis is known for its tendency to produce spherical and equally sized clusters. To assess the magnitude of these effects, a simulation study was conducted, in which populations were created with varying departures from sphericity and group sizes. An analysis of the recovery of clusters in the samples taken from these…

  19. End-to-end integrated security and performance analysis on the DEGAS Choreographer platform

    DEFF Research Database (Denmark)

    Buchholtz, Mikael; Gilmore, Stephen; Haenel, Valentin

    2005-01-01

    We present a software tool platform which facilitates security and performance analysis of systems which starts and ends with UML model descriptions. A UML project is presented to the platform for analysis, formal content is extracted in the form of process calculi descriptions, analysed with the......We present a software tool platform which facilitates security and performance analysis of systems which starts and ends with UML model descriptions. A UML project is presented to the platform for analysis, formal content is extracted in the form of process calculi descriptions, analysed...

  20. Using cluster analysis to organize and explore regional GPS velocities

    Science.gov (United States)

    Simpson, Robert W.; Thatcher, Wayne; Savage, James C.

    2012-01-01

    Cluster analysis offers a simple visual exploratory tool for the initial investigation of regional Global Positioning System (GPS) velocity observations, which are providing increasingly precise mappings of actively deforming continental lithosphere. The deformation fields from dense regional GPS networks can often be concisely described in terms of relatively coherent blocks bounded by active faults, although the choice of blocks, their number and size, can be subjective and is often guided by the distribution of known faults. To illustrate our method, we apply cluster analysis to GPS velocities from the San Francisco Bay Region, California, to search for spatially coherent patterns of deformation, including evidence of block-like behavior. The clustering process identifies four robust groupings of velocities that we identify with four crustal blocks. Although the analysis uses no prior geologic information other than the GPS velocities, the cluster/block boundaries track three major faults, both locked and creeping.

  1. Platform computing

    CERN Multimedia

    2002-01-01

    "Platform Computing releases first grid-enabled workload management solution for IBM eServer Intel and UNIX high performance computing clusters. This Out-of-the-box solution maximizes the performance and capability of applications on IBM HPC clusters" (1/2 page) .

  2. Multivalent adhesion molecule 7 clusters act as signaling platform for host cellular GTPase activation and facilitate epithelial barrier dysfunction.

    Directory of Open Access Journals (Sweden)

    Jenson Lim

    2014-09-01

    Full Text Available Vibrio parahaemolyticus is an emerging bacterial pathogen which colonizes the gastrointestinal tract and can cause severe enteritis and bacteraemia. During infection, V. parahaemolyticus primarily attaches to the small intestine, where it causes extensive tissue damage and compromises epithelial barrier integrity. We have previously described that Multivalent Adhesion Molecule (MAM 7 contributes to initial attachment of V. parahaemolyticus to epithelial cells. Here we show that the bacterial adhesin, through multivalent interactions between surface-induced adhesin clusters and phosphatidic acid lipids in the host cell membrane, induces activation of the small GTPase RhoA and actin rearrangements in host cells. In infection studies with V. parahaemolyticus we further demonstrate that adhesin-triggered activation of the ROCK/LIMK signaling axis is sufficient to redistribute tight junction proteins, leading to a loss of epithelial barrier function. Taken together, these findings show an unprecedented mechanism by which an adhesin acts as assembly platform for a host cellular signaling pathway, which ultimately facilitates breaching of the epithelial barrier by a bacterial pathogen.

  3. A multilevel Lab on chip platform for DNA analysis.

    Science.gov (United States)

    Marasso, Simone Luigi; Giuri, Eros; Canavese, Giancarlo; Castagna, Riccardo; Quaglio, Marzia; Ferrante, Ivan; Perrone, Denis; Cocuzza, Matteo

    2011-02-01

    Lab-on-chips (LOCs) are critical systems that have been introduced to speed up and reduce the cost of traditional, laborious and extensive analyses in biological and biomedical fields. These ambitious and challenging issues ask for multi-disciplinary competences that range from engineering to biology. Starting from the aim to integrate microarray technology and microfluidic devices, a complex multilevel analysis platform has been designed, fabricated and tested (All rights reserved-IT Patent number TO2009A000915). This LOC successfully manages to interface microfluidic channels with standard DNA microarray glass slides, in order to implement a complete biological protocol. Typical Micro Electro Mechanical Systems (MEMS) materials and process technologies were employed. A silicon/glass microfluidic chip and a Polydimethylsiloxane (PDMS) reaction chamber were fabricated and interfaced with a standard microarray glass slide. In order to have a high disposable system all micro-elements were passive and an external apparatus provided fluidic driving and thermal control. The major microfluidic and handling problems were investigated and innovative solutions were found. Finally, an entirely automated DNA hybridization protocol was successfully tested with a significant reduction in analysis time and reagent consumption with respect to a conventional protocol.

  4. GWATCH: a web platform for automated gene association discovery analysis

    Science.gov (United States)

    2014-01-01

    Background As genome-wide sequence analyses for complex human disease determinants are expanding, it is increasingly necessary to develop strategies to promote discovery and validation of potential disease-gene associations. Findings Here we present a dynamic web-based platform – GWATCH – that automates and facilitates four steps in genetic epidemiological discovery: 1) Rapid gene association search and discovery analysis of large genome-wide datasets; 2) Expanded visual display of gene associations for genome-wide variants (SNPs, indels, CNVs), including Manhattan plots, 2D and 3D snapshots of any gene region, and a dynamic genome browser illustrating gene association chromosomal regions; 3) Real-time validation/replication of candidate or putative genes suggested from other sources, limiting Bonferroni genome-wide association study (GWAS) penalties; 4) Open data release and sharing by eliminating privacy constraints (The National Human Genome Research Institute (NHGRI) Institutional Review Board (IRB), informed consent, The Health Insurance Portability and Accountability Act (HIPAA) of 1996 etc.) on unabridged results, which allows for open access comparative and meta-analysis. Conclusions GWATCH is suitable for both GWAS and whole genome sequence association datasets. We illustrate the utility of GWATCH with three large genome-wide association studies for HIV-AIDS resistance genes screened in large multicenter cohorts; however, association datasets from any study can be uploaded and analyzed by GWATCH. PMID:25374661

  5. LOOS: an extensible platform for the structural analysis of simulations.

    Science.gov (United States)

    Romo, Tod D; Grossfield, Alan

    2009-01-01

    We have developed LOOS (Lightweight Object-Oriented Structure-analysis library) as an object-oriented library designed to facilitate the rapid development of tools for the structural analysis of simulations. LOOS supports the native file formats of most common simulation packages including AMBER, CHARMM, CNS, Gromacs, NAMD, Tinker, and X-PLOR. Encapsulation and polymorphism are used to simultaneously provide a stable interface to the programmer and make LOOS easily extensible. A rich atom selection language based on the C expression syntax is included as part of the library. LOOS enables students and casual programmer-scientists to rapidly write their own analytical tools in a compact and expressive manner resembling scripting. LOOS is written in C++ and makes extensive use of the Standard Template Library and Boost, and is freely available under the GNU General Public License (version 3) LOOS has been tested on Linux and MacOS X, but is written to be portable and should work on most Unix-based platforms.

  6. PRANAS: A New Platform for Retinal Analysis and Simulation

    Directory of Open Access Journals (Sweden)

    Bruno Cessac

    2017-09-01

    Full Text Available The retina encodes visual scenes by trains of action potentials that are sent to the brain via the optic nerve. In this paper, we describe a new free access user-end software allowing to better understand this coding. It is called PRANAS (https://pranas.inria.fr, standing for Platform for Retinal ANalysis And Simulation. PRANAS targets neuroscientists and modelers by providing a unique set of retina-related tools. PRANAS integrates a retina simulator allowing large scale simulations while keeping a strong biological plausibility and a toolbox for the analysis of spike train population statistics. The statistical method (entropy maximization under constraints takes into account both spatial and temporal correlations as constraints, allowing to analyze the effects of memory on statistics. PRANAS also integrates a tool computing and representing in 3D (time-space receptive fields. All these tools are accessible through a friendly graphical user interface. The most CPU-costly of them have been implemented to run in parallel.

  7. GECKO: a complete large-scale gene expression analysis platform

    Directory of Open Access Journals (Sweden)

    Heuer Michael

    2004-12-01

    Full Text Available Abstract Background Gecko (Gene Expression: Computation and Knowledge Organization is a complete, high-capacity centralized gene expression analysis system, developed in response to the needs of a distributed user community. Results Based on a client-server architecture, with a centralized repository of typically many tens of thousands of Affymetrix scans, Gecko includes automatic processing pipelines for uploading data from remote sites, a data base, a computational engine implementing ~ 50 different analysis tools, and a client application. Among available analysis tools are clustering methods, principal component analysis, supervised classification including feature selection and cross-validation, multi-factorial ANOVA, statistical contrast calculations, and various post-processing tools for extracting data at given error rates or significance levels. On account of its open architecture, Gecko also allows for the integration of new algorithms. The Gecko framework is very general: non-Affymetrix and non-gene expression data can be analyzed as well. A unique feature of the Gecko architecture is the concept of the Analysis Tree (actually, a directed acyclic graph, in which all successive results in ongoing analyses are saved. This approach has proven invaluable in allowing a large (~ 100 users and distributed community to share results, and to repeatedly return over a span of years to older and potentially very complex analyses of gene expression data. Conclusions The Gecko system is being made publicly available as free software http://sourceforge.net/projects/geckoe. In totality or in parts, the Gecko framework should prove useful to users and system developers with a broad range of analysis needs.

  8. Facies analysis of an Upper Jurassic carbonate platform for geothermal reservoir characterization

    Science.gov (United States)

    von Hartmann, Hartwig; Buness, Hermann; Dussel, Michael

    2017-04-01

    The Upper Jurassic Carbonate platform in Southern Germany is an important aquifer for the production of geothermal energy. Several successful projects were realized during the last years. 3D-seismic surveying has been established as a standard method for reservoir analysis and the definition of well paths. A project funded by the federal ministry of economic affairs and energy (BMWi) started in 2015 is a milestone for an exclusively regenerative heat energy supply of Munich. A 3D-seismic survey of 170 square kilometer was acquired and a scientific program was established to analyze the facies distribution within the area (http://www.liag-hannover.de/en/fsp/ge/geoparamol.html). Targets are primarily fault zones where one expect higher flow rates than within the undisturbed carbonate sediments. However, since a dense net of geothermal plants and wells will not always find appropriate fault areas, the reservoir properties should be analyzed in more detail, e.g. changing the viewpoint to karst features and facies distribution. Actual facies interpretation concepts are based on the alternation of massif and layered carbonates. Because of successive erosion of the ancient land surfaces, the interpretation of reefs, being an important target, is often difficult. We found that seismic sequence stratigraphy can explain the distribution of seismic pattern and improves the analysis of different facies. We supported this method by applying wavelet transformation of seismic data. The splitting of the seismic signal into successive parts of different bandwidths, especially the frequency content of the seismic signal, changed by tuning or dispersion, is extracted. The combination of different frequencies reveals a partition of the platform laterally as well as vertically. A cluster analysis of the wavelet coefficients further improves this picture. The interpretation shows a division into ramp, inner platform and trough, which were shifted locally and overprinted in time by other

  9. A Novel Divisive Hierarchical Clustering Algorithm for Geospatial Analysis

    Directory of Open Access Journals (Sweden)

    Shaoning Li

    2017-01-01

    Full Text Available In the fields of geographic information systems (GIS and remote sensing (RS, the clustering algorithm has been widely used for image segmentation, pattern recognition, and cartographic generalization. Although clustering analysis plays a key role in geospatial modelling, traditional clustering methods are limited due to computational complexity, noise resistant ability and robustness. Furthermore, traditional methods are more focused on the adjacent spatial context, which makes it hard for the clustering methods to be applied to multi-density discrete objects. In this paper, a new method, cell-dividing hierarchical clustering (CDHC, is proposed based on convex hull retraction. The main steps are as follows. First, a convex hull structure is constructed to describe the global spatial context of geospatial objects. Then, the retracting structure of each borderline is established in sequence by setting the initial parameter. The objects are split into two clusters (i.e., “sub-clusters” if the retracting structure intersects with the borderlines. Finally, clusters are repeatedly split and the initial parameter is updated until the terminate condition is satisfied. The experimental results show that CDHC separates the multi-density objects from noise sufficiently and also reduces complexity compared to the traditional agglomerative hierarchical clustering algorithm.

  10. A Distributed Flocking Approach for Information Stream Clustering Analysis

    Energy Technology Data Exchange (ETDEWEB)

    Cui, Xiaohui [ORNL; Potok, Thomas E [ORNL

    2006-01-01

    Intelligence analysts are currently overwhelmed with the amount of information streams generated everyday. There is a lack of comprehensive tool that can real-time analyze the information streams. Document clustering analysis plays an important role in improving the accuracy of information retrieval. However, most clustering technologies can only be applied for analyzing the static document collection because they normally require a large amount of computation resource and long time to get accurate result. It is very difficult to cluster a dynamic changed text information streams on an individual computer. Our early research has resulted in a dynamic reactive flock clustering algorithm which can continually refine the clustering result and quickly react to the change of document contents. This character makes the algorithm suitable for cluster analyzing dynamic changed document information, such as text information stream. Because of the decentralized character of this algorithm, a distributed approach is a very natural way to increase the clustering speed of the algorithm. In this paper, we present a distributed multi-agent flocking approach for the text information stream clustering and discuss the decentralized architectures and communication schemes for load balance and status information synchronization in this approach.

  11. Cluster analysis of clinical data identifies fibromyalgia subgroups.

    Directory of Open Access Journals (Sweden)

    Elisa Docampo

    Full Text Available INTRODUCTION: Fibromyalgia (FM is mainly characterized by widespread pain and multiple accompanying symptoms, which hinder FM assessment and management. In order to reduce FM heterogeneity we classified clinical data into simplified dimensions that were used to define FM subgroups. MATERIAL AND METHODS: 48 variables were evaluated in 1,446 Spanish FM cases fulfilling 1990 ACR FM criteria. A partitioning analysis was performed to find groups of variables similar to each other. Similarities between variables were identified and the variables were grouped into dimensions. This was performed in a subset of 559 patients, and cross-validated in the remaining 887 patients. For each sample and dimension, a composite index was obtained based on the weights of the variables included in the dimension. Finally, a clustering procedure was applied to the indexes, resulting in FM subgroups. RESULTS: VARIABLES CLUSTERED INTO THREE INDEPENDENT DIMENSIONS: "symptomatology", "comorbidities" and "clinical scales". Only the two first dimensions were considered for the construction of FM subgroups. Resulting scores classified FM samples into three subgroups: low symptomatology and comorbidities (Cluster 1, high symptomatology and comorbidities (Cluster 2, and high symptomatology but low comorbidities (Cluster 3, showing differences in measures of disease severity. CONCLUSIONS: We have identified three subgroups of FM samples in a large cohort of FM by clustering clinical data. Our analysis stresses the importance of family and personal history of FM comorbidities. Also, the resulting patient clusters could indicate different forms of the disease, relevant to future research, and might have an impact on clinical assessment.

  12. Clustering Trajectories by Relevant Parts for Air Traffic Analysis.

    Science.gov (United States)

    Andrienko, Gennady; Andrienko, Natalia; Fuchs, Georg; Garcia, Jose Manuel Cordero

    2018-01-01

    Clustering of trajectories of moving objects by similarity is an important technique in movement analysis. Existing distance functions assess the similarity between trajectories based on properties of the trajectory points or segments. The properties may include the spatial positions, times, and thematic attributes. There may be a need to focus the analysis on certain parts of trajectories, i.e., points and segments that have particular properties. According to the analysis focus, the analyst may need to cluster trajectories by similarity of their relevant parts only. Throughout the analysis process, the focus may change, and different parts of trajectories may become relevant. We propose an analytical workflow in which interactive filtering tools are used to attach relevance flags to elements of trajectories, clustering is done using a distance function that ignores irrelevant elements, and the resulting clusters are summarized for further analysis. We demonstrate how this workflow can be useful for different analysis tasks in three case studies with real data from the domain of air traffic. We propose a suite of generic techniques and visualization guidelines to support movement data analysis by means of relevance-aware trajectory clustering.

  13. Cluster analysis of Southeastern U.S. climate stations

    Science.gov (United States)

    Stooksbury, D. E.; Michaels, P. J.

    1991-09-01

    A two-step cluster analysis of 449 Southeastern climate stations is used to objectively determine general climate clusters (groups of climate stations) for eight southeastern states. The purpose is objectively to define regions of climatic homogeneity that should perform more robustly in subsequent climatic impact models. This type of analysis has been successfully used in many related climate research problems including the determination of corn/climate districts in Iowa (Ortiz-Valdez, 1985) and the classification of synoptic climate types (Davis, 1988). These general climate clusters may be more appropriate for climate research than the standard climate divisions (CD) groupings of climate stations, which are modifications of the agro-economic United States Department of Agriculture crop reporting districts. Unlike the CD's, these objectively determined climate clusters are not restricted by state borders and thus have reduced multicollinearity which makes them more appropriate for the study of the impact of climate and climatic change.

  14. How can design be a platform for the development of a regional cluster in the Region of Southern Denmark

    DEFF Research Database (Denmark)

    Jensen, Susanne; Christensen, Poul Rind

    2013-01-01

    Analyses of key factors for the emergence of a cluster and the formation of a design cluster in the region of Southern Denmark......Analyses of key factors for the emergence of a cluster and the formation of a design cluster in the region of Southern Denmark...

  15. Avogadro: an advanced semantic chemical editor, visualization, and analysis platform

    Directory of Open Access Journals (Sweden)

    Hanwell Marcus D

    2012-08-01

    Full Text Available Abstract Background The Avogadro project has developed an advanced molecule editor and visualizer designed for cross-platform use in computational chemistry, molecular modeling, bioinformatics, materials science, and related areas. It offers flexible, high quality rendering, and a powerful plugin architecture. Typical uses include building molecular structures, formatting input files, and analyzing output of a wide variety of computational chemistry packages. By using the CML file format as its native document type, Avogadro seeks to enhance the semantic accessibility of chemical data types. Results The work presented here details the Avogadro library, which is a framework providing a code library and application programming interface (API with three-dimensional visualization capabilities; and has direct applications to research and education in the fields of chemistry, physics, materials science, and biology. The Avogadro application provides a rich graphical interface using dynamically loaded plugins through the library itself. The application and library can each be extended by implementing a plugin module in C++ or Python to explore different visualization techniques, build/manipulate molecular structures, and interact with other programs. We describe some example extensions, one which uses a genetic algorithm to find stable crystal structures, and one which interfaces with the PackMol program to create packed, solvated structures for molecular dynamics simulations. The 1.0 release series of Avogadro is the main focus of the results discussed here. Conclusions Avogadro offers a semantic chemical builder and platform for visualization and analysis. For users, it offers an easy-to-use builder, integrated support for downloading from common databases such as PubChem and the Protein Data Bank, extracting chemical data from a wide variety of formats, including computational chemistry output, and native, semantic support for the CML file format

  16. Design and development of a prototype platform for gait analysis

    Science.gov (United States)

    Diffenbaugh, T. E.; Marti, M. A.; Jagani, J.; Garcia, V.; Iliff, G. J.; Phoenix, A.; Woolard, A. G.; Malladi, V. V. N. S.; Bales, D. B.; Tarazaga, P. A.

    2017-04-01

    The field of event classification and localization in building environments using accelerometers has grown significantly due to its implications for energy, security, and emergency protocols. Virginia Tech's Goodwin Hall (VT-GH) provides a robust testbed for such work, but a reduced scale testbed could provide significant benefits by allowing algorithm development to occur in a simplified environment. Environments such as VT-GH have high human traffic that contributes external noise disrupting test signals. This paper presents a design solution through the development of an isolated platform for data collection, portable demonstrations, and the development of localization and classification algorithms. The platform's success was quantified by the resulting transmissibility of external excitation sources, demonstrating the capabilities of the platform to isolate external disturbances while preserving gait information. This platform demonstrates the collection of high-quality gait information in otherwise noisy environments for data collection or demonstration purposes.

  17. Development of small scale cluster computer for numerical analysis

    Science.gov (United States)

    Zulkifli, N. H. N.; Sapit, A.; Mohammed, A. N.

    2017-09-01

    In this study, two units of personal computer were successfully networked together to form a small scale cluster. Each of the processor involved are multicore processor which has four cores in it, thus made this cluster to have eight processors. Here, the cluster incorporate Ubuntu 14.04 LINUX environment with MPI implementation (MPICH2). Two main tests were conducted in order to test the cluster, which is communication test and performance test. The communication test was done to make sure that the computers are able to pass the required information without any problem and were done by using simple MPI Hello Program where the program written in C language. Additional, performance test was also done to prove that this cluster calculation performance is much better than single CPU computer. In this performance test, four tests were done by running the same code by using single node, 2 processors, 4 processors, and 8 processors. The result shows that with additional processors, the time required to solve the problem decrease. Time required for the calculation shorten to half when we double the processors. To conclude, we successfully develop a small scale cluster computer using common hardware which capable of higher computing power when compare to single CPU processor, and this can be beneficial for research that require high computing power especially numerical analysis such as finite element analysis, computational fluid dynamics, and computational physics analysis.

  18. Analysis of the development of cross-platform mobile applications

    OpenAIRE

    Pinedo Escribano, Diego

    2012-01-01

    The development of mobile phone applications is a huge market nowadays. There are many companies investing lot of money to develop successful and profitable applications. The problem emerges when trying to develop an application to be used by every user independently of the platform they are using (Android, iOS, BlackBerry OS, Windows Phone, etc.). For this reason, on the last years many different technologies have appeared that making the development of cross-platform applications easier. In...

  19. Use of offshore mooring platform for sea wave motion analysis

    International Nuclear Information System (INIS)

    Cicconi, G.; Dagnino, I.; Papa, L.

    1979-01-01

    An offshore mooring platform for supertankers may often turn out to be an ideal solution for the problem of installing a meteorological station. Its location may be particularly desirable for the purpose of recording and analysing sea wave motion in deep water or in the intermediate zone between shallow and deep water. The preliminary results obtained through the operation of a subsurface sensor at the mooring platform off the harbour of Genova are reported. (author)

  20. Use of offshore mooring platform for sea wave motion analysis

    Energy Technology Data Exchange (ETDEWEB)

    Cicconi, G.; Dagnino, I.; Papa, L. (Genova Univ. (Italy). Ist. Geofisica e Geodetico); Basano, L.; Ottonello, P. (Genoa Univ. (Italy))

    An offshore mooring platform for supertankers may often turn out to be an ideal solution for the problem of installing a meteorological station. Its location may be particularly desirable for the purpose of recording and analysing sea wave motion in deep water or in the intermediate zone between shallow and deep water. The preliminary results obtained through the operation of a subsurface sensor at the mooring platform off the harbour of Genova are reported.

  1. Predicting healthcare outcomes in prematurely born infants using cluster analysis.

    Science.gov (United States)

    MacBean, Victoria; Lunt, Alan; Drysdale, Simon B; Yarzi, Muska N; Rafferty, Gerrard F; Greenough, Anne

    2018-05-23

    Prematurely born infants are at high risk of respiratory morbidity following neonatal unit discharge, though prediction of outcomes is challenging. We have tested the hypothesis that cluster analysis would identify discrete groups of prematurely born infants with differing respiratory outcomes during infancy. A total of 168 infants (median (IQR) gestational age 33 (31-34) weeks) were recruited in the neonatal period from consecutive births in a tertiary neonatal unit. The baseline characteristics of the infants were used to classify them into hierarchical agglomerative clusters. Rates of viral lower respiratory tract infections (LRTIs) were recorded for 151 infants in the first year after birth. Infants could be classified according to birth weight and duration of neonatal invasive mechanical ventilation (MV) into three clusters. Cluster one (MV ≤5 days) had few LRTIs. Clusters two and three (both MV ≥6 days, but BW ≥or <882 g respectively), had significantly higher LRTI rates. Cluster two had a higher proportion of infants experiencing respiratory syncytial virus LRTIs (P = 0.01) and cluster three a higher proportion of rhinovirus LRTIs (P < 0.001) CONCLUSIONS: Readily available clinical data allowed classification of prematurely born infants into one of three distinct groups with differing subsequent respiratory morbidity in infancy. © 2018 Wiley Periodicals, Inc.

  2. A piezoelectric electrospun platform for in situ cardiomyocyte contraction analysis

    Science.gov (United States)

    Beringer, Laura Toth

    Flexible, self-powered materials are in demand for a multitude of applications such as energy harvesting, robotic devices, and lab-on-a chip medical diagnostics. Lab-on-a-chip materials or cell-based biosensors can provide new diagnostic or therapeutic tools for numerous diseases. This dissertation explores the fabrication and characterization of a cell-based sensor termed a nanogenerator with three major aims. The first aim of this research was to fabricate a piezoelectric material that could act as both a cell scaffold and sensor and characterize the response to cell-scale deformation. Electrospinning piezoelectric fluoropolymers into nanofibers can provide both of these functionalities in a facile method. PVDF-TrFe was electrospun in an aligned format and interfaced with a flexible plastic substrate in order to create a platform for voltage response characterization after small force cantilever deformations. Voltage peak signals were an average of +/- 0.4 V, and this response did not change after platform sterilization. However, when placed in cell culture media, piezoelectric response was dampened, which was taken into consideration for the next two aims. An aligned electrospun coaxial fiber system of PVDF-TrFe and collagen was created and interfaced with the nanogenerator for the second aim in order to provide a more biologically favorable surface for cells to adhere to. These nanogenerators were successfully characterized for their piezoelectric response, which was an average of +/- 0.1 V. Additionally, the aligned coaxial collagen/PVDF-TrFe fibers supported both neuron and HeLa cell attachment and growth, demonstrating that they were not cytotoxic. To assess the potential for the nanogenerators to be used as a contractile analysis lab-on-a-chip based device, HeLa cell contraction was induced with potassium chloride and signal response was analyzed. The nanogenerator system was able to detect both the resting state of HeLa cells, a contraction state, and a

  3. Targeted capture and heterologous expression of the Pseudoalteromonas alterochromide gene cluster in Escherichia coli represents a promising natural product exploratory platform.

    Science.gov (United States)

    Ross, Avena C; Gulland, Lauren E S; Dorrestein, Pieter C; Moore, Bradley S

    2015-04-17

    Marine pseudoalteromonads represent a very promising source of biologically important natural product molecules. To access and exploit the full chemical capacity of these cosmopolitan Gram-(-) bacteria, we sought to apply universal synthetic biology tools to capture, refactor, and express biosynthetic gene clusters for the production of complex organic compounds in reliable host organisms. Here, we report a platform for the capture of proteobacterial gene clusters using a transformation-associated recombination (TAR) strategy coupled with direct pathway manipulation and expression in Escherichia coli. The ~34 kb pathway for production of alterochromide lipopeptides by Pseudoalteromonas piscicida JCM 20779 was captured and heterologously expressed in E. coli utilizing native and E. coli-based T7 promoter sequences. Our approach enabled both facile production of the alterochromides and in vivo interrogation of gene function associated with alterochromide's unusual brominated lipid side chain. This platform represents a simple but effective strategy for the discovery and biosynthetic characterization of natural products from marine proteobacteria.

  4. Demand Analysis of Logistics Information Matching Platform: A Survey from Highway Freight Market in Zhejiang Province

    Science.gov (United States)

    Chen, Daqiang; Shen, Xiahong; Tong, Bing; Zhu, Xiaoxiao; Feng, Tao

    With the increasing competition in logistics industry and promotion of lower logistics costs requirements, the construction of logistics information matching platform for highway transportation plays an important role, and the accuracy of platform design is the key to successful operation or not. Based on survey results of logistics service providers, customers and regulation authorities to access to information and in-depth information demand analysis of logistics information matching platform for highway transportation in Zhejiang province, a survey analysis for framework of logistics information matching platform for highway transportation is provided.

  5. Cluster analysis of radionuclide concentrations in beach sand

    NARCIS (Netherlands)

    de Meijer, R.J.; James, I.; Jennings, P.J.; Keoyers, J.E.

    This paper presents a method in which natural radionuclide concentrations of beach sand minerals are traced along a stretch of coast by cluster analysis. This analysis yields two groups of mineral deposit with different origins. The method deviates from standard methods of following dispersal of

  6. Principal Component Clustering Approach to Teaching Quality Discriminant Analysis

    Science.gov (United States)

    Xian, Sidong; Xia, Haibo; Yin, Yubo; Zhai, Zhansheng; Shang, Yan

    2016-01-01

    Teaching quality is the lifeline of the higher education. Many universities have made some effective achievement about evaluating the teaching quality. In this paper, we establish the Students' evaluation of teaching (SET) discriminant analysis model and algorithm based on principal component clustering analysis. Additionally, we classify the SET…

  7. Characterizing Suicide in Toronto: An Observational Study and Cluster Analysis

    Science.gov (United States)

    Sinyor, Mark; Schaffer, Ayal; Streiner, David L

    2014-01-01

    Objective: To determine whether people who have died from suicide in a large epidemiologic sample form clusters based on demographic, clinical, and psychosocial factors. Method: We conducted a coroner’s chart review for 2886 people who died in Toronto, Ontario, from 1998 to 2010, and whose death was ruled as suicide by the Office of the Chief Coroner of Ontario. A cluster analysis using known suicide risk factors was performed to determine whether suicide deaths separate into distinct groups. Clusters were compared according to person- and suicide-specific factors. Results: Five clusters emerged. Cluster 1 had the highest proportion of females and nonviolent methods, and all had depression and a past suicide attempt. Cluster 2 had the highest proportion of people with a recent stressor and violent suicide methods, and all were married. Cluster 3 had mostly males between the ages of 20 and 64, and all had either experienced recent stressors, suffered from mental illness, or had a history of substance abuse. Cluster 4 had the youngest people and the highest proportion of deaths by jumping from height, few were married, and nearly one-half had bipolar disorder or schizophrenia. Cluster 5 had all unmarried people with no prior suicide attempts, and were the least likely to have an identified mental illness and most likely to leave a suicide note. Conclusions: People who die from suicide assort into different patterns of demographic, clinical, and death-specific characteristics. Identifying and studying subgroups of suicides may advance our understanding of the heterogeneous nature of suicide and help to inform development of more targeted suicide prevention strategies. PMID:24444321

  8. Pattern recognition in menstrual bleeding diaries by statistical cluster analysis

    Directory of Open Access Journals (Sweden)

    Wessel Jens

    2009-07-01

    Full Text Available Abstract Background The aim of this paper is to empirically identify a treatment-independent statistical method to describe clinically relevant bleeding patterns by using bleeding diaries of clinical studies on various sex hormone containing drugs. Methods We used the four cluster analysis methods single, average and complete linkage as well as the method of Ward for the pattern recognition in menstrual bleeding diaries. The optimal number of clusters was determined using the semi-partial R2, the cubic cluster criterion, the pseudo-F- and the pseudo-t2-statistic. Finally, the interpretability of the results from a gynecological point of view was assessed. Results The method of Ward yielded distinct clusters of the bleeding diaries. The other methods successively chained the observations into one cluster. The optimal number of distinctive bleeding patterns was six. We found two desirable and four undesirable bleeding patterns. Cyclic and non cyclic bleeding patterns were well separated. Conclusion Using this cluster analysis with the method of Ward medications and devices having an impact on bleeding can be easily compared and categorized.

  9. Technology Clusters Exploration for Patent Portfolio through Patent Abstract Analysis

    Directory of Open Access Journals (Sweden)

    Gabjo Kim

    2016-12-01

    Full Text Available This study explores technology clusters through patent analysis. The aim of exploring technology clusters is to grasp competitors’ levels of sustainable research and development (R&D and establish a sustainable strategy for entering an industry. To achieve this, we first grouped the patent documents with similar technologies by applying affinity propagation (AP clustering, which is effective while grouping large amounts of data. Next, in order to define the technology clusters, we adopted the term frequency-inverse document frequency (TF-IDF weight, which lists the terms in order of importance. We collected the patent data of Korean electric car companies from the United States Patent and Trademark Office (USPTO to verify our proposed methodology. As a result, our proposed methodology presents more detailed information on the Korean electric car industry than previous studies.

  10. CLUSTER ANALYSIS UKRAINIAN REGIONAL DISTRIBUTION BY LEVEL OF INNOVATION

    Directory of Open Access Journals (Sweden)

    Roman Shchur

    2016-07-01

    Full Text Available   SWOT-analysis of the threats and benefits of innovation development strategy of Ivano-Frankivsk region in the context of financial support was сonducted. Methodical approach to determine of public-private partnerships potential that is tool of innovative economic development financing was identified. Cluster analysis of possibilities of forming public-private partnership in a particular region was carried out. Optimal set of problem areas that require urgent solutions and financial security is defined on the basis of cluster approach. It will help to form practical recommendations for the formation of an effective financial mechanism in the regions of Ukraine. Key words: the mechanism of innovation development financial provision, innovation development, public-private partnerships, cluster analysis, innovative development strategy.

  11. Gene ARMADA: an integrated multi-analysis platform for microarray data implemented in MATLAB.

    Science.gov (United States)

    Chatziioannou, Aristotelis; Moulos, Panagiotis; Kolisis, Fragiskos N

    2009-10-27

    The microarray data analysis realm is ever growing through the development of various tools, open source and commercial. However there is absence of predefined rational algorithmic analysis workflows or batch standardized processing to incorporate all steps, from raw data import up to the derivation of significantly differentially expressed gene lists. This absence obfuscates the analytical procedure and obstructs the massive comparative processing of genomic microarray datasets. Moreover, the solutions provided, heavily depend on the programming skills of the user, whereas in the case of GUI embedded solutions, they do not provide direct support of various raw image analysis formats or a versatile and simultaneously flexible combination of signal processing methods. We describe here Gene ARMADA (Automated Robust MicroArray Data Analysis), a MATLAB implemented platform with a Graphical User Interface. This suite integrates all steps of microarray data analysis including automated data import, noise correction and filtering, normalization, statistical selection of differentially expressed genes, clustering, classification and annotation. In its current version, Gene ARMADA fully supports 2 coloured cDNA and Affymetrix oligonucleotide arrays, plus custom arrays for which experimental details are given in tabular form (Excel spreadsheet, comma separated values, tab-delimited text formats). It also supports the analysis of already processed results through its versatile import editor. Besides being fully automated, Gene ARMADA incorporates numerous functionalities of the Statistics and Bioinformatics Toolboxes of MATLAB. In addition, it provides numerous visualization and exploration tools plus customizable export data formats for seamless integration by other analysis tools or MATLAB, for further processing. Gene ARMADA requires MATLAB 7.4 (R2007a) or higher and is also distributed as a stand-alone application with MATLAB Component Runtime. Gene ARMADA provides a

  12. Fatigue analysis of assembled marine floating platform for special purposes under complex water environments

    Science.gov (United States)

    Ma, Guang-ying; Yao, Yun-long

    2018-03-01

    In this paper, the fatigue lives of a new type of assembled marine floating platform for special purposes were studied. Firstly, by using ANSYS AQWA software, the hydrodynamic model of the platform was established. Secondly, the structural stresses under alternating change loads were calculated under complex water environments, such as wind, wave, current and ice. The minimum fatigue lives were obtained under different working conditions. The analysis results showed that the fatigue life of the platform structure can meet the requirements

  13. Cluster-based analysis of multi-model climate ensembles

    Science.gov (United States)

    Hyde, Richard; Hossaini, Ryan; Leeson, Amber A.

    2018-06-01

    Clustering - the automated grouping of similar data - can provide powerful and unique insight into large and complex data sets, in a fast and computationally efficient manner. While clustering has been used in a variety of fields (from medical image processing to economics), its application within atmospheric science has been fairly limited to date, and the potential benefits of the application of advanced clustering techniques to climate data (both model output and observations) has yet to be fully realised. In this paper, we explore the specific application of clustering to a multi-model climate ensemble. We hypothesise that clustering techniques can provide (a) a flexible, data-driven method of testing model-observation agreement and (b) a mechanism with which to identify model development priorities. We focus our analysis on chemistry-climate model (CCM) output of tropospheric ozone - an important greenhouse gas - from the recent Atmospheric Chemistry and Climate Model Intercomparison Project (ACCMIP). Tropospheric column ozone from the ACCMIP ensemble was clustered using the Data Density based Clustering (DDC) algorithm. We find that a multi-model mean (MMM) calculated using members of the most-populous cluster identified at each location offers a reduction of up to ˜ 20 % in the global absolute mean bias between the MMM and an observed satellite-based tropospheric ozone climatology, with respect to a simple, all-model MMM. On a spatial basis, the bias is reduced at ˜ 62 % of all locations, with the largest bias reductions occurring in the Northern Hemisphere - where ozone concentrations are relatively large. However, the bias is unchanged at 9 % of all locations and increases at 29 %, particularly in the Southern Hemisphere. The latter demonstrates that although cluster-based subsampling acts to remove outlier model data, such data may in fact be closer to observed values in some locations. We further demonstrate that clustering can provide a viable and

  14. AZTLAN platform: Mexican platform for analysis and design of nuclear reactors; AZTLAN platform: plataforma mexicana para el analisis y diseno de reactores nucleares

    Energy Technology Data Exchange (ETDEWEB)

    Gomez T, A. M.; Puente E, F. [ININ, Carretera Mexico-Toluca s/n, 52750 Ocoyoacac, Estado de Mexico (Mexico); Del Valle G, E. [IPN, Escuela Superior de Fisica y Matematicas, Av. IPN s/n, Edif. 9, Col. San Pedro Zacatenco, 07738 Mexico D. F. (Mexico); Francois L, J. L.; Martin del Campo M, C. [UNAM, Facultad de Ingenieria, Departamento de Sistemas Energeticos, Paseo Cuauhnahuac 8532, 62550 Jiutepec, Morelos (Mexico); Espinosa P, G., E-mail: armando.gomez@inin.gob.mx [Universidad Autonoma Metropolitana, Unidad Iztapalapa, Av. San Rafael Atlixco No. 186, Col. Vicentina, 09340 Mexico D. F. (Mexico)

    2014-10-15

    The Aztlan platform Project is a national initiative led by the Instituto Nacional de Investigaciones Nucleares (ININ) which brings together the main public houses of higher studies in Mexico, such as: Instituto Politecnico Nacional, Universidad Nacional Autonoma de Mexico and Universidad Autonoma Metropolitana in an effort to take a significant step toward the calculation autonomy and analysis that seeks to place Mexico in the medium term in a competitive international level on software issues for analysis of nuclear reactors. This project aims to modernize, improve and integrate the neutron, thermal-hydraulic and thermo-mechanical codes, developed in Mexican institutions, within an integrated platform, developed and maintained by Mexican experts to benefit from the same institutions. This project is financed by the mixed fund SENER-CONACYT of Energy Sustain ability, and aims to strengthen substantially to research institutions, such as educational institutions contributing to the formation of highly qualified human resources in the area of analysis and design of nuclear reactors. As innovative part the project includes the creation of a user group, made up of members of the project institutions as well as the Comision Nacional de Seguridad Nuclear y Salvaguardias, Central Nucleoelectrica de Laguna Verde (CNLV), Secretaria de Energia (Mexico) and Karlsruhe Institute of Technology (Germany) among others. This user group will be responsible for using the software and provide feedback to the development equipment in order that progress meets the needs of the regulator and industry; in this case the CNLV. Finally, in order to bridge the gap between similar developments globally, they will make use of the latest super computing technology to speed up calculation times. This work intends to present to national nuclear community the project, so a description of the proposed methodology is given, as well as the goals and objectives to be pursued for the development of the

  15. An Analysis of Business Models in Public Service Platforms

    DEFF Research Database (Denmark)

    Ranerup, Agneta; Zinner Henriksen, Helle; Hedman, Jonas

    2016-01-01

    Public Service Platforms (PSPs) are a new type of technology platform. They are based in the philosophy of New Public Management (NPM) and support public services for citizens in quasi-markets. This article increases our understanding of the business models behind these PSPs in terms of their Value......” that includes dialogues, user evaluations, long-term perspectives on choice, promotion of the ideal of choice, and self-promotion by public agencies. The article contributes to research with its empirical example of the digitalization of NPM and the underlying business logic of PSPs....

  16. Cluster Analysis as an Analytical Tool of Population Policy

    Directory of Open Access Journals (Sweden)

    Oksana Mikhaylovna Shubat

    2017-12-01

    Full Text Available The predicted negative trends in Russian demography (falling birth rates, population decline actualize the need to strengthen measures of family and population policy. Our research purpose is to identify groups of Russian regions with similar characteristics in the family sphere using cluster analysis. The findings should make an important contribution to the field of family policy. We used hierarchical cluster analysis based on the Ward method and the Euclidean distance for segmentation of Russian regions. Clustering is based on four variables, which allowed assessing the family institution in the region. The authors used the data of Federal State Statistics Service from 2010 to 2015. Clustering and profiling of each segment has allowed forming a model of Russian regions depending on the features of the family institution in these regions. The authors revealed four clusters grouping regions with similar problems in the family sphere. This segmentation makes it possible to develop the most relevant family policy measures in each group of regions. Thus, the analysis has shown a high degree of differentiation of the family institution in the regions. This suggests that a unified approach to population problems’ solving is far from being effective. To achieve greater results in the implementation of family policy, a differentiated approach is needed. Methods of multidimensional data classification can be successfully applied as a relevant analytical toolkit. Further research could develop the adaptation of multidimensional classification methods to the analysis of the population problems in Russian regions. In particular, the algorithms of nonparametric cluster analysis may be of relevance in future studies.

  17. Automated analysis of organic particles using cluster SIMS

    Energy Technology Data Exchange (ETDEWEB)

    Gillen, Greg; Zeissler, Cindy; Mahoney, Christine; Lindstrom, Abigail; Fletcher, Robert; Chi, Peter; Verkouteren, Jennifer; Bright, David; Lareau, Richard T.; Boldman, Mike

    2004-06-15

    Cluster primary ion bombardment combined with secondary ion imaging is used on an ion microscope secondary ion mass spectrometer for the spatially resolved analysis of organic particles on various surfaces. Compared to the use of monoatomic primary ion beam bombardment, the use of a cluster primary ion beam (SF{sub 5}{sup +} or C{sub 8}{sup -}) provides significant improvement in molecular ion yields and a reduction in beam-induced degradation of the analyte molecules. These characteristics of cluster bombardment, along with automated sample stage control and custom image analysis software are utilized to rapidly characterize the spatial distribution of trace explosive particles, narcotics and inkjet-printed microarrays on a variety of surfaces.

  18. A Platform for Scalable Satellite and Geospatial Data Analysis

    Science.gov (United States)

    Beneke, C. M.; Skillman, S.; Warren, M. S.; Kelton, T.; Brumby, S. P.; Chartrand, R.; Mathis, M.

    2017-12-01

    At Descartes Labs, we use the commercial cloud to run global-scale machine learning applications over satellite imagery. We have processed over 5 Petabytes of public and commercial satellite imagery, including the full Landsat and Sentinel archives. By combining open-source tools with a FUSE-based filesystem for cloud storage, we have enabled a scalable compute platform that has demonstrated reading over 200 GB/s of satellite imagery into cloud compute nodes. In one application, we generated global 15m Landsat-8, 20m Sentinel-1, and 10m Sentinel-2 composites from 15 trillion pixels, using over 10,000 CPUs. We recently created a public open-source Python client library that can be used to query and access preprocessed public satellite imagery from within our platform, and made this platform available to researchers for non-commercial projects. In this session, we will describe how you can use the Descartes Labs Platform for rapid prototyping and scaling of geospatial analyses and demonstrate examples in land cover classification.

  19. Assessment of surface water quality using hierarchical cluster analysis

    Directory of Open Access Journals (Sweden)

    Dheeraj Kumar Dabgerwal

    2016-02-01

    Full Text Available This study was carried out to assess the physicochemical quality river Varuna inVaranasi,India. Water samples were collected from 10 sites during January-June 2015. Pearson correlation analysis was used to assess the direction and strength of relationship between physicochemical parameters. Hierarchical Cluster analysis was also performed to determine the sources of pollution in the river Varuna. The result showed quite high value of DO, Nitrate, BOD, COD and Total Alkalinity, above the BIS permissible limit. The results of correlation analysis identified key water parameters as pH, electrical conductivity, total alkalinity and nitrate, which influence the concentration of other water parameters. Cluster analysis identified three major clusters of sampling sites out of total 10 sites, according to the similarity in water quality. This study illustrated the usefulness of correlation and cluster analysis for getting better information about the river water quality.International Journal of Environment Vol. 5 (1 2016,  pp: 32-44

  20. application of single-linkage clustering method in the analysis of ...

    African Journals Online (AJOL)

    Admin

    ANALYSIS OF GROWTH RATE OF GROSS DOMESTIC PRODUCT. (GDP) AT ... The end result of the algorithm is a tree of clusters called a dendrogram, which shows how the clusters are ..... Number of cluster sum from from observations of ...

  1. Cluster Analysis of Clinical Data Identifies Fibromyalgia Subgroups

    Science.gov (United States)

    Docampo, Elisa; Collado, Antonio; Escaramís, Geòrgia; Carbonell, Jordi; Rivera, Javier; Vidal, Javier; Alegre, José

    2013-01-01

    Introduction Fibromyalgia (FM) is mainly characterized by widespread pain and multiple accompanying symptoms, which hinder FM assessment and management. In order to reduce FM heterogeneity we classified clinical data into simplified dimensions that were used to define FM subgroups. Material and Methods 48 variables were evaluated in 1,446 Spanish FM cases fulfilling 1990 ACR FM criteria. A partitioning analysis was performed to find groups of variables similar to each other. Similarities between variables were identified and the variables were grouped into dimensions. This was performed in a subset of 559 patients, and cross-validated in the remaining 887 patients. For each sample and dimension, a composite index was obtained based on the weights of the variables included in the dimension. Finally, a clustering procedure was applied to the indexes, resulting in FM subgroups. Results Variables clustered into three independent dimensions: “symptomatology”, “comorbidities” and “clinical scales”. Only the two first dimensions were considered for the construction of FM subgroups. Resulting scores classified FM samples into three subgroups: low symptomatology and comorbidities (Cluster 1), high symptomatology and comorbidities (Cluster 2), and high symptomatology but low comorbidities (Cluster 3), showing differences in measures of disease severity. Conclusions We have identified three subgroups of FM samples in a large cohort of FM by clustering clinical data. Our analysis stresses the importance of family and personal history of FM comorbidities. Also, the resulting patient clusters could indicate different forms of the disease, relevant to future research, and might have an impact on clinical assessment. PMID:24098674

  2. Transcriptional analysis of ESAT-6 cluster 3 in Mycobacterium smegmatis

    Directory of Open Access Journals (Sweden)

    Riccardi Giovanna

    2009-03-01

    Full Text Available Abstract Background The ESAT-6 (early secreted antigenic target, 6 kDa family collects small mycobacterial proteins secreted by Mycobacterium tuberculosis, particularly in the early phase of growth. There are 23 ESAT-6 family members in M. tuberculosis H37Rv. In a previous work, we identified the Zur- dependent regulation of five proteins of the ESAT-6/CFP-10 family (esxG, esxH, esxQ, esxR, and esxS. esxG and esxH are part of ESAT-6 cluster 3, whose expression was already known to be induced by iron starvation. Results In this research, we performed EMSA experiments and transcriptional analysis of ESAT-6 cluster 3 in Mycobacterium smegmatis (msmeg0615-msmeg0625 and M. tuberculosis. In contrast to what we had observed in M. tuberculosis, we found that in M. smegmatis ESAT-6 cluster 3 responds only to iron and not to zinc. In both organisms we identified an internal promoter, a finding which suggests the presence of two transcriptional units and, by consequence, a differential expression of cluster 3 genes. We compared the expression of msmeg0615 and msmeg0620 in different growth and stress conditions by means of relative quantitative PCR. The expression of msmeg0615 and msmeg0620 genes was essentially similar; they appeared to be repressed in most of the tested conditions, with the exception of acid stress (pH 4.2 where msmeg0615 was about 4-fold induced, while msmeg0620 was repressed. Analysis revealed that in acid stress conditions M. tuberculosis rv0282 gene was 3-fold induced too, while rv0287 induction was almost insignificant. Conclusion In contrast with what has been reported for M. tuberculosis, our results suggest that in M. smegmatis only IdeR-dependent regulation is retained, while zinc has no effect on gene expression. The role of cluster 3 in M. tuberculosis virulence is still to be defined; however, iron- and zinc-dependent expression strongly suggests that cluster 3 is highly expressed in the infective process, and that the cluster

  3. Cell surface profiling using high-throughput flow cytometry: a platform for biomarker discovery and analysis of cellular heterogeneity.

    Directory of Open Access Journals (Sweden)

    Craig A Gedye

    Full Text Available Cell surface proteins have a wide range of biological functions, and are often used as lineage-specific markers. Antibodies that recognize cell surface antigens are widely used as research tools, diagnostic markers, and even therapeutic agents. The ability to obtain broad cell surface protein profiles would thus be of great value in a wide range of fields. There are however currently few available methods for high-throughput analysis of large numbers of cell surface proteins. We describe here a high-throughput flow cytometry (HT-FC platform for rapid analysis of 363 cell surface antigens. Here we demonstrate that HT-FC provides reproducible results, and use the platform to identify cell surface antigens that are influenced by common cell preparation methods. We show that multiple populations within complex samples such as primary tumors can be simultaneously analyzed by co-staining of cells with lineage-specific antibodies, allowing unprecedented depth of analysis of heterogeneous cell populations. Furthermore, standard informatics methods can be used to visualize, cluster and downsample HT-FC data to reveal novel signatures and biomarkers. We show that the cell surface profile provides sufficient molecular information to classify samples from different cancers and tissue types into biologically relevant clusters using unsupervised hierarchical clustering. Finally, we describe the identification of a candidate lineage marker and its subsequent validation. In summary, HT-FC combines the advantages of a high-throughput screen with a detection method that is sensitive, quantitative, highly reproducible, and allows in-depth analysis of heterogeneous samples. The use of commercially available antibodies means that high quality reagents are immediately available for follow-up studies. HT-FC has a wide range of applications, including biomarker discovery, molecular classification of cancers, or identification of novel lineage specific or stem cell

  4. Cell surface profiling using high-throughput flow cytometry: a platform for biomarker discovery and analysis of cellular heterogeneity.

    Science.gov (United States)

    Gedye, Craig A; Hussain, Ali; Paterson, Joshua; Smrke, Alannah; Saini, Harleen; Sirskyj, Danylo; Pereira, Keira; Lobo, Nazleen; Stewart, Jocelyn; Go, Christopher; Ho, Jenny; Medrano, Mauricio; Hyatt, Elzbieta; Yuan, Julie; Lauriault, Stevan; Meyer, Mona; Kondratyev, Maria; van den Beucken, Twan; Jewett, Michael; Dirks, Peter; Guidos, Cynthia J; Danska, Jayne; Wang, Jean; Wouters, Bradly; Neel, Benjamin; Rottapel, Robert; Ailles, Laurie E

    2014-01-01

    Cell surface proteins have a wide range of biological functions, and are often used as lineage-specific markers. Antibodies that recognize cell surface antigens are widely used as research tools, diagnostic markers, and even therapeutic agents. The ability to obtain broad cell surface protein profiles would thus be of great value in a wide range of fields. There are however currently few available methods for high-throughput analysis of large numbers of cell surface proteins. We describe here a high-throughput flow cytometry (HT-FC) platform for rapid analysis of 363 cell surface antigens. Here we demonstrate that HT-FC provides reproducible results, and use the platform to identify cell surface antigens that are influenced by common cell preparation methods. We show that multiple populations within complex samples such as primary tumors can be simultaneously analyzed by co-staining of cells with lineage-specific antibodies, allowing unprecedented depth of analysis of heterogeneous cell populations. Furthermore, standard informatics methods can be used to visualize, cluster and downsample HT-FC data to reveal novel signatures and biomarkers. We show that the cell surface profile provides sufficient molecular information to classify samples from different cancers and tissue types into biologically relevant clusters using unsupervised hierarchical clustering. Finally, we describe the identification of a candidate lineage marker and its subsequent validation. In summary, HT-FC combines the advantages of a high-throughput screen with a detection method that is sensitive, quantitative, highly reproducible, and allows in-depth analysis of heterogeneous samples. The use of commercially available antibodies means that high quality reagents are immediately available for follow-up studies. HT-FC has a wide range of applications, including biomarker discovery, molecular classification of cancers, or identification of novel lineage specific or stem cell markers.

  5. Graph analysis of cell clusters forming vascular networks

    Science.gov (United States)

    Alves, A. P.; Mesquita, O. N.; Gómez-Gardeñes, J.; Agero, U.

    2018-03-01

    This manuscript describes the experimental observation of vasculogenesis in chick embryos by means of network analysis. The formation of the vascular network was observed in the area opaca of embryos from 40 to 55 h of development. In the area opaca endothelial cell clusters self-organize as a primitive and approximately regular network of capillaries. The process was observed by bright-field microscopy in control embryos and in embryos treated with Bevacizumab (Avastin), an antibody that inhibits the signalling of the vascular endothelial growth factor (VEGF). The sequence of images of the vascular growth were thresholded, and used to quantify the forming network in control and Avastin-treated embryos. This characterization is made by measuring vessels density, number of cell clusters and the largest cluster density. From the original images, the topology of the vascular network was extracted and characterized by means of the usual network metrics such as: the degree distribution, average clustering coefficient, average short path length and assortativity, among others. This analysis allows to monitor how the largest connected cluster of the vascular network evolves in time and provides with quantitative evidence of the disruptive effects that Avastin has on the tree structure of vascular networks.

  6. clusters

    Indian Academy of Sciences (India)

    2017-09-27

    Sep 27, 2017 ... Author for correspondence (zh4403701@126.com). MS received 15 ... lic clusters using density functional theory (DFT)-GGA of the DMOL3 package. ... In the process of geometric optimization, con- vergence thresholds ..... and Postgraduate Research & Practice Innovation Program of. Jiangsu Province ...

  7. clusters

    Indian Academy of Sciences (India)

    environmental as well as technical problems during fuel gas utilization. ... adsorption on some alloys of Pd, namely PdAu, PdAg ... ried out on small neutral and charged Au24,26,27, Cu,28 ... study of Zanti et al.29 on Pdn (n = 1–9) clusters.

  8. Cluster Analysis of International Information and Social Development.

    Science.gov (United States)

    Lau, Jesus

    1990-01-01

    Analyzes information activities in relation to socioeconomic characteristics in low, middle, and highly developed economies for the years 1960 and 1977 through the use of cluster analysis. Results of data from 31 countries suggest that information development is achieved mainly by countries that have also achieved social development. (26…

  9. Making Sense of Cluster Analysis: Revelations from Pakistani Science Classes

    Science.gov (United States)

    Pell, Tony; Hargreaves, Linda

    2011-01-01

    Cluster analysis has been applied to quantitative data in educational research over several decades and has been a feature of the Maurice Galton's research in primary and secondary classrooms. It has offered potentially useful insights for teaching yet its implications for practice are rarely implemented. It has been subject also to negative…

  10. Cluster analysis for validated climatology stations using precipitation in Mexico

    NARCIS (Netherlands)

    Bravo Cabrera, J. L.; Azpra-Romero, E.; Zarraluqui-Such, V.; Gay-García, C.; Estrada Porrúa, F.

    2012-01-01

    Annual average of daily precipitation was used to group climatological stations into clusters using the k-means procedure and principal component analysis with varimax rotation. After a careful selection of the stations deployed in Mexico since 1950, we selected 349 characterized by having 35 to 40

  11. A Cluster Analysis of Personality Style in Adults with ADHD

    Science.gov (United States)

    Robin, Arthur L.; Tzelepis, Angela; Bedway, Marquita

    2008-01-01

    Objective: The purpose of this study was to use hierarchical linear cluster analysis to examine the normative personality styles of adults with ADHD. Method: A total of 311 adults with ADHD completed the Millon Index of Personality Styles, which consists of 24 scales assessing motivating aims, cognitive modes, and interpersonal behaviors. Results:…

  12. Characterization of population exposure to organochlorines: A cluster analysis application

    NARCIS (Netherlands)

    R.M. Guimarães (Raphael Mendonça); S. Asmus (Sven); A. Burdorf (Alex)

    2013-01-01

    textabstractThis study aimed to show the results from a cluster analysis application in the characterization of population exposure to organochlorines through variables related to time and exposure dose. Characteristics of 354 subjects in a population exposed to organochlorine pesticides residues

  13. Robustness in cluster analysis in the presence of anomalous observations

    NARCIS (Netherlands)

    Zhuk, EE

    Cluster analysis of multivariate observations in the presence of "outliers" (anomalous observations) in a sample is studied. The expected (mean) fraction of erroneous decisions for the decision rule is computed analytically by minimizing the intraclass scatter. A robust decision rule (stable to

  14. Language Learner Motivational Types: A Cluster Analysis Study

    Science.gov (United States)

    Papi, Mostafa; Teimouri, Yasser

    2014-01-01

    The study aimed to identify different second language (L2) learner motivational types drawing on the framework of the L2 motivational self system. A total of 1,278 secondary school students learning English in Iran completed a questionnaire survey. Cluster analysis yielded five different groups based on the strength of different variables within…

  15. Cluster analysis as a prediction tool for pregnancy outcomes.

    Science.gov (United States)

    Banjari, Ines; Kenjerić, Daniela; Šolić, Krešimir; Mandić, Milena L

    2015-03-01

    Considering specific physiology changes during gestation and thinking of pregnancy as a "critical window", classification of pregnant women at early pregnancy can be considered as crucial. The paper demonstrates the use of a method based on an approach from intelligent data mining, cluster analysis. Cluster analysis method is a statistical method which makes possible to group individuals based on sets of identifying variables. The method was chosen in order to determine possibility for classification of pregnant women at early pregnancy to analyze unknown correlations between different variables so that the certain outcomes could be predicted. 222 pregnant women from two general obstetric offices' were recruited. The main orient was set on characteristics of these pregnant women: their age, pre-pregnancy body mass index (BMI) and haemoglobin value. Cluster analysis gained a 94.1% classification accuracy rate with three branch- es or groups of pregnant women showing statistically significant correlations with pregnancy outcomes. The results are showing that pregnant women both of older age and higher pre-pregnancy BMI have a significantly higher incidence of delivering baby of higher birth weight but they gain significantly less weight during pregnancy. Their babies are also longer, and these women have significantly higher probability for complications during pregnancy (gestosis) and higher probability of induced or caesarean delivery. We can conclude that the cluster analysis method can appropriately classify pregnant women at early pregnancy to predict certain outcomes.

  16. Performance Analysis of Unsupervised Clustering Methods for Brain Tumor Segmentation

    Directory of Open Access Journals (Sweden)

    Tushar H Jaware

    2013-10-01

    Full Text Available Medical image processing is the most challenging and emerging field of neuroscience. The ultimate goal of medical image analysis in brain MRI is to extract important clinical features that would improve methods of diagnosis & treatment of disease. This paper focuses on methods to detect & extract brain tumour from brain MR images. MATLAB is used to design, software tool for locating brain tumor, based on unsupervised clustering methods. K-Means clustering algorithm is implemented & tested on data base of 30 images. Performance evolution of unsupervised clusteringmethods is presented.

  17. Identifying clinical course patterns in SMS data using cluster analysis

    DEFF Research Database (Denmark)

    Kent, Peter; Kongsted, Alice

    2012-01-01

    ABSTRACT: BACKGROUND: Recently, there has been interest in using the short message service (SMS or text messaging), to gather frequent information on the clinical course of individual patients. One possible role for identifying clinical course patterns is to assist in exploring clinically important...... showed that clinical course patterns can be identified by cluster analysis using all SMS time points as cluster variables. This method is simple, intuitive and does not require a high level of statistical skill. However, there are alternative ways of managing SMS data and many different methods...

  18. Outcome-Driven Cluster Analysis with Application to Microarray Data.

    Directory of Open Access Journals (Sweden)

    Jessie J Hsu

    Full Text Available One goal of cluster analysis is to sort characteristics into groups (clusters so that those in the same group are more highly correlated to each other than they are to those in other groups. An example is the search for groups of genes whose expression of RNA is correlated in a population of patients. These genes would be of greater interest if their common level of RNA expression were additionally predictive of the clinical outcome. This issue arose in the context of a study of trauma patients on whom RNA samples were available. The question of interest was whether there were groups of genes that were behaving similarly, and whether each gene in the cluster would have a similar effect on who would recover. For this, we develop an algorithm to simultaneously assign characteristics (genes into groups of highly correlated genes that have the same effect on the outcome (recovery. We propose a random effects model where the genes within each group (cluster equal the sum of a random effect, specific to the observation and cluster, and an independent error term. The outcome variable is a linear combination of the random effects of each cluster. To fit the model, we implement a Markov chain Monte Carlo algorithm based on the likelihood of the observed data. We evaluate the effect of including outcome in the model through simulation studies and describe a strategy for prediction. These methods are applied to trauma data from the Inflammation and Host Response to Injury research program, revealing a clustering of the genes that are informed by the recovery outcome.

  19. A portable chemotaxis platform for short and long term analysis.

    Directory of Open Access Journals (Sweden)

    Chenjie Xu

    Full Text Available Flow-based microfluidic systems have been widely utilized for cell migration studies given their ability to generate versatile and precisely defined chemical gradients and to permit direct visualization of migrating cells. Nonetheless, the general need for bulky peripherals such as mechanical pumps and tubing and the complicated setup procedures significantly limit the widespread use of these microfluidic systems for cell migration studies. Here we present a simple method to power microfluidic devices for chemotaxis assays using the commercially available ALZET® osmotic pumps. Specifically, we developed a standalone chemotaxis platform that has the same footprint as a multiwell plate and can generate well-defined, stable chemical gradients continuously for up to 7 days. Using this platform, we validated the short-term (24 hours and long-term (72 hours concentration dependent PDGF-BB chemotaxis response of human bone marrow derived mesenchymal stem cells.

  20. A supply sided analysis of leading MOOC platforms and universities

    Directory of Open Access Journals (Sweden)

    Georg Peters

    2016-03-01

    Full Text Available Investing in education is generally considered as a promising strategy to fight poverty and increase prosperity. This applies to all levels of an economy reaching from individuals to local communities and countries and has a global perspective as well. However, high-quality education is often costly and not available anytime anywhere. Therefore, any promising concept that might help to democratize education is worth pursuing, in a sense that it makes education accessible for everybody without any restrictions. The characteristics attributed to MOOC – Massive Open Online Courses are promising to contribute to this objective. Hence, our objective is to analyse MOOC as it currently operates. Obviously, there is a huge demand for free high-quality education anytime anywhere but a shortage on the supply side. So, we will concentrate on supply-sided effects and study MOOC platforms as well as content providers, particularly universities. We focus our research on some of the leading platforms and universities worldwide. Relative to their size Australia and the Netherlands are very active players in the MOOC sector. Germany is lagging behind and leading universities in the UK seem to virtually refrain from offering MOOC. Our research also shows the leading role of US universities and platform providers.

  1. High-dimensional cluster analysis with the Masked EM Algorithm

    Science.gov (United States)

    Kadir, Shabnam N.; Goodman, Dan F. M.; Harris, Kenneth D.

    2014-01-01

    Cluster analysis faces two problems in high dimensions: first, the “curse of dimensionality” that can lead to overfitting and poor generalization performance; and second, the sheer time taken for conventional algorithms to process large amounts of high-dimensional data. We describe a solution to these problems, designed for the application of “spike sorting” for next-generation high channel-count neural probes. In this problem, only a small subset of features provide information about the cluster member-ship of any one data vector, but this informative feature subset is not the same for all data points, rendering classical feature selection ineffective. We introduce a “Masked EM” algorithm that allows accurate and time-efficient clustering of up to millions of points in thousands of dimensions. We demonstrate its applicability to synthetic data, and to real-world high-channel-count spike sorting data. PMID:25149694

  2. A cluster analysis investigation of workaholism as a syndrome.

    Science.gov (United States)

    Aziz, Shahnaz; Zickar, Michael J

    2006-01-01

    Workaholism has been conceptualized as a syndrome although there have been few tests that explicitly consider its syndrome status. The authors analyzed a three-dimensional scale of workaholism developed by Spence and Robbins (1992) using cluster analysis. The authors identified three clusters of individuals, one of which corresponded to Spence and Robbins's profile of the workaholic (high work involvement, high drive to work, low work enjoyment). Consistent with previously conjectured relations with workaholism, individuals in the workaholic cluster were more likely to label themselves as workaholics, more likely to have acquaintances label them as workaholics, and more likely to have lower life satisfaction and higher work-life imbalance. The importance of considering workaholism as a syndrome and the implications for effective interventions are discussed. Copyright 2006 APA.

  3. Cosmological analysis of galaxy clusters surveys in X-rays

    International Nuclear Information System (INIS)

    Clerc, N.

    2012-01-01

    Clusters of galaxies are the most massive objects in equilibrium in our Universe. Their study allows to test cosmological scenarios of structure formation with precision, bringing constraints complementary to those stemming from the cosmological background radiation, supernovae or galaxies. They are identified through the X-ray emission of their heated gas, thus facilitating their mapping at different epochs of the Universe. This report presents two surveys of galaxy clusters detected in X-rays and puts forward a method for their cosmological interpretation. Thanks to its multi-wavelength coverage extending over 10 sq. deg. and after one decade of expertise, the XMM-LSS allows a systematic census of clusters in a large volume of the Universe. In the framework of this survey, the first part of this report describes the techniques developed to the purpose of characterizing the detected objects. A particular emphasis is placed on the most distant ones (z ≥ 1) through the complementarity of observations in X-ray, optical and infrared bands. Then the X-CLASS survey is fully described. Based on XMM archival data, it provides a new catalogue of 800 clusters detected in X-rays. A cosmological analysis of this survey is performed thanks to 'CR-HR' diagrams. This new method self-consistently includes selection effects and scaling relations and provides a means to bypass the computation of individual cluster masses. Propositions are made for applying this method to future surveys as XMM-XXL and eRosita. (author) [fr

  4. Cluster analysis by optimal decomposition of induced fuzzy sets

    Energy Technology Data Exchange (ETDEWEB)

    Backer, E

    1978-01-01

    Nonsupervised pattern recognition is addressed and the concept of fuzzy sets is explored in order to provide the investigator (data analyst) additional information supplied by the pattern class membership values apart from the classical pattern class assignments. The basic ideas behind the pattern recognition problem, the clustering problem, and the concept of fuzzy sets in cluster analysis are discussed, and a brief review of the literature of the fuzzy cluster analysis is given. Some mathematical aspects of fuzzy set theory are briefly discussed; in particular, a measure of fuzziness is suggested. The optimization-clustering problem is characterized. Then the fundamental idea behind affinity decomposition is considered. Next, further analysis takes place with respect to the partitioning-characterization functions. The iterative optimization procedure is then addressed. The reclassification function is investigated and convergence properties are examined. Finally, several experiments in support of the method suggested are described. Four object data sets serve as appropriate test cases. 120 references, 70 figures, 11 tables. (RWR)

  5. Envri Cluster - a Community-Driven Platform of European Environmental Researcher Infrastructures for Providing Common E-Solutions for Earth Science

    Science.gov (United States)

    Asmi, A.; Sorvari, S.; Kutsch, W. L.; Laj, P.

    2017-12-01

    European long-term environmental research infrastructures (often referred as ESFRI RIs) are the core facilities for providing services for scientists in their quest for understanding and predicting the complex Earth system and its functioning that requires long-term efforts to identify environmental changes (trends, thresholds and resilience, interactions and feedbacks). Many of the research infrastructures originally have been developed to respond to the needs of their specific research communities, however, it is clear that strong collaboration among research infrastructures is needed to serve the trans-boundary research requires exploring scientific questions at the intersection of different scientific fields, conducting joint research projects and developing concepts, devices, and methods that can be used to integrate knowledge. European Environmental research infrastructures have already been successfully worked together for many years and have established a cluster - ENVRI cluster - for their collaborative work. ENVRI cluster act as a collaborative platform where the RIs can jointly agree on the common solutions for their operations, draft strategies and policies and share best practices and knowledge. Supporting project for the ENVRI cluster, ENVRIplus project, brings together 21 European research infrastructures and infrastructure networks to work on joint technical solutions, data interoperability, access management, training, strategies and dissemination efforts. ENVRI cluster act as one stop shop for multidisciplinary RI users, other collaborative initiatives, projects and programmes and coordinates and implement jointly agreed RI strategies.

  6. MiSTIC, an integrated platform for the analysis of heterogeneity in large tumour transcriptome datasets.

    Science.gov (United States)

    Lemieux, Sebastien; Sargeant, Tobias; Laperrière, David; Ismail, Houssam; Boucher, Geneviève; Rozendaal, Marieke; Lavallée, Vincent-Philippe; Ashton-Beaucage, Dariel; Wilhelm, Brian; Hébert, Josée; Hilton, Douglas J; Mader, Sylvie; Sauvageau, Guy

    2017-07-27

    Genome-wide transcriptome profiling has enabled non-supervised classification of tumours, revealing different sub-groups characterized by specific gene expression features. However, the biological significance of these subtypes remains for the most part unclear. We describe herein an interactive platform, Minimum Spanning Trees Inferred Clustering (MiSTIC), that integrates the direct visualization and comparison of the gene correlation structure between datasets, the analysis of the molecular causes underlying co-variations in gene expression in cancer samples, and the clinical annotation of tumour sets defined by the combined expression of selected biomarkers. We have used MiSTIC to highlight the roles of specific transcription factors in breast cancer subtype specification, to compare the aspects of tumour heterogeneity targeted by different prognostic signatures, and to highlight biomarker interactions in AML. A version of MiSTIC preloaded with datasets described herein can be accessed through a public web server (http://mistic.iric.ca); in addition, the MiSTIC software package can be obtained (github.com/iric-soft/MiSTIC) for local use with personalized datasets. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  7. Multiscale deep drawing analysis of dual-phase steels using grain cluster-based RGC scheme

    International Nuclear Information System (INIS)

    Tjahjanto, D D; Eisenlohr, P; Roters, F

    2015-01-01

    Multiscale modelling and simulation play an important role in sheet metal forming analysis, since the overall material responses at macroscopic engineering scales, e.g. formability and anisotropy, are strongly influenced by microstructural properties, such as grain size and crystal orientations (texture). In the present report, multiscale analysis on deep drawing of dual-phase steels is performed using an efficient grain cluster-based homogenization scheme.The homogenization scheme, called relaxed grain cluster (RGC), is based on a generalization of the grain cluster concept, where a (representative) volume element consists of p  ×  q  ×  r (hexahedral) grains. In this scheme, variation of the strain or deformation of individual grains is taken into account through the, so-called, interface relaxation, which is formulated within an energy minimization framework. An interfacial penalty term is introduced into the energy minimization framework in order to account for the effects of grain boundaries.The grain cluster-based homogenization scheme has been implemented and incorporated into the advanced material simulation platform DAMASK, which purposes to bridge the macroscale boundary value problems associated with deep drawing analysis to the micromechanical constitutive law, e.g. crystal plasticity model. Standard Lankford anisotropy tests are performed to validate the model parameters prior to the deep drawing analysis. Model predictions for the deep drawing simulations are analyzed and compared to the corresponding experimental data. The result shows that the predictions of the model are in a very good agreement with the experimental measurement. (paper)

  8. Integrated vehicle-based safety systems (IVBSS) : light vehicle platform field operational test data analysis plan.

    Science.gov (United States)

    2009-12-22

    This document presents the University of Michigan Transportation Research Institutes plan to : perform analysis of data collected from the light vehicle platform field operational test of the : Integrated Vehicle-Based Safety Systems (IVBSS) progr...

  9. Integrated vehicle-based safety systems (IVBSS) : heavy truck platform field operational test data analysis plan.

    Science.gov (United States)

    2009-11-23

    This document presents the University of Michigan Transportation Research Institutes plan to perform : analysis of data collected from the heavy truck platform field operational test of the Integrated Vehicle- : Based Safety Systems (IVBSS) progra...

  10. Paleomagnetism.org : An online multi-platform open source environment for paleomagnetic data analysis

    NARCIS (Netherlands)

    Koymans, Mathijs R.; Langereis, C.G.; Pastor-Galán, D.; van Hinsbergen, D.J.J.

    2016-01-01

    This contribution provides an overview of Paleomagnetism.org, an open-source, multi-platform online environment for paleomagnetic data analysis. Paleomagnetism.org provides an interactive environment where paleomagnetic data can be interpreted, evaluated, visualized, and exported. The

  11. DGA Clustering and Analysis: Mastering Modern, Evolving Threats, DGALab

    Directory of Open Access Journals (Sweden)

    Alexander Chailytko

    2016-05-01

    Full Text Available Domain Generation Algorithms (DGA is a basic building block used in almost all modern malware. Malware researchers have attempted to tackle the DGA problem with various tools and techniques, with varying degrees of success. We present a complex solution to populate DGA feed using reversed DGAs, third-party feeds, and a smart DGA extraction and clustering based on emulation of a large number of samples. Smart DGA extraction requires no reverse engineering and works regardless of the DGA type or initialization vector, while enabling a cluster-based analysis. Our method also automatically allows analysis of the whole malware family, specific campaign, etc. We present our system and demonstrate its abilities on more than 20 malware families. This includes showing connections between different campaigns, as well as comparing results. Most importantly, we discuss how to utilize the outcome of the analysis to create smarter protections against similar malware.

  12. Analysis of RXTE data on Clusters of Galaxies

    Science.gov (United States)

    Petrosian, Vahe

    2004-01-01

    This grant provided support for the reduction, analysis and interpretation of of hard X-ray (HXR, for short) observations of the cluster of galaxies RXJO658--5557 scheduled for the week of August 23, 2002 under the RXTE Cycle 7 program (PI Vahe Petrosian, Obs. ID 70165). The goal of the observation was to search for and characterize the shape of the HXR component beyond the well established thermal soft X-ray (SXR) component. Such hard components have been detected in several nearby clusters. distant cluster would provide information on the characteristics of this radiation at a different epoch in the evolution of the imiverse and shed light on its origin. We (Petrosian, 2001) have argued that thermal bremsstrahlung, as proposed earlier, cannot be the mechanism for the production of the HXRs and that the most likely mechanism is Compton upscattering of the cosmic microwave radiation by relativistic electrons which are known to be present in the clusters and be responsible for the observed radio emission. Based on this picture we estimated that this cluster, in spite of its relatively large distance, will have HXR signal comparable to the other nearby ones. The planned observation of a relatively The proposed RXTE observations were carried out and the data have been analyzed. We detect a hard X-ray tail in the spectrum of this cluster with a flux very nearly equal to our predicted value. This has strengthen the case for the Compton scattering model. We intend the data obtained via this observation to be a part of a larger data set. We have identified other clusters of galaxies (in archival RXTE and other instrument data sets) with sufficiently high quality data where we can search for and measure (or at least put meaningful limits) on the strength of the hard component. With these studies we expect to clarify the mechanism for acceleration of particles in the intercluster medium and provide guidance for future observations of this intriguing phenomenon by instrument

  13. Mobility in Europe: Recent Trends from a Cluster Analysis

    Directory of Open Access Journals (Sweden)

    Ioana Manafi

    2017-08-01

    Full Text Available During the past decade, Europe was confronted with major changes and events offering large opportunities for mobility. The EU enlargement process, the EU policies regarding youth, the economic crisis affecting national economies on different levels, political instabilities in some European countries, high rates of unemployment or the increasing number of refugees are only a few of the factors influencing net migration in Europe. Based on a set of socio-economic indicators for EU/EFTA countries and cluster analysis, the paper provides an overview of regional differences across European countries, related to migration magnitude in the identified clusters. The obtained clusters are in accordance with previous studies in migration, and appear stable during the period of 2005-2013, with only some exceptions. The analysis revealed three country clusters: EU/EFTA center-receiving countries, EU/EFTA periphery-sending countries and EU/EFTA outlier countries, the names suggesting not only the geographical position within Europe, but the trends in net migration flows during the years. Therewith, the results provide evidence for the persistence of a movement from periphery to center countries, which is correlated with recent flows of mobility in Europe.

  14. Full text clustering and relationship network analysis of biomedical publications.

    Directory of Open Access Journals (Sweden)

    Renchu Guan

    Full Text Available Rapid developments in the biomedical sciences have increased the demand for automatic clustering of biomedical publications. In contrast to current approaches to text clustering, which focus exclusively on the contents of abstracts, a novel method is proposed for clustering and analysis of complete biomedical article texts. To reduce dimensionality, Cosine Coefficient is used on a sub-space of only two vectors, instead of computing the Euclidean distance within the space of all vectors. Then a strategy and algorithm is introduced for Semi-supervised Affinity Propagation (SSAP to improve analysis efficiency, using biomedical journal names as an evaluation background. Experimental results show that by avoiding high-dimensional sparse matrix computations, SSAP outperforms conventional k-means methods and improves upon the standard Affinity Propagation algorithm. In constructing a directed relationship network and distribution matrix for the clustering results, it can be noted that overlaps in scope and interests among BioMed publications can be easily identified, providing a valuable analytical tool for editors, authors and readers.

  15. The Productivity Analysis of Chennai Automotive Industry Cluster

    Science.gov (United States)

    Bhaskaran, E.

    2014-07-01

    Chennai, also called the Detroit of India, is India's second fastest growing auto market and exports auto components and vehicles to US, Germany, Japan and Brazil. For inclusive growth and sustainable development, 250 auto component industries in Ambattur, Thirumalisai and Thirumudivakkam Industrial Estates located in Chennai have adopted the Cluster Development Approach called Automotive Component Cluster. The objective is to study the Value Chain, Correlation and Data Envelopment Analysis by determining technical efficiency, peer weights, input and output slacks of 100 auto component industries in three estates. The methodology adopted is using Data Envelopment Analysis of Output Oriented Banker Charnes Cooper model by taking net worth, fixed assets, employment as inputs and gross output as outputs. The non-zero represents the weights for efficient clusters. The higher slack obtained reveals the excess net worth, fixed assets, employment and shortage in gross output. To conclude, the variables are highly correlated and the inefficient industries should increase their gross output or decrease the fixed assets or employment. Moreover for sustainable development, the cluster should strengthen infrastructure, technology, procurement, production and marketing interrelationships to decrease costs and to increase productivity and efficiency to compete in the indigenous and export market.

  16. Sirenomelia in Argentina: Prevalence, geographic clusters and temporal trends analysis.

    Science.gov (United States)

    Groisman, Boris; Liascovich, Rosa; Gili, Juan Antonio; Barbero, Pablo; Bidondo, María Paz

    2016-07-01

    Sirenomelia is a severe malformation of the lower body characterized by a single medial lower limb and a variable combination of visceral abnormalities. Given that Sirenomelia is a very rare birth defect, epidemiological studies are scarce. The aim of this study is to evaluate prevalence, geographic clusters and time trends of sirenomelia in Argentina, using data from the National Network of Congenital Anomalies of Argentina (RENAC) from November 2009 until December 2014. This is a descriptive study using data from the RENAC, a hospital-based surveillance system for newborns affected with major morphological congenital anomalies. We calculated sirenomelia prevalence throughout the period, searched for geographical clusters, and evaluated time trends. The prevalence of confirmed cases of sirenomelia throughout the period was 2.35 per 100,000 births. Cluster analysis showed no statistically significant geographical aggregates. Time-trends analysis showed that the prevalence was higher in years 2009 to 2010. The observed prevalence was higher than the observed in previous epidemiological studies in other geographic regions. We observed a likely real increase in the initial period of our study. We used strict diagnostic criteria, excluding cases that only had clinical diagnosis of sirenomelia. Therefore, real prevalence could be even higher. This study did not show any geographic clusters. Because etiology of sirenomelia has not yet been established, studies of epidemiological features of this defect may contribute to define its causes. Birth Defects Research (Part A) 106:604-611, 2016. © 2016 Wiley Periodicals, Inc. © 2016 Wiley Periodicals, Inc.

  17. Transcriptional analysis of exopolysaccharides biosynthesis gene clusters in Lactobacillus plantarum.

    Science.gov (United States)

    Vastano, Valeria; Perrone, Filomena; Marasco, Rosangela; Sacco, Margherita; Muscariello, Lidia

    2016-04-01

    Exopolysaccharides (EPS) from lactic acid bacteria contribute to specific rheology and texture of fermented milk products and find applications also in non-dairy foods and in therapeutics. Recently, four clusters of genes (cps) associated with surface polysaccharide production have been identified in Lactobacillus plantarum WCFS1, a probiotic and food-associated lactobacillus. These clusters are involved in cell surface architecture and probably in release and/or exposure of immunomodulating bacterial molecules. Here we show a transcriptional analysis of these clusters. Indeed, RT-PCR experiments revealed that the cps loci are organized in five operons. Moreover, by reverse transcription-qPCR analysis performed on L. plantarum WCFS1 (wild type) and WCFS1-2 (ΔccpA), we demonstrated that expression of three cps clusters is under the control of the global regulator CcpA. These results, together with the identification of putative CcpA target sequences (catabolite responsive element CRE) in the regulatory region of four out of five transcriptional units, strongly suggest for the first time a role of the master regulator CcpA in EPS gene transcription among lactobacilli.

  18. Full text clustering and relationship network analysis of biomedical publications.

    Science.gov (United States)

    Guan, Renchu; Yang, Chen; Marchese, Maurizio; Liang, Yanchun; Shi, Xiaohu

    2014-01-01

    Rapid developments in the biomedical sciences have increased the demand for automatic clustering of biomedical publications. In contrast to current approaches to text clustering, which focus exclusively on the contents of abstracts, a novel method is proposed for clustering and analysis of complete biomedical article texts. To reduce dimensionality, Cosine Coefficient is used on a sub-space of only two vectors, instead of computing the Euclidean distance within the space of all vectors. Then a strategy and algorithm is introduced for Semi-supervised Affinity Propagation (SSAP) to improve analysis efficiency, using biomedical journal names as an evaluation background. Experimental results show that by avoiding high-dimensional sparse matrix computations, SSAP outperforms conventional k-means methods and improves upon the standard Affinity Propagation algorithm. In constructing a directed relationship network and distribution matrix for the clustering results, it can be noted that overlaps in scope and interests among BioMed publications can be easily identified, providing a valuable analytical tool for editors, authors and readers.

  19. Feasibility study of BES data processing and physics analysis on a PC/Linux platform

    International Nuclear Information System (INIS)

    Rong Gang; He Kanglin; Zhao Jiawei; Heng Yuekun; Zhang Chun

    1999-01-01

    The authors report a feasibility study of off-line BES data processing (data reconstruction and Detector simulation) on a PC/Linux platform and an application of the PC/Linux system in D/Ds physics analysis. The authors compared the results obtained from the PC/Linux with that from HP workstation. It shows that PC/Linux platform can do BES data offline analysis as good as UNIX workstation do, but it is much powerful and economical

  20. Latent cluster analysis of ALS phenotypes identifies prognostically differing groups.

    Directory of Open Access Journals (Sweden)

    Jeban Ganesalingam

    2009-09-01

    Full Text Available Amyotrophic lateral sclerosis (ALS is a degenerative disease predominantly affecting motor neurons and manifesting as several different phenotypes. Whether these phenotypes correspond to different underlying disease processes is unknown. We used latent cluster analysis to identify groupings of clinical variables in an objective and unbiased way to improve phenotyping for clinical and research purposes.Latent class cluster analysis was applied to a large database consisting of 1467 records of people with ALS, using discrete variables which can be readily determined at the first clinic appointment. The model was tested for clinical relevance by survival analysis of the phenotypic groupings using the Kaplan-Meier method.The best model generated five distinct phenotypic classes that strongly predicted survival (p<0.0001. Eight variables were used for the latent class analysis, but a good estimate of the classification could be obtained using just two variables: site of first symptoms (bulbar or limb and time from symptom onset to diagnosis (p<0.00001.The five phenotypic classes identified using latent cluster analysis can predict prognosis. They could be used to stratify patients recruited into clinical trials and generating more homogeneous disease groups for genetic, proteomic and risk factor research.

  1. Analysis and design of energy monitoring platform for smart city

    Science.gov (United States)

    Wang, Hong-xia

    2016-09-01

    The development and utilization of energy has greatly promoted the development and progress of human society. It is the basic material foundation for human survival. City running is bound to consume energy inevitably, but it also brings a lot of waste discharge. In order to speed up the process of smart city, improve the efficiency of energy saving and emission reduction work, maintain the green and livable environment, a comprehensive management platform of energy monitoring for government departments is constructed based on cloud computing technology and 3-tier architecture in this paper. It is assumed that the system will provide scientific guidance for the environment management and decision making in smart city.

  2. The Quantitative Analysis of Chennai Automotive Industry Cluster

    Science.gov (United States)

    Bhaskaran, Ethirajan

    2016-07-01

    Chennai, also called as Detroit of India due to presence of Automotive Industry producing over 40 % of the India's vehicle and components. During 2001-2002, the Automotive Component Industries (ACI) in Ambattur, Thirumalizai and Thirumudivakkam Industrial Estate, Chennai has faced problems on infrastructure, technology, procurement, production and marketing. The objective is to study the Quantitative Performance of Chennai Automotive Industry Cluster before (2001-2002) and after the CDA (2008-2009). The methodology adopted is collection of primary data from 100 ACI using quantitative questionnaire and analyzing using Correlation Analysis (CA), Regression Analysis (RA), Friedman Test (FMT), and Kruskall Wallis Test (KWT).The CA computed for the different set of variables reveals that there is high degree of relationship between the variables studied. The RA models constructed establish the strong relationship between the dependent variable and a host of independent variables. The models proposed here reveal the approximate relationship in a closer form. KWT proves, there is no significant difference between three locations clusters with respect to: Net Profit, Production Cost, Marketing Costs, Procurement Costs and Gross Output. This supports that each location has contributed for development of automobile component cluster uniformly. The FMT proves, there is no significant difference between industrial units in respect of cost like Production, Infrastructure, Technology, Marketing and Net Profit. To conclude, the Automotive Industries have fully utilized the Physical Infrastructure and Centralised Facilities by adopting CDA and now exporting their products to North America, South America, Europe, Australia, Africa and Asia. The value chain analysis models have been implemented in all the cluster units. This Cluster Development Approach (CDA) model can be implemented in industries of under developed and developing countries for cost reduction and productivity

  3. Applications of Cluster Analysis to the Creation of Perfectionism Profiles: A Comparison of two Clustering Approaches

    Directory of Open Access Journals (Sweden)

    Jocelyn H Bolin

    2014-04-01

    Full Text Available Although traditional clustering methods (e.g., K-means have been shown to be useful in the social sciences it is often difficult for such methods to handle situations where clusters in the population overlap or are ambiguous. Fuzzy clustering, a method already recognized in many disciplines, provides a more flexible alternative to these traditional clustering methods. Fuzzy clustering differs from other traditional clustering methods in that it allows for a case to belong to multiple clusters simultaneously. Unfortunately, fuzzy clustering techniques remain relatively unused in the social and behavioral sciences. The purpose of this paper is to introduce fuzzy clustering to these audiences who are currently relatively unfamiliar with the technique. In order to demonstrate the advantages associated with this method, cluster solutions of a common perfectionism measure were created using both fuzzy clustering and K-means clustering, and the results compared. Results of these analyses reveal that different cluster solutions are found by the two methods, and the similarity between the different clustering solutions depends on the amount of cluster overlap allowed for in fuzzy clustering.

  4. Applications of cluster analysis to the creation of perfectionism profiles: a comparison of two clustering approaches.

    Science.gov (United States)

    Bolin, Jocelyn H; Edwards, Julianne M; Finch, W Holmes; Cassady, Jerrell C

    2014-01-01

    Although traditional clustering methods (e.g., K-means) have been shown to be useful in the social sciences it is often difficult for such methods to handle situations where clusters in the population overlap or are ambiguous. Fuzzy clustering, a method already recognized in many disciplines, provides a more flexible alternative to these traditional clustering methods. Fuzzy clustering differs from other traditional clustering methods in that it allows for a case to belong to multiple clusters simultaneously. Unfortunately, fuzzy clustering techniques remain relatively unused in the social and behavioral sciences. The purpose of this paper is to introduce fuzzy clustering to these audiences who are currently relatively unfamiliar with the technique. In order to demonstrate the advantages associated with this method, cluster solutions of a common perfectionism measure were created using both fuzzy clustering and K-means clustering, and the results compared. Results of these analyses reveal that different cluster solutions are found by the two methods, and the similarity between the different clustering solutions depends on the amount of cluster overlap allowed for in fuzzy clustering.

  5. TimesVector: a vectorized clustering approach to the analysis of time series transcriptome data from multiple phenotypes.

    Science.gov (United States)

    Jung, Inuk; Jo, Kyuri; Kang, Hyejin; Ahn, Hongryul; Yu, Youngjae; Kim, Sun

    2017-12-01

    Identifying biologically meaningful gene expression patterns from time series gene expression data is important to understand the underlying biological mechanisms. To identify significantly perturbed gene sets between different phenotypes, analysis of time series transcriptome data requires consideration of time and sample dimensions. Thus, the analysis of such time series data seeks to search gene sets that exhibit similar or different expression patterns between two or more sample conditions, constituting the three-dimensional data, i.e. gene-time-condition. Computational complexity for analyzing such data is very high, compared to the already difficult NP-hard two dimensional biclustering algorithms. Because of this challenge, traditional time series clustering algorithms are designed to capture co-expressed genes with similar expression pattern in two sample conditions. We present a triclustering algorithm, TimesVector, specifically designed for clustering three-dimensional time series data to capture distinctively similar or different gene expression patterns between two or more sample conditions. TimesVector identifies clusters with distinctive expression patterns in three steps: (i) dimension reduction and clustering of time-condition concatenated vectors, (ii) post-processing clusters for detecting similar and distinct expression patterns and (iii) rescuing genes from unclassified clusters. Using four sets of time series gene expression data, generated by both microarray and high throughput sequencing platforms, we demonstrated that TimesVector successfully detected biologically meaningful clusters of high quality. TimesVector improved the clustering quality compared to existing triclustering tools and only TimesVector detected clusters with differential expression patterns across conditions successfully. The TimesVector software is available at http://biohealth.snu.ac.kr/software/TimesVector/. sunkim.bioinfo@snu.ac.kr. Supplementary data are available at

  6. Analysis and experiments of a novel and compact 3-DOF precision positioning platform

    International Nuclear Information System (INIS)

    Huang, Hu; Zhao, Hongwei; Fan, Zunqiang; Zhang, Hui; Ma, Zhichao; Yang, Zhaojun

    2013-01-01

    A novel 3-DOF precision positioning platform with dimensions of 48 mm X 50 mm X 35 mm was designed by integrating piezo actuators and flexure hinges. The platform has a compact structure but it can do high precision positioning in three axes. The dynamic model of the platform in a single direction was established. Stiffness of the flexure hinges and modal characteristics of the flexure hinge mechanism were analyzed by the finite element method. Output displacements of the platform along three axes were forecasted via stiffness analysis. Output performance of the platform in x and y axes with open-loop control as well as the z-axis with closed-loop control was tested and discussed. The preliminary application of the platform in the field of nanoindentation indicates that the designed platform works well during nanoindentation tests, and the closed-loop control ensures the linear displacement output. With suitable control, the platform has the potential to realize different positioning functions under various working conditions.

  7. Statistical analysis of the spatial distribution of galaxies and clusters

    International Nuclear Information System (INIS)

    Cappi, Alberto

    1993-01-01

    This thesis deals with the analysis of the distribution of galaxies and clusters, describing some observational problems and statistical results. First chapter gives a theoretical introduction, aiming to describe the framework of the formation of structures, tracing the history of the Universe from the Planck time, t_p = 10"-"4"3 sec and temperature corresponding to 10"1"9 GeV, to the present epoch. The most usual statistical tools and models of the galaxy distribution, with their advantages and limitations, are described in chapter two. A study of the main observed properties of galaxy clustering, together with a detailed statistical analysis of the effects of selecting galaxies according to apparent magnitude or diameter, is reported in chapter three. Chapter four delineates some properties of groups of galaxies, explaining the reasons of discrepant results on group distributions. Chapter five is a study of the distribution of galaxy clusters, with different statistical tools, like correlations, percolation, void probability function and counts in cells; it is found the same scaling-invariant behaviour of galaxies. Chapter six describes our finding that rich galaxy clusters too belong to the fundamental plane of elliptical galaxies, and gives a discussion of its possible implications. Finally chapter seven reviews the possibilities offered by multi-slit and multi-fibre spectrographs, and I present some observational work on nearby and distant galaxy clusters. In particular, I show the opportunities offered by ongoing surveys of galaxies coupled with multi-object fibre spectrographs, focusing on the ESO Key Programme A galaxy redshift survey in the south galactic pole region to which I collaborate and on MEFOS, a multi-fibre instrument with automatic positioning. Published papers related to the work described in this thesis are reported in the last appendix. (author) [fr

  8. Sensory over responsivity and obsessive compulsive symptoms: A cluster analysis.

    Science.gov (United States)

    Ben-Sasson, Ayelet; Podoly, Tamar Yonit

    2017-02-01

    Several studies have examined the sensory component in Obsesseive Compulsive Disorder (OCD) and described an OCD subtype which has a unique profile, and that Sensory Phenomena (SP) is a significant component of this subtype. SP has some commonalities with Sensory Over Responsivity (SOR) and might be in part a characteristic of this subtype. Although there are some studies that have examined SOR and its relation to Obsessive Compulsive Symptoms (OCS), literature lacks sufficient data on this interplay. First to further examine the correlations between OCS and SOR, and to explore the correlations between SOR modalities (i.e. smell, touch, etc.) and OCS subscales (i.e. washing, ordering, etc.). Second, to investigate the cluster analysis of SOR and OCS dimensions in adults, that is, to classify the sample using the sensory scores to find whether a sensory OCD subtype can be specified. Our third goal was to explore the psychometric features of a new sensory questionnaire: the Sensory Perception Quotient (SPQ). A sample of non clinical adults (n=350) was recruited via e-mail, social media and social networks. Participants completed questionnaires for measuring SOR, OCS, and anxiety. SOR and OCI-F scores were moderately significantly correlated (n=274), significant correlations between all SOR modalities and OCS subscales were found with no specific higher correlation between one modality to one OCS subscale. Cluster analysis revealed four distinct clusters: (1) No OC and SOR symptoms (NONE; n=100), (2) High OC and SOR symptoms (BOTH; n=28), (3) Moderate OC symptoms (OCS; n=63), (4) Moderate SOR symptoms (SOR; n=83). The BOTH cluster had significantly higher anxiety levels than the other clusters, and shared OC subscales scores with the OCS cluster. The BOTH cluster also reported higher SOR scores across tactile, vision, taste and olfactory modalities. The SPQ was found reliable and suitable to detect SOR, the sample SPQ scores was normally distributed (n=350). SOR is a

  9. Analysis of plasmaspheric plumes: CLUSTER and IMAGE observations

    Directory of Open Access Journals (Sweden)

    F. Darrouzet

    2006-07-01

    Full Text Available Plasmaspheric plumes have been routinely observed by CLUSTER and IMAGE. The CLUSTER mission provides high time resolution four-point measurements of the plasmasphere near perigee. Total electron density profiles have been derived from the electron plasma frequency identified by the WHISPER sounder supplemented, in-between soundings, by relative variations of the spacecraft potential measured by the electric field instrument EFW; ion velocity is also measured onboard these satellites. The EUV imager onboard the IMAGE spacecraft provides global images of the plasmasphere with a spatial resolution of 0.1 RE every 10 min; such images acquired near apogee from high above the pole show the geometry of plasmaspheric plumes, their evolution and motion. We present coordinated observations of three plume events and compare CLUSTER in-situ data with global images of the plasmasphere obtained by IMAGE. In particular, we study the geometry and the orientation of plasmaspheric plumes by using four-point analysis methods. We compare several aspects of plume motion as determined by different methods: (i inner and outer plume boundary velocity calculated from time delays of this boundary as observed by the wave experiment WHISPER on the four spacecraft, (ii drift velocity measured by the electron drift instrument EDI onboard CLUSTER and (iii global velocity determined from successive EUV images. These different techniques consistently indicate that plasmaspheric plumes rotate around the Earth, with their foot fully co-rotating, but with their tip rotating slower and moving farther out.

  10. Hazard Analysis for the Pretreatment Engineering Platform (PEP)

    Energy Technology Data Exchange (ETDEWEB)

    Sullivan, Robin S.; Geeting, John GH; Lawrence, Wesley E.; Young, Jonathan

    2008-07-10

    The Pretreatment Engineering Platform (PEP) is designed to perform a demonstration on an engineering scale to confirm the Hanford Waste Treatment Plant Pretreatment Facility (PTF) leaching and filtration process equipment design and sludge treatment process. The system will use scaled prototypic equipment to demonstrate sludge water wash, caustic leaching, oxidative leaching, and filtration. Unit operations to be tested include pumping, solids washing, chemical reagent addition and blending, heating, cooling, leaching, filtration, and filter cleaning. In addition, the PEP will evaluate potential design changes to the ultrafiltration process system equipment to potentially enhance leaching and filtration performance as well as overall pretreatment throughput. The skid-mounted system will be installed and operated in the Processing Development Laboratory-West at Pacific Northwest National Laboratory (PNNL) in Richland, Washington.

  11. ASAP: a web-based platform for the analysis and interactive visualization of single-cell RNA-seq data.

    Science.gov (United States)

    Gardeux, Vincent; David, Fabrice P A; Shajkofci, Adrian; Schwalie, Petra C; Deplancke, Bart

    2017-10-01

    Single-cell RNA-sequencing (scRNA-seq) allows whole transcriptome profiling of thousands of individual cells, enabling the molecular exploration of tissues at the cellular level. Such analytical capacity is of great interest to many research groups in the world, yet these groups often lack the expertise to handle complex scRNA-seq datasets. We developed a fully integrated, web-based platform aimed at the complete analysis of scRNA-seq data post genome alignment: from the parsing, filtering and normalization of the input count data files, to the visual representation of the data, identification of cell clusters, differentially expressed genes (including cluster-specific marker genes), and functional gene set enrichment. This Automated Single-cell Analysis Pipeline (ASAP) combines a wide range of commonly used algorithms with sophisticated visualization tools. Compared with existing scRNA-seq analysis platforms, researchers (including those lacking computational expertise) are able to interact with the data in a straightforward fashion and in real time. Furthermore, given the overlap between scRNA-seq and bulk RNA-seq analysis workflows, ASAP should conceptually be broadly applicable to any RNA-seq dataset. As a validation, we demonstrate how we can use ASAP to simply reproduce the results from a single-cell study of 91 mouse cells involving five distinct cell types. The tool is freely available at asap.epfl.ch and R/Python scripts are available at github.com/DeplanckeLab/ASAP. bart.deplancke@epfl.ch. Supplementary data are available at Bioinformatics online. © The Author(s) 2017. Published by Oxford University Press.

  12. HORIZONTAL BRANCH MORPHOLOGY OF GLOBULAR CLUSTERS: A MULTIVARIATE STATISTICAL ANALYSIS

    International Nuclear Information System (INIS)

    Jogesh Babu, G.; Chattopadhyay, Tanuka; Chattopadhyay, Asis Kumar; Mondal, Saptarshi

    2009-01-01

    The proper interpretation of horizontal branch (HB) morphology is crucial to the understanding of the formation history of stellar populations. In the present study a multivariate analysis is used (principal component analysis) for the selection of appropriate HB morphology parameter, which, in our case, is the logarithm of effective temperature extent of the HB (log T effHB ). Then this parameter is expressed in terms of the most significant observed independent parameters of Galactic globular clusters (GGCs) separately for coherent groups, obtained in a previous work, through a stepwise multiple regression technique. It is found that, metallicity ([Fe/H]), central surface brightness (μ v ), and core radius (r c ) are the significant parameters to explain most of the variations in HB morphology (multiple R 2 ∼ 0.86) for GGC elonging to the bulge/disk while metallicity ([Fe/H]) and absolute magnitude (M v ) are responsible for GGC belonging to the inner halo (multiple R 2 ∼ 0.52). The robustness is tested by taking 1000 bootstrap samples. A cluster analysis is performed for the red giant branch (RGB) stars of the GGC belonging to Galactic inner halo (Cluster 2). A multi-episodic star formation is preferred for RGB stars of GGC belonging to this group. It supports the asymptotic giant branch (AGB) model in three episodes instead of two as suggested by Carretta et al. for halo GGC while AGB model is suggested to be revisited for bulge/disk GGC.

  13. Poisson cluster analysis of cardiac arrest incidence in Columbus, Ohio.

    Science.gov (United States)

    Warden, Craig; Cudnik, Michael T; Sasson, Comilla; Schwartz, Greg; Semple, Hugh

    2012-01-01

    Scarce resources in disease prevention and emergency medical services (EMS) need to be focused on high-risk areas of out-of-hospital cardiac arrest (OHCA). Cluster analysis using geographic information systems (GISs) was used to find these high-risk areas and test potential predictive variables. This was a retrospective cohort analysis of EMS-treated adults with OHCAs occurring in Columbus, Ohio, from April 1, 2004, through March 31, 2009. The OHCAs were aggregated to census tracts and incidence rates were calculated based on their adult populations. Poisson cluster analysis determined significant clusters of high-risk census tracts. Both census tract-level and case-level characteristics were tested for association with high-risk areas by multivariate logistic regression. A total of 2,037 eligible OHCAs occurred within the city limits during the study period. The mean incidence rate was 0.85 OHCAs/1,000 population/year. There were five significant geographic clusters with 76 high-risk census tracts out of the total of 245 census tracts. In the case-level analysis, being in a high-risk cluster was associated with a slightly younger age (-3 years, adjusted odds ratio [OR] 0.99, 95% confidence interval [CI] 0.99-1.00), not being white, non-Hispanic (OR 0.54, 95% CI 0.45-0.64), cardiac arrest occurring at home (OR 1.53, 95% CI 1.23-1.71), and not receiving bystander cardiopulmonary resuscitation (CPR) (OR 0.77, 95% CI 0.62-0.96), but with higher survival to hospital discharge (OR 1.78, 95% CI 1.30-2.46). In the census tract-level analysis, high-risk census tracts were also associated with a slightly lower average age (-0.1 years, OR 1.14, 95% CI 1.06-1.22) and a lower proportion of white, non-Hispanic patients (-0.298, OR 0.04, 95% CI 0.01-0.19), but also a lower proportion of high-school graduates (-0.184, OR 0.00, 95% CI 0.00-0.00). This analysis identified high-risk census tracts and associated census tract-level and case-level characteristics that can be used to

  14. Performance Based Clustering for Benchmarking of Container Ports: an Application of Dea and Cluster Analysis Technique

    Directory of Open Access Journals (Sweden)

    Jie Wu

    2010-12-01

    Full Text Available The operational performance of container ports has received more and more attentions in both academic and practitioner circles, the performance evaluation and process improvement of container ports have also been the focus of several studies. In this paper, Data Envelopment Analysis (DEA, an effective tool for relative efficiency assessment, is utilized for measuring the performances and benchmarking of the 77 world container ports in 2007. The used approaches in the current study consider four inputs (Capacity of Cargo Handling Machines, Number of Berths, Terminal Area and Storage Capacity and a single output (Container Throughput. The results for the efficiency scores are analyzed, and a unique ordering of the ports based on average cross efficiency is provided, also cluster analysis technique is used to select the more appropriate targets for poorly performing ports to use as benchmarks.

  15. Functional Principal Component Analysis and Randomized Sparse Clustering Algorithm for Medical Image Analysis

    Science.gov (United States)

    Lin, Nan; Jiang, Junhai; Guo, Shicheng; Xiong, Momiao

    2015-01-01

    Due to the advancement in sensor technology, the growing large medical image data have the ability to visualize the anatomical changes in biological tissues. As a consequence, the medical images have the potential to enhance the diagnosis of disease, the prediction of clinical outcomes and the characterization of disease progression. But in the meantime, the growing data dimensions pose great methodological and computational challenges for the representation and selection of features in image cluster analysis. To address these challenges, we first extend the functional principal component analysis (FPCA) from one dimension to two dimensions to fully capture the space variation of image the signals. The image signals contain a large number of redundant features which provide no additional information for clustering analysis. The widely used methods for removing the irrelevant features are sparse clustering algorithms using a lasso-type penalty to select the features. However, the accuracy of clustering using a lasso-type penalty depends on the selection of the penalty parameters and the threshold value. In practice, they are difficult to determine. Recently, randomized algorithms have received a great deal of attentions in big data analysis. This paper presents a randomized algorithm for accurate feature selection in image clustering analysis. The proposed method is applied to both the liver and kidney cancer histology image data from the TCGA database. The results demonstrate that the randomized feature selection method coupled with functional principal component analysis substantially outperforms the current sparse clustering algorithms in image cluster analysis. PMID:26196383

  16. Redefining the Breast Cancer Exosome Proteome by Tandem Mass Tag Quantitative Proteomics and Multivariate Cluster Analysis.

    Science.gov (United States)

    Clark, David J; Fondrie, William E; Liao, Zhongping; Hanson, Phyllis I; Fulton, Amy; Mao, Li; Yang, Austin J

    2015-10-20

    Exosomes are microvesicles of endocytic origin constitutively released by multiple cell types into the extracellular environment. With evidence that exosomes can be detected in the blood of patients with various malignancies, the development of a platform that uses exosomes as a diagnostic tool has been proposed. However, it has been difficult to truly define the exosome proteome due to the challenge of discerning contaminant proteins that may be identified via mass spectrometry using various exosome enrichment strategies. To better define the exosome proteome in breast cancer, we incorporated a combination of Tandem-Mass-Tag (TMT) quantitative proteomics approach and Support Vector Machine (SVM) cluster analysis of three conditioned media derived fractions corresponding to a 10 000g cellular debris pellet, a 100 000g crude exosome pellet, and an Optiprep enriched exosome pellet. The quantitative analysis identified 2 179 proteins in all three fractions, with known exosomal cargo proteins displaying at least a 2-fold enrichment in the exosome fraction based on the TMT protein ratios. Employing SVM cluster analysis allowed for the classification 251 proteins as "true" exosomal cargo proteins. This study provides a robust and vigorous framework for the future development of using exosomes as a potential multiprotein marker phenotyping tool that could be useful in breast cancer diagnosis and monitoring disease progression.

  17. A DESIGN OF SOCIAL MEDIA ANALYSIS SYSTEM BASED ON MOBILE PLATFORMS

    OpenAIRE

    Alaybeyoglu, Aysegul; Yavuz, Levent

    2017-01-01

    Alongwith the developing technology, social media technologies have becomewidespread and the number of internet users has increased rapidly. In addition,social media platforms have become very popular and the number of active socialmedia users has increased considerably. As a result of the increased use ofsocial media, there has been a trend towards mobile platforms. In this paper, adesign of a social media analysis system is developed using mobile platformsbased on Android. By this way, impo...

  18. Diagnostics of subtropical plants functional state by cluster analysis

    Directory of Open Access Journals (Sweden)

    Oksana Belous

    2016-05-01

    Full Text Available The article presents an application example of statistical methods for data analysis on diagnosis of the adaptive capacity of subtropical plants varieties. We depicted selection indicators and basic physiological parameters that were defined as diagnostic. We used evaluation on a set of parameters of water regime, there are: determination of water deficit of the leaves, determining the fractional composition of water and detection parameters of the concentration of cell sap (CCS (for tea culture flushes. These settings are characterized by high liability and high responsiveness to the effects of many abiotic factors that determined the particular care in the selection of plant material for analysis and consideration of the impact on sustainability. On the basis of the experimental data calculated the coefficients of pair correlation between climatic factors and used physiological indicators. The result was a selection of physiological and biochemical indicators proposed to assess the adaptability and included in the basis of methodical recommendations on diagnostics of the functional state of the studied cultures. Analysis of complex studies involving a large number of indicators is quite difficult, especially does not allow to quickly identify the similarity of new varieties for their adaptive responses to adverse factors, and, therefore, to set general requirements to conditions of cultivation. Use of cluster analysis suggests that in the analysis of only quantitative data; define a set of variables used to assess varieties (and the more sampling, the more accurate the clustering will happen, be sure to ascertain the measure of similarity (or difference between objects. It is shown that the identification of diagnostic features, which are subjected to statistical processing, impact the accuracy of the varieties classification. Selection in result of the mono-clusters analysis (variety tea Kolhida; hazelnut Lombardsky red; variety kiwi Monty

  19. Cluster analysis for DNA methylation profiles having a detection threshold

    Directory of Open Access Journals (Sweden)

    Siegmund Kimberly D

    2006-07-01

    Full Text Available Abstract Background DNA methylation, a molecular feature used to investigate tumor heterogeneity, can be measured on many genomic regions using the MethyLight technology. Due to the combination of the underlying biology of DNA methylation and the MethyLight technology, the measurements, while being generated on a continuous scale, have a large number of 0 values. This suggests that conventional clustering methodology may not perform well on this data. Results We compare performance of existing methodology (such as k-means with two novel methods that explicitly allow for the preponderance of values at 0. We also consider how the ability to successfully cluster such data depends upon the number of informative genes for which methylation is measured and the correlation structure of the methylation values for those genes. We show that when data is collected for a sufficient number of genes, our models do improve clustering performance compared to methods, such as k-means, that do not explicitly respect the supposed biological realities of the situation. Conclusion The performance of analysis methods depends upon how well the assumptions of those methods reflect the properties of the data being analyzed. Differing technologies will lead to data with differing properties, and should therefore be analyzed differently. Consequently, it is prudent to give thought to what the properties of the data are likely to be, and which analysis method might therefore be likely to best capture those properties.

  20. Cluster Analysis of the International Stellarator Confinement Database

    International Nuclear Information System (INIS)

    Kus, A.; Dinklage, A.; Preuss, R.; Ascasibar, E.; Harris, J. H.; Okamura, S.; Yamada, H.; Sano, F.; Stroth, U.; Talmadge, J.

    2008-01-01

    Heterogeneous structure of collected data is one of the problems that occur during derivation of scalings for energy confinement time, and whose analysis tourns out to be wide and complicated matter. The International Stellarator Confinement Database [1], shortly ISCDB, comprises in its latest version 21 a total of 3647 observations from 8 experimental devices, 2067 therefrom beeing so far completed for upcoming analyses. For confinement scaling studies 1933 observation were chosen as the standard dataset. Here we describe a statistical method of cluster analysis for identification of possible cohesive substructures in ISDCB and present some preliminary results

  1. YersiniaBase: a genomic resource and analysis platform for comparative analysis of Yersinia.

    Science.gov (United States)

    Tan, Shi Yang; Dutta, Avirup; Jakubovics, Nicholas S; Ang, Mia Yang; Siow, Cheuk Chuen; Mutha, Naresh Vr; Heydari, Hamed; Wee, Wei Yee; Wong, Guat Jah; Choo, Siew Woh

    2015-01-16

    Yersinia is a Gram-negative bacteria that includes serious pathogens such as the Yersinia pestis, which causes plague, Yersinia pseudotuberculosis, Yersinia enterocolitica. The remaining species are generally considered non-pathogenic to humans, although there is evidence that at least some of these species can cause occasional infections using distinct mechanisms from the more pathogenic species. With the advances in sequencing technologies, many genomes of Yersinia have been sequenced. However, there is currently no specialized platform to hold the rapidly-growing Yersinia genomic data and to provide analysis tools particularly for comparative analyses, which are required to provide improved insights into their biology, evolution and pathogenicity. To facilitate the ongoing and future research of Yersinia, especially those generally considered non-pathogenic species, a well-defined repository and analysis platform is needed to hold the Yersinia genomic data and analysis tools for the Yersinia research community. Hence, we have developed the YersiniaBase, a robust and user-friendly Yersinia resource and analysis platform for the analysis of Yersinia genomic data. YersiniaBase has a total of twelve species and 232 genome sequences, of which the majority are Yersinia pestis. In order to smooth the process of searching genomic data in a large database, we implemented an Asynchronous JavaScript and XML (AJAX)-based real-time searching system in YersiniaBase. Besides incorporating existing tools, which include JavaScript-based genome browser (JBrowse) and Basic Local Alignment Search Tool (BLAST), YersiniaBase also has in-house developed tools: (1) Pairwise Genome Comparison tool (PGC) for comparing two user-selected genomes; (2) Pathogenomics Profiling Tool (PathoProT) for comparative pathogenomics analysis of Yersinia genomes; (3) YersiniaTree for constructing phylogenetic tree of Yersinia. We ran analyses based on the tools and genomic data in YersiniaBase and the

  2. Accommodating error analysis in comparison and clustering of molecular fingerprints.

    Science.gov (United States)

    Salamon, H; Segal, M R; Ponce de Leon, A; Small, P M

    1998-01-01

    Molecular epidemiologic studies of infectious diseases rely on pathogen genotype comparisons, which usually yield patterns comprising sets of DNA fragments (DNA fingerprints). We use a highly developed genotyping system, IS6110-based restriction fragment length polymorphism analysis of Mycobacterium tuberculosis, to develop a computational method that automates comparison of large numbers of fingerprints. Because error in fragment length measurements is proportional to fragment length and is positively correlated for fragments within a lane, an align-and-count method that compensates for relative scaling of lanes reliably counts matching fragments between lanes. Results of a two-step method we developed to cluster identical fingerprints agree closely with 5 years of computer-assisted visual matching among 1,335 M. tuberculosis fingerprints. Fully documented and validated methods of automated comparison and clustering will greatly expand the scope of molecular epidemiology.

  3. Accident patterns for construction-related workers: a cluster analysis

    Science.gov (United States)

    Liao, Chia-Wen; Tyan, Yaw-Yauan

    2012-01-01

    The construction industry has been identified as one of the most hazardous industries. The risk of constructionrelated workers is far greater than that in a manufacturing based industry. However, some steps can be taken to reduce worker risk through effective injury prevention strategies. In this article, k-means clustering methodology is employed in specifying the factors related to different worker types and in identifying the patterns of industrial occupational accidents. Accident reports during the period 1998 to 2008 are extracted from case reports of the Northern Region Inspection Office of the Council of Labor Affairs of Taiwan. The results show that the cluster analysis can indicate some patterns of occupational injuries in the construction industry. Inspection plans should be proposed according to the type of construction-related workers. The findings provide a direction for more effective inspection strategies and injury prevention programs.

  4. Cluster analysis in systems of magnetic spheres and cubes

    Energy Technology Data Exchange (ETDEWEB)

    Pyanzina, E.S., E-mail: elena.pyanzina@urfu.ru [Ural Federal University, Lenin Av. 51, Ekaterinburg (Russian Federation); Gudkova, A.V. [Ural Federal University, Lenin Av. 51, Ekaterinburg (Russian Federation); Donaldson, J.G. [University of Vienna, Sensengasse 8, Vienna (Austria); Kantorovich, S.S. [Ural Federal University, Lenin Av. 51, Ekaterinburg (Russian Federation); University of Vienna, Sensengasse 8, Vienna (Austria)

    2017-06-01

    In the present work we use molecular dynamics simulations and graph-theory based cluster analysis to compare self-assembly in systems of magnetic spheres, and cubes where the dipole moment is oriented along the side of the cube in the [001] crystallographic direction. We show that under the same conditions cubes aggregate far less than their spherical counterparts. This difference can be explained in terms of the volume of phase space in which the formation of the bond is thermodynamically advantageous. It follows that this volume is much larger for a dipolar sphere than for a dipolar cube. - Highlights: • A comparison of the degree of self-assembly in systems of magnetic spheres and cubes. • Spheres are more likely to form larger clusters than cubes. • Differences in microstructure will manifest in the magnetic response of each system.

  5. Image Registration Algorithm Based on Parallax Constraint and Clustering Analysis

    Science.gov (United States)

    Wang, Zhe; Dong, Min; Mu, Xiaomin; Wang, Song

    2018-01-01

    To resolve the problem of slow computation speed and low matching accuracy in image registration, a new image registration algorithm based on parallax constraint and clustering analysis is proposed. Firstly, Harris corner detection algorithm is used to extract the feature points of two images. Secondly, use Normalized Cross Correlation (NCC) function to perform the approximate matching of feature points, and the initial feature pair is obtained. Then, according to the parallax constraint condition, the initial feature pair is preprocessed by K-means clustering algorithm, which is used to remove the feature point pairs with obvious errors in the approximate matching process. Finally, adopt Random Sample Consensus (RANSAC) algorithm to optimize the feature points to obtain the final feature point matching result, and the fast and accurate image registration is realized. The experimental results show that the image registration algorithm proposed in this paper can improve the accuracy of the image matching while ensuring the real-time performance of the algorithm.

  6. Network clustering coefficient approach to DNA sequence analysis

    Energy Technology Data Exchange (ETDEWEB)

    Gerhardt, Guenther J.L. [Universidade Federal do Rio Grande do Sul-Hospital de Clinicas de Porto Alegre, Rua Ramiro Barcelos 2350/sala 2040/90035-003 Porto Alegre (Brazil); Departamento de Fisica e Quimica da Universidade de Caxias do Sul, Rua Francisco Getulio Vargas 1130, 95001-970 Caxias do Sul (Brazil); Lemke, Ney [Programa Interdisciplinar em Computacao Aplicada, Unisinos, Av. Unisinos, 950, 93022-000 Sao Leopoldo, RS (Brazil); Corso, Gilberto [Departamento de Biofisica e Farmacologia, Centro de Biociencias, Universidade Federal do Rio Grande do Norte, Campus Universitario, 59072 970 Natal, RN (Brazil)]. E-mail: corso@dfte.ufrn.br

    2006-05-15

    In this work we propose an alternative DNA sequence analysis tool based on graph theoretical concepts. The methodology investigates the path topology of an organism genome through a triplet network. In this network, triplets in DNA sequence are vertices and two vertices are connected if they occur juxtaposed on the genome. We characterize this network topology by measuring the clustering coefficient. We test our methodology against two main bias: the guanine-cytosine (GC) content and 3-bp (base pairs) periodicity of DNA sequence. We perform the test constructing random networks with variable GC content and imposed 3-bp periodicity. A test group of some organisms is constructed and we investigate the methodology in the light of the constructed random networks. We conclude that the clustering coefficient is a valuable tool since it gives information that is not trivially contained in 3-bp periodicity neither in the variable GC content.

  7. Staghorn: An Automated Large-Scale Distributed System Analysis Platform

    Energy Technology Data Exchange (ETDEWEB)

    Gabert, Kasimir [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Burns, Ian [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Elliott, Steven [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Kallaher, Jenna [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Vail, Adam [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

    2016-09-01

    Conducting experiments on large-scale distributed computing systems is becoming significantly easier with the assistance of emulation. Researchers can now create a model of a distributed computing environment and then generate a virtual, laboratory copy of the entire system composed of potentially thousands of virtual machines, switches, and software. The use of real software, running at clock rate in full virtual machines, allows experiments to produce meaningful results without necessitating a full understanding of all model components. However, the ability to inspect and modify elements within these models is bound by the limitation that such modifications must compete with the model, either running in or alongside it. This inhibits entire classes of analyses from being conducted upon these models. We developed a mechanism to snapshot an entire emulation-based model as it is running. This allows us to \\freeze time" and subsequently fork execution, replay execution, modify arbitrary parts of the model, or deeply explore the model. This snapshot includes capturing packets in transit and other input/output state along with the running virtual machines. We were able to build this system in Linux using Open vSwitch and Kernel Virtual Machines on top of Sandia's emulation platform Firewheel. This primitive opens the door to numerous subsequent analyses on models, including state space exploration, debugging distributed systems, performance optimizations, improved training environments, and improved experiment repeatability.

  8. Analysis of Surface Heterogeneity Effects with Mesoscale Terrestrial Modeling Platforms

    Science.gov (United States)

    Simmer, C.

    2015-12-01

    An improved understanding of the full variability in the weather and climate system is crucial for reducing the uncertainty in weather forecasting and climate prediction, and to aid policy makers to develop adaptation and mitigation strategies. A yet unknown part of uncertainty in the predictions from the numerical models is caused by the negligence of non-resolved land surface heterogeneity and the sub-surface dynamics and their potential impact on the state of the atmosphere. At the same time, mesoscale numerical models using finer horizontal grid resolution [O(1)km] can suffer from inconsistencies and neglected scale-dependencies in ABL parameterizations and non-resolved effects of integrated surface-subsurface lateral flow at this scale. Our present knowledge suggests large-eddy-simulation (LES) as an eventual solution to overcome the inadequacy of the physical parameterizations in the atmosphere in this transition scale, yet we are constrained by the computational resources, memory management, big-data, when using LES for regional domains. For the present, there is a need for scale-aware parameterizations not only in the atmosphere but also in the land surface and subsurface model components. In this study, we use the recently developed Terrestrial Systems Modeling Platform (TerrSysMP) as a numerical tool to analyze the uncertainty in the simulation of surface exchange fluxes and boundary layer circulations at grid resolutions of the order of 1km, and explore the sensitivity of the atmospheric boundary layer evolution and convective rainfall processes on land surface heterogeneity.

  9. Multiscale visual quality assessment for cluster analysis with self-organizing maps

    Science.gov (United States)

    Bernard, Jürgen; von Landesberger, Tatiana; Bremm, Sebastian; Schreck, Tobias

    2011-01-01

    Cluster analysis is an important data mining technique for analyzing large amounts of data, reducing many objects to a limited number of clusters. Cluster visualization techniques aim at supporting the user in better understanding the characteristics and relationships among the found clusters. While promising approaches to visual cluster analysis already exist, these usually fall short of incorporating the quality of the obtained clustering results. However, due to the nature of the clustering process, quality plays an important aspect, as for most practical data sets, typically many different clusterings are possible. Being aware of clustering quality is important to judge the expressiveness of a given cluster visualization, or to adjust the clustering process with refined parameters, among others. In this work, we present an encompassing suite of visual tools for quality assessment of an important visual cluster algorithm, namely, the Self-Organizing Map (SOM) technique. We define, measure, and visualize the notion of SOM cluster quality along a hierarchy of cluster abstractions. The quality abstractions range from simple scalar-valued quality scores up to the structural comparison of a given SOM clustering with output of additional supportive clustering methods. The suite of methods allows the user to assess the SOM quality on the appropriate abstraction level, and arrive at improved clustering results. We implement our tools in an integrated system, apply it on experimental data sets, and show its applicability.

  10. Internet of Things-Based Arduino Intelligent Monitoring and Cluster Analysis of Seasonal Variation in Physicochemical Parameters of Jungnangcheon, an Urban Stream

    Directory of Open Access Journals (Sweden)

    Byungwan Jo

    2017-03-01

    Full Text Available In the present case study, the use of an advanced, efficient and low-cost technique for monitoring an urban stream was reported. Physicochemical parameters (PcPs of Jungnangcheon stream (Seoul, South Korea were assessed using an Internet of Things (IoT platform. Temperature, dissolved oxygen (DO, and pH parameters were monitored for the three summer months and the first fall month at a fixed location. Analysis was performed using clustering techniques (CTs, such as K-means clustering, agglomerative hierarchical clustering (AHC, and density-based spatial clustering of applications with noise (DBSCAN. An IoT-based Arduino sensor module (ASM network with a 99.99% efficient communication platform was developed to allow collection of stream data with user-friendly software and hardware and facilitated data analysis by interested individuals using their smartphones. Clustering was used to formulate relationships among physicochemical parameters. K-means clustering was used to identify natural clusters using the silhouette coefficient based on cluster compactness and looseness. AHC grouped all data into two clusters as well as temperature, DO and pH into four, eight, and four clusters, respectively. DBSCAN analysis was also performed to evaluate yearly variations in physicochemical parameters. Noise points (NOISE of temperature in 2016 were border points (ƥ, whereas in 2014 and 2015 they remained core points (ɋ, indicating a trend toward increasing stream temperature. We found the stream parameters were within the permissible limits set by the Water Quality Standards for River Water, South Korea.

  11. Steady state subchannel analysis of AHWR fuel cluster

    International Nuclear Information System (INIS)

    Dasgupta, A.; Chandraker, D.K.; Vijayan, P.K.; Saha, D.

    2006-09-01

    Subchannel analysis is a technique used to predict the thermal hydraulic behavior of reactor fuel assemblies. The rod cluster is subdivided into a number of parallel interacting flow subchannels. The conservation equations are solved for each of these subchannels, taking into account subchannel interactions. Subchannel analysis of AHWR D-5 fuel cluster has been carried out to determine the variations in thermal hydraulic conditions of coolant and fuel temperatures along the length of the fuel bundle. The hottest regions within the AHWR fuel bundle have been identified. The effect of creep on the fuel performance has also been studied. MCHFR has been calculated using Jansen-Levy correlation. The calculations have been backed by sensitivity analysis for parameters whose values are not known accurately. The sensitivity analysis showed the calculations to have a very low sensitivity to these parameters. Apart from the analysis, the report also includes a brief introduction of a few subchannel codes. A brief description of the equations and solution methodology used in COBRA-IIIC and COBRA-IV-I is also given. (author)

  12. The JASMIN Analysis Platform - bridging the gap between traditional climate data practicies and data-centric analysis paradigms

    Science.gov (United States)

    Pascoe, Stephen; Iwi, Alan; kershaw, philip; Stephens, Ag; Lawrence, Bryan

    2014-05-01

    The advent of large-scale data and the consequential analysis problems have led to two new challenges for the research community: how to share such data to get the maximum value and how to carry out efficient analysis. Solving both challenges require a form of parallelisation: the first is social parallelisation (involving trust and information sharing), the second data parallelisation (involving new algorithms and tools). The JASMIN infrastructure supports both kinds of parallelism by providing a multi-tennent environment with petabyte-scale storage, VM provisioning and batch cluster facilities. The JASMIN Analysis Platform (JAP) is an analysis software layer for JASMIN which emphasises ease of transition from a researcher's local environment to JASMIN. JAP brings together tools traditionally used by multiple communities and configures them to work together, enabling users to move analysis from their local environment to JASMIN without rewriting code. JAP also provides facilities to exploit JASMIN's parallel capabilities whilst maintaining their familiar analysis environment where ever possible. Modern opensource analysis tools typically have multiple dependent packages, increasing the installation burden on system administrators. When you consider a suite of tools, often with both common and conflicting dependencies, analysis pipelines can become locked to a particular installation simply because of the effort required to reconstruct the dependency tree. JAP addresses this problem by providing a consistent suite of RPMs compatible with RedHat Enterprise Linux and CentOS 6.4. Researchers can install JAP locally, either as RPMs or through a pre-built VM image, giving them the confidence to know moving analysis to JASMIN will not disrupt their environment. Analysis parallelisation is in it's infancy in climate sciences, with few tools capable of exploiting any parallel environment beyond manual scripting of the use of multiple processors. JAP begins to bridge this

  13. Analysis of Learning Development With Sugeno Fuzzy Logic And Clustering

    Directory of Open Access Journals (Sweden)

    Maulana Erwin Saputra

    2017-06-01

    Full Text Available In the first journal, I made this attempt to analyze things that affect the achievement of students in each school of course vary. Because students are one of the goals of achieving the goals of successful educational organizations. The mental influence of students’ emotions and behaviors themselves in relation to learning performance. Fuzzy logic can be used in various fields as well as Clustering for grouping, as in Learning Development analyzes. The process will be performed on students based on the symptoms that exist. In this research will use fuzzy logic and clustering. Fuzzy is an uncertain logic but its excess is capable in the process of language reasoning so that in its design is not required complicated mathematical equations. However Clustering method is K-Means method is method where data analysis is broken down by group k (k = 1,2,3, .. k. To know the optimal number of Performance group. The results of the research is with a questionnaire entered into matlab will produce a value that means in generating the graph. And simplify the school in seeing Student performance in the learning process by using certain criteria. So from the system that obtained the results for a decision-making required by the school.

  14. IGSA: Individual Gene Sets Analysis, including Enrichment and Clustering.

    Science.gov (United States)

    Wu, Lingxiang; Chen, Xiujie; Zhang, Denan; Zhang, Wubing; Liu, Lei; Ma, Hongzhe; Yang, Jingbo; Xie, Hongbo; Liu, Bo; Jin, Qing

    2016-01-01

    Analysis of gene sets has been widely applied in various high-throughput biological studies. One weakness in the traditional methods is that they neglect the heterogeneity of genes expressions in samples which may lead to the omission of some specific and important gene sets. It is also difficult for them to reflect the severities of disease and provide expression profiles of gene sets for individuals. We developed an application software called IGSA that leverages a powerful analytical capacity in gene sets enrichment and samples clustering. IGSA calculates gene sets expression scores for each sample and takes an accumulating clustering strategy to let the samples gather into the set according to the progress of disease from mild to severe. We focus on gastric, pancreatic and ovarian cancer data sets for the performance of IGSA. We also compared the results of IGSA in KEGG pathways enrichment with David, GSEA, SPIA, ssGSEA and analyzed the results of IGSA clustering and different similarity measurement methods. Notably, IGSA is proved to be more sensitive and specific in finding significant pathways, and can indicate related changes in pathways with the severity of disease. In addition, IGSA provides with significant gene sets profile for each sample.

  15. Segmentation of Residential Gas Consumers Using Clustering Analysis

    Directory of Open Access Journals (Sweden)

    Marta P. Fernandes

    2017-12-01

    Full Text Available The growing environmental concerns and liberalization of energy markets have resulted in an increased competition between utilities and a strong focus on efficiency. To develop new energy efficiency measures and optimize operations, utilities seek new market-related insights and customer engagement strategies. This paper proposes a clustering-based methodology to define the segmentation of residential gas consumers. The segments of gas consumers are obtained through a detailed clustering analysis using smart metering data. Insights are derived from the segmentation, where the segments result from the clustering process and are characterized based on the consumption profiles, as well as according to information regarding consumers’ socio-economic and household key features. The study is based on a sample of approximately one thousand households over one year. The representative load profiles of consumers are essentially characterized by two evident consumption peaks, one in the morning and the other in the evening, and an off-peak consumption. Significant insights can be derived from this methodology regarding typical consumption curves of the different segments of consumers in the population. This knowledge can assist energy utilities and policy makers in the development of consumer engagement strategies, demand forecasting tools and in the design of more sophisticated tariff systems.

  16. Photoproduced fluorescent Au(I)@(Ag2/Ag3)-thiolate giant cluster: an intriguing sensing platform for DMSO and Pb(II).

    Science.gov (United States)

    Ganguly, Mainak; Mondal, Chanchal; Jana, Jayasmita; Pal, Anjali; Pal, Tarasankar

    2014-01-14

    Synergistic evolution of fluorescent Au(I)@(Ag2/Ag3)-thiolate core-shell particles has been made possible under the Sun in presence of the respective precursor coinage metal compounds and glutathione (GSH). The green chemically synthesized fluorescent clusters are giant (∼600 nm) in size and robust. Among all the common water miscible solvents, exclusively DMSO exhibits selective fluorescence quenching (Turn Off) because of the removal of GSH from the giant cluster. Again, only Pb(II) ion brings back the lost fluorescence (Turn On) leaving aside all other metal ions. This happens owing to the strong affinity of the sulfur donor of DMSO for Pb(II). Thus, employing the aqueous solution containing the giant cluster, we can detect DMSO contamination in water bodies at trace level. Besides, a selective sensing platform has emerged out for Pb(II) ion with a detection limit of 14 × 10(-8) M. Pb(II) induced fluorescence recovery is again vanished by I(-) implying a promising route to sense I(-) ion.

  17. GProX, a user-friendly platform for bioinformatics analysis and visualization of quantitative proteomics data.

    Science.gov (United States)

    Rigbolt, Kristoffer T G; Vanselow, Jens T; Blagoev, Blagoy

    2011-08-01

    Recent technological advances have made it possible to identify and quantify thousands of proteins in a single proteomics experiment. As a result of these developments, the analysis of data has become the bottleneck of proteomics experiment. To provide the proteomics community with a user-friendly platform for comprehensive analysis, inspection and visualization of quantitative proteomics data we developed the Graphical Proteomics Data Explorer (GProX)(1). The program requires no special bioinformatics training, as all functions of GProX are accessible within its graphical user-friendly interface which will be intuitive to most users. Basic features facilitate the uncomplicated management and organization of large data sets and complex experimental setups as well as the inspection and graphical plotting of quantitative data. These are complemented by readily available high-level analysis options such as database querying, clustering based on abundance ratios, feature enrichment tests for e.g. GO terms and pathway analysis tools. A number of plotting options for visualization of quantitative proteomics data is available and most analysis functions in GProX create customizable high quality graphical displays in both vector and bitmap formats. The generic import requirements allow data originating from essentially all mass spectrometry platforms, quantitation strategies and software to be analyzed in the program. GProX represents a powerful approach to proteomics data analysis providing proteomics experimenters with a toolbox for bioinformatics analysis of quantitative proteomics data. The program is released as open-source and can be freely downloaded from the project webpage at http://gprox.sourceforge.net.

  18. Feasibility Study of Parallel Finite Element Analysis on Cluster-of-Clusters

    Science.gov (United States)

    Muraoka, Masae; Okuda, Hiroshi

    With the rapid growth of WAN infrastructure and development of Grid middleware, it's become a realistic and attractive methodology to connect cluster machines on wide-area network for the execution of computation-demanding applications. Many existing parallel finite element (FE) applications have been, however, designed and developed with a single computing resource in mind, since such applications require frequent synchronization and communication among processes. There have been few FE applications that can exploit the distributed environment so far. In this study, we explore the feasibility of FE applications on the cluster-of-clusters. First, we classify FE applications into two types, tightly coupled applications (TCA) and loosely coupled applications (LCA) based on their communication pattern. A prototype of each application is implemented on the cluster-of-clusters. We perform numerical experiments executing TCA and LCA on both the cluster-of-clusters and a single cluster. Thorough these experiments, by comparing the performances and communication cost in each case, we evaluate the feasibility of FEA on the cluster-of-clusters.

  19. Cluster analysis in systems of magnetic spheres and cubes

    Science.gov (United States)

    Pyanzina, E. S.; Gudkova, A. V.; Donaldson, J. G.; Kantorovich, S. S.

    2017-06-01

    In the present work we use molecular dynamics simulations and graph-theory based cluster analysis to compare self-assembly in systems of magnetic spheres, and cubes where the dipole moment is oriented along the side of the cube in the [001] crystallographic direction. We show that under the same conditions cubes aggregate far less than their spherical counterparts. This difference can be explained in terms of the volume of phase space in which the formation of the bond is thermodynamically advantageous. It follows that this volume is much larger for a dipolar sphere than for a dipolar cube.

  20. Cluster analysis of activity-time series in motor learning

    DEFF Research Database (Denmark)

    Balslev, Daniela; Nielsen, Finn Årup; Frutiger, Sally A.

    2002-01-01

    Neuroimaging studies of learning focus on brain areas where the activity changes as a function of time. To circumvent the difficult problem of model selection, we used a data-driven analytic tool, cluster analysis, which extracts representative temporal and spatial patterns from the voxel...... practice-related activity in a fronto-parieto-cerebellar network, in agreement with previous studies of motor learning. These voxels were separated from a group of voxels showing an unspecific time-effect and another group of voxels, whose activation was an artifact from smoothing. Hum. Brain Mapping 15...

  1. RETINOBASE: a web database, data mining and analysis platform for gene expression data on retina

    Directory of Open Access Journals (Sweden)

    Léveillard Thierry

    2008-05-01

    Full Text Available Abstract Background The retina is a multi-layered sensory tissue that lines the back of the eye and acts at the interface of input light and visual perception. Its main function is to capture photons and convert them into electrical impulses that travel along the optic nerve to the brain where they are turned into images. It consists of neurons, nourishing blood vessels and different cell types, of which neural cells predominate. Defects in any of these cells can lead to a variety of retinal diseases, including age-related macular degeneration, retinitis pigmentosa, Leber congenital amaurosis and glaucoma. Recent progress in genomics and microarray technology provides extensive opportunities to examine alterations in retinal gene expression profiles during development and diseases. However, there is no specific database that deals with retinal gene expression profiling. In this context we have built RETINOBASE, a dedicated microarray database for retina. Description RETINOBASE is a microarray relational database, analysis and visualization system that allows simple yet powerful queries to retrieve information about gene expression in retina. It provides access to gene expression meta-data and offers significant insights into gene networks in retina, resulting in better hypothesis framing for biological problems that can subsequently be tested in the laboratory. Public and proprietary data are automatically analyzed with 3 distinct methods, RMA, dChip and MAS5, then clustered using 2 different K-means and 1 mixture models method. Thus, RETINOBASE provides a framework to compare these methods and to optimize the retinal data analysis. RETINOBASE has three different modules, "Gene Information", "Raw Data System Analysis" and "Fold change system Analysis" that are interconnected in a relational schema, allowing efficient retrieval and cross comparison of data. Currently, RETINOBASE contains datasets from 28 different microarray experiments performed

  2. A cluster analysis on road traffic accidents using genetic algorithms

    Science.gov (United States)

    Saharan, Sabariah; Baragona, Roberto

    2017-04-01

    The analysis of traffic road accidents is increasingly important because of the accidents cost and public road safety. The availability or large data sets makes the study of factors that affect the frequency and severity accidents are viable. However, the data are often highly unbalanced and overlapped. We deal with the data set of the road traffic accidents recorded in Christchurch, New Zealand, from 2000-2009 with a total of 26440 accidents. The data is in a binary set and there are 50 factors road traffic accidents with four level of severity. We used genetic algorithm for the analysis because we are in the presence of a large unbalanced data set and standard clustering like k-means algorithm may not be suitable for the task. The genetic algorithm based on clustering for unknown K, (GCUK) has been used to identify the factors associated with accidents of different levels of severity. The results provided us with an interesting insight into the relationship between factors and accidents severity level and suggest that the two main factors that contributes to fatal accidents are "Speed greater than 60 km h" and "Did not see other people until it was too late". A comparison with the k-means algorithm and the independent component analysis is performed to validate the results.

  3. Parallel application of plasma equilibrium fitting based on inhomogeneous platforms

    International Nuclear Information System (INIS)

    Liao Min; Zhang Jinhua; Chen Liaoyuan; Li Yongge; Pan Wei; Pan Li

    2008-01-01

    An online analysis and online display platform EFIT, which is based on the equilibrium-fitting mode, is inducted in this paper. This application can realize large data transportation between inhomogeneous platforms by designing a communication mechanism using sockets. It spends approximately one minute to complete the equilibrium fitting reconstruction by using a finite state machine to describe the management node and several node computers of cluster system to fulfill the parallel computation, this satisfies the online display during the discharge interval. An effective communication model between inhomogeneous platforms is provided, which could transport the computing results from Linux platform to Windows platform for online analysis and display. (authors)

  4. New Structural Representation and Digital-Analysis Platform for Symmetrical Parallel Mechanisms

    Directory of Open Access Journals (Sweden)

    Wenao Cao

    2013-05-01

    Full Text Available Abstract An automatic design platform capable of automatic structural analysis, structural synthesis and the application of parallel mechanisms will be a great aid in the conceptual design of mechanisms, though up to now such a platform has only existed as an idea. The work in this paper constitutes part of such a platform. Based on the screw theory and a new structural representation method proposed here which builds a one-to-one correspondence between the strings of representative characters and the kinematic structures of symmetrical parallel mechanisms (SPMs, this paper develops a fully-automatic approach for mobility (degree-of-freedom analysis, and further establishes an automatic digital-analysis platform for SPMs. With this platform, users simply have to enter the strings of representative characters, and the kinematic structures of the SPMs will be generated and displayed automatically, and the mobility and its properties will also be analysed and displayed automatically. Typical examples are provided to show the effectiveness of the approach.

  5. GeNNet: an integrated platform for unifying scientific workflows and graph databases for transcriptome data analysis

    Directory of Open Access Journals (Sweden)

    Raquel L. Costa

    2017-07-01

    Full Text Available There are many steps in analyzing transcriptome data, from the acquisition of raw data to the selection of a subset of representative genes that explain a scientific hypothesis. The data produced can be represented as networks of interactions among genes and these may additionally be integrated with other biological databases, such as Protein-Protein Interactions, transcription factors and gene annotation. However, the results of these analyses remain fragmented, imposing difficulties, either for posterior inspection of results, or for meta-analysis by the incorporation of new related data. Integrating databases and tools into scientific workflows, orchestrating their execution, and managing the resulting data and its respective metadata are challenging tasks. Additionally, a great amount of effort is equally required to run in-silico experiments to structure and compose the information as needed for analysis. Different programs may need to be applied and different files are produced during the experiment cycle. In this context, the availability of a platform supporting experiment execution is paramount. We present GeNNet, an integrated transcriptome analysis platform that unifies scientific workflows with graph databases for selecting relevant genes according to the evaluated biological systems. It includes GeNNet-Wf, a scientific workflow that pre-loads biological data, pre-processes raw microarray data and conducts a series of analyses including normalization, differential expression inference, clusterization and gene set enrichment analysis. A user-friendly web interface, GeNNet-Web, allows for setting parameters, executing, and visualizing the results of GeNNet-Wf executions. To demonstrate the features of GeNNet, we performed case studies with data retrieved from GEO, particularly using a single-factor experiment in different analysis scenarios. As a result, we obtained differentially expressed genes for which biological functions were

  6. An electromagnetic signals monitoring and analysis wireless platform employing personal digital assistants and pattern analysis techniques

    Science.gov (United States)

    Ninos, K.; Georgiadis, P.; Cavouras, D.; Nomicos, C.

    2010-05-01

    This study presents the design and development of a mobile wireless platform to be used for monitoring and analysis of seismic events and related electromagnetic (EM) signals, employing Personal Digital Assistants (PDAs). A prototype custom-developed application was deployed on a 3G enabled PDA that could connect to the FTP server of the Institute of Geodynamics of the National Observatory of Athens and receive and display EM signals at 4 receiver frequencies (3 KHz (E-W, N-S), 10 KHz (E-W, N-S), 41 MHz and 46 MHz). Signals may originate from any one of the 16 field-stations located around the Greek territory. Employing continuous recordings of EM signals gathered from January 2003 till December 2007, a Support Vector Machines (SVM)-based classification system was designed to distinguish EM precursor signals within noisy background. EM-signals corresponding to recordings preceding major seismic events (Ms≥5R) were segmented, by an experienced scientist, and five features (mean, variance, skewness, kurtosis, and a wavelet based feature), derived from the EM-signals were calculated. These features were used to train the SVM-based classification scheme. The performance of the system was evaluated by the exhaustive search and leave-one-out methods giving 87.2% overall classification accuracy, in correctly identifying EM precursor signals within noisy background employing all calculated features. Due to the insufficient processing power of the PDAs, this task was performed on a typical desktop computer. This optimal trained context of the SVM classifier was then integrated in the PDA based application rendering the platform capable to discriminate between EM precursor signals and noise. System's efficiency was evaluated by an expert who reviewed 1/ multiple EM-signals, up to 18 days prior to corresponding past seismic events, and 2/ the possible EM-activity of a specific region employing the trained SVM classifier. Additionally, the proposed architecture can form a

  7. Physicochemical properties of different corn varieties by principal components analysis and cluster analysis

    International Nuclear Information System (INIS)

    Zeng, J.; Li, G.; Sun, J.

    2013-01-01

    Principal components analysis and cluster analysis were used to investigate the properties of different corn varieties. The chemical compositions and some properties of corn flour which processed by drying milling were determined. The results showed that the chemical compositions and physicochemical properties were significantly different among twenty six corn varieties. The quality of corn flour was concerned with five principal components from principal component analysis and the contribution rate of starch pasting properties was important, which could account for 48.90%. Twenty six corn varieties could be classified into four groups by cluster analysis. The consistency between principal components analysis and cluster analysis indicated that multivariate analyses were feasible in the study of corn variety properties. (author)

  8. Cluster analysis of autoantibodies in 852 patients with systemic lupus erythematosus from a single center.

    Science.gov (United States)

    Artim-Esen, Bahar; Çene, Erhan; Şahinkaya, Yasemin; Ertan, Semra; Pehlivan, Özlem; Kamali, Sevil; Gül, Ahmet; Öcal, Lale; Aral, Orhan; Inanç, Murat

    2014-07-01

    Associations between autoantibodies and clinical features have been described in systemic lupus erythematosus (SLE). Herein, we aimed to define autoantibody clusters and their clinical correlations in a large cohort of patients with SLE. We analyzed 852 patients with SLE who attended our clinic. Seven autoantibodies were selected for cluster analysis: anti-DNA, anti-Sm, anti-RNP, anticardiolipin (aCL) immunoglobulin (Ig)G or IgM, lupus anticoagulant (LAC), anti-Ro, and anti-La. Two-step clustering and Kaplan-Meier survival analyses were used. Five clusters were identified. A cluster consisted of patients with only anti-dsDNA antibodies, a cluster of anti-Sm and anti-RNP, a cluster of aCL IgG/M and LAC, and a cluster of anti-Ro and anti-La antibodies. Analysis revealed 1 more cluster that consisted of patients who did not belong to any of the clusters formed by antibodies chosen for cluster analysis. Sm/RNP cluster had significantly higher incidence of pulmonary hypertension and Raynaud phenomenon. DsDNA cluster had the highest incidence of renal involvement. In the aCL/LAC cluster, there were significantly more patients with neuropsychiatric involvement, antiphospholipid syndrome, autoimmune hemolytic anemia, and thrombocytopenia. According to the Systemic Lupus International Collaborating Clinics damage index, the highest frequency of damage was in the aCL/LAC cluster. Comparison of 10 and 20 years survival showed reduced survival in the aCL/LAC cluster. This study supports the existence of autoantibody clusters with distinct clinical features in SLE and shows that forming clinical subsets according to autoantibody clusters may be useful in predicting the outcome of the disease. Autoantibody clusters in SLE may exhibit differences according to the clinical setting or population.

  9. [Typologies of Madrid's citizens (Spain) at the end-of-life: cluster analysis].

    Science.gov (United States)

    Ortiz-Gonçalves, Belén; Perea-Pérez, Bernardo; Labajo González, Elena; Albarrán Juan, Elena; Santiago-Sáez, Andrés

    2018-03-06

    To establish typologies within Madrid's citizens (Spain) with regard to end-of-life by cluster analysis. The SPAD 8 programme was implemented in a sample from a health care centre in the autonomous region of Madrid (Spain). A multiple correspondence analysis technique was used, followed by a cluster analysis to create a dendrogram. A cross-sectional study was made beforehand with the results of the questionnaire. Five clusters stand out. Cluster 1: a group who preferred not to answer numerous questions (5%). Cluster 2: in favour of receiving palliative care and euthanasia (40%). Cluster 3: would oppose assisted suicide and would not ask for spiritual assistance (15%). Cluster 4: would like to receive palliative care and assisted suicide (16%). Cluster 5: would oppose assisted suicide and would ask for spiritual assistance (24%). The following four clusters stood out. Clusters 2 and 4 would like to receive palliative care, euthanasia (2) and assisted suicide (4). Clusters 4 and 5 regularly practiced their faith and their family members did not receive palliative care. Clusters 3 and 5 would be opposed to euthanasia and assisted suicide in particular. Clusters 2, 4 and 5 had not completed an advance directive document (2, 4 and 5). Clusters 2 and 3 seldom practiced their faith. This study could be taken into consideration to improve the quality of end-of-life care choices. Copyright © 2017 SESPAS. Publicado por Elsevier España, S.L.U. All rights reserved.

  10. Reliability analysis of cluster-based ad-hoc networks

    International Nuclear Information System (INIS)

    Cook, Jason L.; Ramirez-Marquez, Jose Emmanuel

    2008-01-01

    The mobile ad-hoc wireless network (MAWN) is a new and emerging network scheme that is being employed in a variety of applications. The MAWN varies from traditional networks because it is a self-forming and dynamic network. The MAWN is free of infrastructure and, as such, only the mobile nodes comprise the network. Pairs of nodes communicate either directly or through other nodes. To do so, each node acts, in turn, as a source, destination, and relay of messages. The virtue of a MAWN is the flexibility this provides; however, the challenge for reliability analyses is also brought about by this unique feature. The variability and volatility of the MAWN configuration makes typical reliability methods (e.g. reliability block diagram) inappropriate because no single structure or configuration represents all manifestations of a MAWN. For this reason, new methods are being developed to analyze the reliability of this new networking technology. New published methods adapt to this feature by treating the configuration probabilistically or by inclusion of embedded mobility models. This paper joins both methods together and expands upon these works by modifying the problem formulation to address the reliability analysis of a cluster-based MAWN. The cluster-based MAWN is deployed in applications with constraints on networking resources such as bandwidth and energy. This paper presents the problem's formulation, a discussion of applicable reliability metrics for the MAWN, and illustration of a Monte Carlo simulation method through the analysis of several example networks

  11. Shape Analysis of HII Regions - I. Statistical Clustering

    Science.gov (United States)

    Campbell-White, Justyn; Froebrich, Dirk; Kume, Alfred

    2018-04-01

    We present here our shape analysis method for a sample of 76 Galactic HII regions from MAGPIS 1.4 GHz data. The main goal is to determine whether physical properties and initial conditions of massive star cluster formation is linked to the shape of the regions. We outline a systematic procedure for extracting region shapes and perform hierarchical clustering on the shape data. We identified six groups that categorise HII regions by common morphologies. We confirmed the validity of these groupings by bootstrap re-sampling and the ordinance technique multidimensional scaling. We then investigated associations between physical parameters and the assigned groups. Location is mostly independent of group, with a small preference for regions of similar longitudes to share common morphologies. The shapes are homogeneously distributed across Galactocentric distance and latitude. One group contains regions that are all younger than 0.5 Myr and ionised by low- to intermediate-mass sources. Those in another group are all driven by intermediate- to high-mass sources. One group was distinctly separated from the other five and contained regions at the surface brightness detection limit for the survey. We find that our hierarchical procedure is most sensitive to the spatial sampling resolution used, which is determined for each region from its distance. We discuss how these errors can be further quantified and reduced in future work by utilising synthetic observations from numerical simulations of HII regions. We also outline how this shape analysis has further applications to other diffuse astronomical objects.

  12. Time series clustering analysis of health-promoting behavior

    Science.gov (United States)

    Yang, Chi-Ta; Hung, Yu-Shiang; Deng, Guang-Feng

    2013-10-01

    Health promotion must be emphasized to achieve the World Health Organization goal of health for all. Since the global population is aging rapidly, ComCare elder health-promoting service was developed by the Taiwan Institute for Information Industry in 2011. Based on the Pender health promotion model, ComCare service offers five categories of health-promoting functions to address the everyday needs of seniors: nutrition management, social support, exercise management, health responsibility, stress management. To assess the overall ComCare service and to improve understanding of the health-promoting behavior of elders, this study analyzed health-promoting behavioral data automatically collected by the ComCare monitoring system. In the 30638 session records collected for 249 elders from January, 2012 to March, 2013, behavior patterns were identified by fuzzy c-mean time series clustering algorithm combined with autocorrelation-based representation schemes. The analysis showed that time series data for elder health-promoting behavior can be classified into four different clusters. Each type reveals different health-promoting needs, frequencies, function numbers and behaviors. The data analysis result can assist policymakers, health-care providers, and experts in medicine, public health, nursing and psychology and has been provided to Taiwan National Health Insurance Administration to assess the elder health-promoting behavior.

  13. Validation of tumor protein marker quantification by two independent automated immunofluorescence image analysis platforms

    Science.gov (United States)

    Peck, Amy R; Girondo, Melanie A; Liu, Chengbao; Kovatich, Albert J; Hooke, Jeffrey A; Shriver, Craig D; Hu, Hai; Mitchell, Edith P; Freydin, Boris; Hyslop, Terry; Chervoneva, Inna; Rui, Hallgeir

    2016-01-01

    Protein marker levels in formalin-fixed, paraffin-embedded tissue sections traditionally have been assayed by chromogenic immunohistochemistry and evaluated visually by pathologists. Pathologist scoring of chromogen staining intensity is subjective and generates low-resolution ordinal or nominal data rather than continuous data. Emerging digital pathology platforms now allow quantification of chromogen or fluorescence signals by computer-assisted image analysis, providing continuous immunohistochemistry values. Fluorescence immunohistochemistry offers greater dynamic signal range than chromogen immunohistochemistry, and combined with image analysis holds the promise of enhanced sensitivity and analytic resolution, and consequently more robust quantification. However, commercial fluorescence scanners and image analysis software differ in features and capabilities, and claims of objective quantitative immunohistochemistry are difficult to validate as pathologist scoring is subjective and there is no accepted gold standard. Here we provide the first side-by-side validation of two technologically distinct commercial fluorescence immunohistochemistry analysis platforms. We document highly consistent results by (1) concordance analysis of fluorescence immunohistochemistry values and (2) agreement in outcome predictions both for objective, data-driven cutpoint dichotomization with Kaplan–Meier analyses or employment of continuous marker values to compute receiver-operating curves. The two platforms examined rely on distinct fluorescence immunohistochemistry imaging hardware, microscopy vs line scanning, and functionally distinct image analysis software. Fluorescence immunohistochemistry values for nuclear-localized and tyrosine-phosphorylated Stat5a/b computed by each platform on a cohort of 323 breast cancer cases revealed high concordance after linear calibration, a finding confirmed on an independent 382 case cohort, with concordance correlation coefficients >0

  14. Cluster, adaptation and extroversion : a cognitive and entrepreneurial analysis of the Marche music cluster

    NARCIS (Netherlands)

    Tappi, D.

    2005-01-01

    Over recent decades, clusters like industrial districts have increasingly attracted attention in economic debate. The study of clusters, particularly in the Italian literature, highlights the inadequacy of the mainstream body of explanation to provide a theory of the emergence and transformation

  15. Direct chloroplast sequencing: comparison of sequencing platforms and analysis tools for whole chloroplast barcoding.

    Directory of Open Access Journals (Sweden)

    Marta Brozynska

    Full Text Available Direct sequencing of total plant DNA using next generation sequencing technologies generates a whole chloroplast genome sequence that has the potential to provide a barcode for use in plant and food identification. Advances in DNA sequencing platforms may make this an attractive approach for routine plant identification. The HiSeq (Illumina and Ion Torrent (Life Technology sequencing platforms were used to sequence total DNA from rice to identify polymorphisms in the whole chloroplast genome sequence of a wild rice plant relative to cultivated rice (cv. Nipponbare. Consensus chloroplast sequences were produced by mapping sequence reads to the reference rice chloroplast genome or by de novo assembly and mapping of the resulting contigs to the reference sequence. A total of 122 polymorphisms (SNPs and indels between the wild and cultivated rice chloroplasts were predicted by these different sequencing and analysis methods. Of these, a total of 102 polymorphisms including 90 SNPs were predicted by both platforms. Indels were more variable with different sequencing methods, with almost all discrepancies found in homopolymers. The Ion Torrent platform gave no apparent false SNP but was less reliable for indels. The methods should be suitable for routine barcoding using appropriate combinations of sequencing platform and data analysis.

  16. arrayCGHbase: an analysis platform for comparative genomic hybridization microarrays

    Directory of Open Access Journals (Sweden)

    Moreau Yves

    2005-05-01

    Full Text Available Abstract Background The availability of the human genome sequence as well as the large number of physically accessible oligonucleotides, cDNA, and BAC clones across the entire genome has triggered and accelerated the use of several platforms for analysis of DNA copy number changes, amongst others microarray comparative genomic hybridization (arrayCGH. One of the challenges inherent to this new technology is the management and analysis of large numbers of data points generated in each individual experiment. Results We have developed arrayCGHbase, a comprehensive analysis platform for arrayCGH experiments consisting of a MIAME (Minimal Information About a Microarray Experiment supportive database using MySQL underlying a data mining web tool, to store, analyze, interpret, compare, and visualize arrayCGH results in a uniform and user-friendly format. Following its flexible design, arrayCGHbase is compatible with all existing and forthcoming arrayCGH platforms. Data can be exported in a multitude of formats, including BED files to map copy number information on the genome using the Ensembl or UCSC genome browser. Conclusion ArrayCGHbase is a web based and platform independent arrayCGH data analysis tool, that allows users to access the analysis suite through the internet or a local intranet after installation on a private server. ArrayCGHbase is available at http://medgen.ugent.be/arrayCGHbase/.

  17. Phenotypes Determined by Cluster Analysis in Moderate to Severe Bronchial Asthma.

    Science.gov (United States)

    Youroukova, Vania M; Dimitrova, Denitsa G; Valerieva, Anna D; Lesichkova, Spaska S; Velikova, Tsvetelina V; Ivanova-Todorova, Ekaterina I; Tumangelova-Yuzeir, Kalina D

    2017-06-01

    Bronchial asthma is a heterogeneous disease that includes various subtypes. They may share similar clinical characteristics, but probably have different pathological mechanisms. To identify phenotypes using cluster analysis in moderate to severe bronchial asthma and to compare differences in clinical, physiological, immunological and inflammatory data between the clusters. Forty adult patients with moderate to severe bronchial asthma out of exacerbation were included. All underwent clinical assessment, anthropometric measurements, skin prick testing, standard spirometry and measurement fraction of exhaled nitric oxide. Blood eosinophilic count, serum total IgE and periostin levels were determined. Two-step cluster approach, hierarchical clustering method and k-mean analysis were used for identification of the clusters. We have identified four clusters. Cluster 1 (n=14) - late-onset, non-atopic asthma with impaired lung function, Cluster 2 (n=13) - late-onset, atopic asthma, Cluster 3 (n=6) - late-onset, aspirin sensitivity, eosinophilic asthma, and Cluster 4 (n=7) - early-onset, atopic asthma. Our study is the first in Bulgaria in which cluster analysis is applied to asthmatic patients. We identified four clusters. The variables with greatest force for differentiation in our study were: age of asthma onset, duration of diseases, atopy, smoking, blood eosinophils, nonsteroidal anti-inflammatory drugs hypersensitivity, baseline FEV1/FVC and symptoms severity. Our results support the concept of heterogeneity of bronchial asthma and demonstrate that cluster analysis can be an useful tool for phenotyping of disease and personalized approach to the treatment of patients.

  18. Assessment of genetic divergence in tomato through agglomerative hierarchical clustering and principal component analysis

    International Nuclear Information System (INIS)

    Iqbal, Q.; Saleem, M.Y.; Hameed, A.; Asghar, M.

    2014-01-01

    For the improvement of qualitative and quantitative traits, existence of variability has prime importance in plant breeding. Data on different morphological and reproductive traits of 47 tomato genotypes were analyzed for correlation,agglomerative hierarchical clustering and principal component analysis (PCA) to select genotypes and traits for future breeding program. Correlation analysis revealed significant positive association between yield and yield components like fruit diameter, single fruit weight and number of fruits plant-1. Principal component (PC) analysis depicted first three PCs with Eigen-value higher than 1 contributing 81.72% of total variability for different traits. The PC-I showed positive factor loadings for all the traits except number of fruits plant-1. The contribution of single fruit weight and fruit diameter was highest in PC-1. Cluster analysis grouped all genotypes into five divergent clusters. The genotypes in cluster-II and cluster-V exhibited uniform maturity and higher yield. The D2 statistics confirmed highest distance between cluster- III and cluster-V while maximum similarity was observed in cluster-II and cluster-III. It is therefore suggested that crosses between genotypes of cluster-II and cluster-V with those of cluster-I and cluster-III may exhibit heterosis in F1 for hybrid breeding and for selection of superior genotypes in succeeding generations for cross breeding programme. (author)

  19. Sensitization trajectories in childhood revealed by using a cluster analysis

    DEFF Research Database (Denmark)

    Schoos, Ann-Marie M.; Chawes, Bo L.; Melen, Erik

    2017-01-01

    Prospective Studies on Asthma in Childhood 2000 (COPSAC2000) birth cohort with specific IgE against 13 common food and inhalant allergens at the ages of ½, 1½, 4, and 6 years. An unsupervised cluster analysis for 3-dimensional data (nonnegative sparse parallel factor analysis) was used to extract latent......BACKGROUND: Assessment of sensitization at a single time point during childhood provides limited clinical information. We hypothesized that sensitization develops as specific patterns with respect to age at debut, development over time, and involved allergens and that such patterns might be more...... biologically and clinically relevant. OBJECTIVE: We sought to explore latent patterns of sensitization during the first 6 years of life and investigate whether such patterns associate with the development of asthma, rhinitis, and eczema. METHODS: We investigated 398 children from the at-risk Copenhagen...

  20. Quest for Highly-connected MOF Platforms: Rare-Earth Polynuclear Clusters Versatility Meets Net Topology Needs.

    KAUST Repository

    Alezi, Dalal

    2015-04-07

    Gaining control over the assembly of highly porous rare-earth (RE) based metal-organic frameworks (MOFs) remains challenging. Here we report the latest discoveries on our continuous quest for highly-connected nets. The topological exploration based on the non-compatibility of 12-connected RE polynuclear carboxylate-based cluster, points of extension matching the 12 vertices of the cuboctahedron (cuo), with 3-connected organic ligands led to the discovery of two fascinating and highly-connected minimal edge-transitive nets, pek and aea. The reduced symmetry of the employed triangular tricarboxylate ligand, as compared to the prototype highly symmetrical 1,3,5-benzene(tris)benzoic acid guided the concurrent occurrence of nonanuclear [RE9(μ3-OH)12(μ3-O)2(O2C–)12] and hexanuclear [RE6(OH)8(O2C–)8] carboxylate-based clusters as 12-connected and 8-connected molecular building blocks in the structure of a 3-periodic pek-MOF based on a novel (3,8,12)-c trinodal net. The use of a tricarboxylate ligand with modified angles between carboxylate moieties led to the formation of a second MOF containing solely nonanuclear clusters and exhibiting once more a novel and a highly-connected (3,12,12)-c trinodal net with aea topology. Notably, it is the first time that RE-MOFs with double six-membered ring (d6R) secondary building units are isolated, representing therefore a critical step forward toward the design of novel and highly coordinated materials using the supermolecular building layer approach while considering the d6Rs as building pillars. Lastly, the potential of these new MOFs for gas separation/storage was investigated by performing gas adsorption studies of various probe gas molecules over a wide range of pressures. Noticeably, pek-MOF-1 showed excellent volumetric CO2 and CH4 uptakes at high pressures.

  1. BlockSci: Design and applications of a blockchain analysis platform

    OpenAIRE

    Kalodner, Harry; Goldfeder, Steven; Chator, Alishah; Möser, Malte; Narayanan, Arvind

    2017-01-01

    Analysis of blockchain data is useful for both scientific research and commercial applications. We present BlockSci, an open-source software platform for blockchain analysis. BlockSci is versatile in its support for different blockchains and analysis tasks. It incorporates an in-memory, analytical (rather than transactional) database, making it several hundred times faster than existing tools. We describe BlockSci's design and present four analyses that illustrate its capabilities. This is a ...

  2. Integrating PROOF Analysis in Cloud and Batch Clusters

    International Nuclear Information System (INIS)

    Rodríguez-Marrero, Ana Y; Fernández-del-Castillo, Enol; López García, Álvaro; Marco de Lucas, Jesús; Matorras Weinig, Francisco; González Caballero, Isidro; Cuesta Noriega, Alberto

    2012-01-01

    High Energy Physics (HEP) analysis are becoming more complex and demanding due to the large amount of data collected by the current experiments. The Parallel ROOT Facility (PROOF) provides researchers with an interactive tool to speed up the analysis of huge volumes of data by exploiting parallel processing on both multicore machines and computing clusters. The typical PROOF deployment scenario is a permanent set of cores configured to run the PROOF daemons. However, this approach is incapable of adapting to the dynamic nature of interactive usage. Several initiatives seek to improve the use of computing resources by integrating PROOF with a batch system, such as Proof on Demand (PoD) or PROOF Cluster. These solutions are currently in production at Universidad de Oviedo and IFCA and are positively evaluated by users. Although they are able to adapt to the computing needs of users, they must comply with the specific configuration, OS and software installed at the batch nodes. Furthermore, they share the machines with other workloads, which may cause disruptions in the interactive service for users. These limitations make PROOF a typical use-case for cloud computing. In this work we take profit from Cloud Infrastructure at IFCA in order to provide a dynamic PROOF environment where users can control the software configuration of the machines. The Proof Analysis Framework (PAF) facilitates the development of new analysis and offers a transparent access to PROOF resources. Several performance measurements are presented for the different scenarios (PoD, SGE and Cloud), showing a speed improvement closely correlated with the number of cores used.

  3. Determining wood chip size: image analysis and clustering methods

    Directory of Open Access Journals (Sweden)

    Paolo Febbi

    2013-09-01

    Full Text Available One of the standard methods for the determination of the size distribution of wood chips is the oscillating screen method (EN 15149- 1:2010. Recent literature demonstrated how image analysis could return highly accurate measure of the dimensions defined for each individual particle, and could promote a new method depending on the geometrical shape to determine the chip size in a more accurate way. A sample of wood chips (8 litres was sieved through horizontally oscillating sieves, using five different screen hole diameters (3.15, 8, 16, 45, 63 mm; the wood chips were sorted in decreasing size classes and the mass of all fractions was used to determine the size distribution of the particles. Since the chip shape and size influence the sieving results, Wang’s theory, which concerns the geometric forms, was considered. A cluster analysis on the shape descriptors (Fourier descriptors and size descriptors (area, perimeter, Feret diameters, eccentricity was applied to observe the chips distribution. The UPGMA algorithm was applied on Euclidean distance. The obtained dendrogram shows a group separation according with the original three sieving fractions. A comparison has been made between the traditional sieve and clustering results. This preliminary result shows how the image analysis-based method has a high potential for the characterization of wood chip size distribution and could be further investigated. Moreover, this method could be implemented in an online detection machine for chips size characterization. An improvement of the results is expected by using supervised multivariate methods that utilize known class memberships. The main objective of the future activities will be to shift the analysis from a 2-dimensional method to a 3- dimensional acquisition process.

  4. Organ-on-a-Chip: New Platform for Biological Analysis

    Directory of Open Access Journals (Sweden)

    Fan An

    2015-01-01

    Full Text Available Direct detection and analysis of biomolecules and cells in physiological microenvironment is urgently needed for fast evaluation of biology and pharmacy. The past several years have witnessed remarkable development opportunities in vitro organs and tissues models with multiple functions based on microfluidic devices, termed as “organ-on-a-chip”. Briefly speaking, it is a promising technology in rebuilding physiological functions of tissues and organs, featuring mammalian cell co-culture and artificial microenvironment created by microchannel networks. In this review, we summarized the advances in studies of heart-, vessel-, liver-, neuron-, kidney- and Multi-organs-on-a-chip, and discussed some noteworthy potential on-chip detection schemes.

  5. Parameterized Analysis of 2-DOF Motion Platform Based on ADAMS

    Directory of Open Access Journals (Sweden)

    Hu Hanyuan

    2015-01-01

    Full Text Available According to the functions of parametric modeling and analysis from ADAMS, this thesis was established a parametric simulation model in order to optimize the rated output power of electric cylinders according to the real field environment. First, the variable which could affect sensitivity of the output variables was chosen by the electric cylinder’s elongation which was obtained through loop vector method. Then this thesis tried to get the optimum optimization design parameters through the simulation, and the change of rated output power affected by the change of parameters, meanwhile, made a filter and calibration of parameters which have greater influence on sensibilities. The goal of design could meet the qualification with less work load and faster speed. It is concluded that the change of the location parameters affects the rated output power.

  6. Identifying Two Groups of Entitled Individuals: Cluster Analysis Reveals Emotional Stability and Self-Esteem Distinction.

    Science.gov (United States)

    Crowe, Michael L; LoPilato, Alexander C; Campbell, W Keith; Miller, Joshua D

    2016-12-01

    The present study hypothesized that there exist two distinct groups of entitled individuals: grandiose-entitled, and vulnerable-entitled. Self-report scores of entitlement were collected for 916 individuals using an online platform. Model-based cluster analyses were conducted on the individuals with scores one standard deviation above mean (n = 159) using the five-factor model dimensions as clustering variables. The results support the existence of two groups of entitled individuals categorized as emotionally stable and emotionally vulnerable. The emotionally stable cluster reported emotional stability, high self-esteem, more positive affect, and antisocial behavior. The emotionally vulnerable cluster reported low self-esteem and high levels of neuroticism, disinhibition, conventionality, psychopathy, negative affect, childhood abuse, intrusive parenting, and attachment difficulties. Compared to the control group, both clusters reported being more antagonistic, extraverted, Machiavellian, and narcissistic. These results suggest important differences are missed when simply examining the linear relationships between entitlement and various aspects of its nomological network.

  7. Cluster analysis of received constellations for optical performance monitoring

    NARCIS (Netherlands)

    van Weerdenburg, J.J.A.; van Uden, R.; Sillekens, E.; de Waardt, H.; Koonen, A.M.J.; Okonkwo, C.

    2016-01-01

    Performance monitoring based on centroid clustering to investigate constellation generation offsets. The tool allows flexibility in constellation generation tolerances by forwarding centroids to the demapper. The relation of fibre nonlinearities and singular value decomposition of intra-cluster

  8. Multivariate Gradient Analysis for Evaluating and Visualizing a Learning System Platform for Computer Programming

    Science.gov (United States)

    Mather, Richard

    2015-01-01

    This paper explores the application of canonical gradient analysis to evaluate and visualize student performance and acceptance of a learning system platform. The subject of evaluation is a first year BSc module for computer programming. This uses "Ceebot," an animated and immersive game-like development environment. Multivariate…

  9. Modelling and analysis of offshore energy systems on North Sea oil and gas platforms

    DEFF Research Database (Denmark)

    Nguyen, Tuong-Van; Elmegaard, Brian; Pierobon, Leonardo

    2012-01-01

    export, and power generation. In this paper, a generic model of a North Sea oil and gas platform is described and the most thermodynamically inefficient processes are identified by performing an exergy analysis. Models and simulations are built and run with the tools Aspen Plus R, DNA and Aspen HYSYS R...

  10. A Cross-Platform Infrastructure for Scalable Runtime Application Performance Analysis

    Energy Technology Data Exchange (ETDEWEB)

    Jack Dongarra; Shirley Moore; Bart Miller, Jeffrey Hollingsworth; Tracy Rafferty

    2005-03-15

    The purpose of this project was to build an extensible cross-platform infrastructure to facilitate the development of accurate and portable performance analysis tools for current and future high performance computing (HPC) architectures. Major accomplishments include tools and techniques for multidimensional performance analysis, as well as improved support for dynamic performance monitoring of multithreaded and multiprocess applications. Previous performance tool development has been limited by the burden of having to re-write a platform-dependent low-level substrate for each architecture/operating system pair in order to obtain the necessary performance data from the system. Manual interpretation of performance data is not scalable for large-scale long-running applications. The infrastructure developed by this project provides a foundation for building portable and scalable performance analysis tools, with the end goal being to provide application developers with the information they need to analyze, understand, and tune the performance of terascale applications on HPC architectures. The backend portion of the infrastructure provides runtime instrumentation capability and access to hardware performance counters, with thread-safety for shared memory environments and a communication substrate to support instrumentation of multiprocess and distributed programs. Front end interfaces provides tool developers with a well-defined, platform-independent set of calls for requesting performance data. End-user tools have been developed that demonstrate runtime data collection, on-line and off-line analysis of performance data, and multidimensional performance analysis. The infrastructure is based on two underlying performance instrumentation technologies. These technologies are the PAPI cross-platform library interface to hardware performance counters and the cross-platform Dyninst library interface for runtime modification of executable images. The Paradyn and KOJAK

  11. The composite sequential clustering technique for analysis of multispectral scanner data

    Science.gov (United States)

    Su, M. Y.

    1972-01-01

    The clustering technique consists of two parts: (1) a sequential statistical clustering which is essentially a sequential variance analysis, and (2) a generalized K-means clustering. In this composite clustering technique, the output of (1) is a set of initial clusters which are input to (2) for further improvement by an iterative scheme. This unsupervised composite technique was employed for automatic classification of two sets of remote multispectral earth resource observations. The classification accuracy by the unsupervised technique is found to be comparable to that by traditional supervised maximum likelihood classification techniques. The mathematical algorithms for the composite sequential clustering program and a detailed computer program description with job setup are given.

  12. Genetic Diversity and Relationships of Neolamarckia cadamba (Roxb. Bosser progenies through cluster analysis

    Directory of Open Access Journals (Sweden)

    M. Preethi Shree

    2018-04-01

    Full Text Available Genetic diversity analysis was conducted for biometric attributes in 20 progenies of Neolamarckia cadamba. The application of D2 clustering technique in Neolamarckia cadamba genetic resources resolved the 20 progenies into five clusters. The maximum intra cluster distance was shown by the cluster II. The maximum inter cluster distance was recorded between cluster III and V which indicated the presence of wider genetic distance between Neolamarckia cadamba progenies. Among the growth attributes, volume (36.84 % contributed maximum towards genetic divergence followed by bole height, basal diameter, tree height, number of branches in Neolamarckia cadamba progenies.

  13. BioFoV - An open platform for forensic video analysis and biometric data extraction

    DEFF Research Database (Denmark)

    Almeida, Miguel; Correia, Paulo Lobato; Larsen, Peter Kastmand

    2016-01-01

    to tailor-made software, based on state of art knowledge in fields such as soft biometrics, gait recognition, photogrammetry, etc. This paper proposes an open and extensible platform, BioFoV (Biometric Forensic Video tool), for forensic video analysis and biometric data extraction, aiming to host some...... of the developments that researchers come up with for solving specific problems, but that are often not shared with the community. BioFoV includes a simple to use Graphical User Interface (GUI), is implemented with open software that can run in multiple software platforms, and its implementation is publicly available....

  14. SVIP-N 1.0: An integrated visualization platform for neutronics analysis

    International Nuclear Information System (INIS)

    Luo Yuetong; Long Pengcheng; Wu Guoyong; Zeng Qin; Hu Liqin; Zou Jun

    2010-01-01

    Post-processing is an important part of neutronics analysis, and SVIP-N 1.0 (scientific visualization integrated platform for neutronics analysis) is designed to ease post-processing of neutronics analysis through visualization technologies. Main capabilities of SVIP-N 1.0 include: (1) ability of manage neutronics analysis result; (2) ability to preprocess neutronics analysis result; (3) ability to visualization neutronics analysis result data in different way. The paper describes the system architecture and main features of SVIP-N, some advanced visualization used in SVIP-N 1.0 and some preliminary applications, such as ITER.

  15. QTL global meta-analysis: are trait determining genes clustered?

    Directory of Open Access Journals (Sweden)

    Adelson David L

    2009-04-01

    Full Text Available Abstract Background A key open question in biology is if genes are physically clustered with respect to their known functions or phenotypic effects. This is of particular interest for Quantitative Trait Loci (QTL where a QTL region could contain a number of genes that contribute to the trait being measured. Results We observed a significant increase in gene density within QTL regions compared to non-QTL regions and/or the entire bovine genome. By grouping QTL from the Bovine QTL Viewer database into 8 categories of non-redundant regions, we have been able to analyze gene density and gene function distribution, based on Gene Ontology (GO with relation to their location within QTL regions, outside of QTL regions and across the entire bovine genome. We identified a number of GO terms that were significantly over represented within particular QTL categories. Furthermore, select GO terms expected to be associated with the QTL category based on common biological knowledge have also proved to be significantly over represented in QTL regions. Conclusion Our analysis provides evidence of over represented GO terms in QTL regions. This increased GO term density indicates possible clustering of gene functions within QTL regions of the bovine genome. Genes with similar functions may be grouped in specific locales and could be contributing to QTL traits. Moreover, we have identified over-represented GO terminology that from a biological standpoint, makes sense with respect to QTL category type.

  16. Cluster decay analysis and related structure effects of fissionable ...

    Indian Academy of Sciences (India)

    2015-08-01

    Aug 1, 2015 ... Collective clusterization approach of dynamical cluster decay model (DCM) has been ... fusion–fission process resulting in the emission of symmetric and/or ... represents the relative separation distance between two fragments or clusters ... decay constant λ or decay half-life T1/2 is defined as λ = (ln 2/T1/2) ...

  17. Maximum-entropy clustering algorithm and its global convergence analysis

    Institute of Scientific and Technical Information of China (English)

    2001-01-01

    Constructing a batch of differentiable entropy functions touniformly approximate an objective function by means of the maximum-entropy principle, a new clustering algorithm, called maximum-entropy clustering algorithm, is proposed based on optimization theory. This algorithm is a soft generalization of the hard C-means algorithm and possesses global convergence. Its relations with other clustering algorithms are discussed.

  18. Hybrid Tracking Algorithm Improvements and Cluster Analysis Methods.

    Science.gov (United States)

    1982-02-26

    UPGMA ), and Ward’s method. Ling’s papers describe a (k,r) clustering method. Each of these methods have individual characteristics which make them...Reference 7), UPGMA is probably the most frequently used clustering strategy. UPGMA tries to group new points into an existing cluster by using an

  19. Use of the NetBeans Platform for NASA Robotic Conjunction Assessment Risk Analysis

    Science.gov (United States)

    Sabey, Nickolas J.

    2014-01-01

    The latest Java and JavaFX technologies are very attractive software platforms for customers involved in space mission operations such as those of NASA and the US Air Force. For NASA Robotic Conjunction Assessment Risk Analysis (CARA), the NetBeans platform provided an environment in which scalable software solutions could be developed quickly and efficiently. Both Java 8 and the NetBeans platform are in the process of simplifying CARA development in secure environments by providing a significant amount of capability in a single accredited package, where accreditation alone can account for 6-8 months for each library or software application. Capabilities either in use or being investigated by CARA include: 2D and 3D displays with JavaFX, parallelization with the new Streams API, and scalability through the NetBeans plugin architecture.

  20. CHOOSING A HEALTH INSTITUTION WITH MULTIPLE CORRESPONDENCE ANALYSIS AND CLUSTER ANALYSIS IN A POPULATION BASED STUDY

    Directory of Open Access Journals (Sweden)

    ASLI SUNER

    2013-06-01

    Full Text Available Multiple correspondence analysis is a method making easy to interpret the categorical variables given in contingency tables, showing the similarities, associations as well as divergences among these variables via graphics on a lower dimensional space. Clustering methods are helped to classify the grouped data according to their similarities and to get useful summarized data from them. In this study, interpretations of multiple correspondence analysis are supported by cluster analysis; factors affecting referred health institute such as age, disease group and health insurance are examined and it is aimed to compare results of the methods.

  1. MMPI profiles of males accused of severe crimes: a cluster analysis

    NARCIS (Netherlands)

    Spaans, M.; Barendregt, M.; Muller, E.; Beurs, E. de; Nijman, H.L.I.; Rinne, T.

    2009-01-01

    In studies attempting to classify criminal offenders by cluster analysis of Minnesota Multiphasic Personality Inventory-2 (MMPI-2) data, the number of clusters found varied between 10 (the Megargee System) and two (one cluster indicating no psychopathology and one exhibiting serious

  2. Cluster analysis of rural, urban, and curbside atmospheric particle size data.

    Science.gov (United States)

    Beddows, David C S; Dall'Osto, Manuel; Harrison, Roy M

    2009-07-01

    Particle size is a key determinant of the hazard posed by airborne particles. Continuous multivariate particle size data have been collected using aerosol particle size spectrometers sited at four locations within the UK: Harwell (Oxfordshire); Regents Park (London); British Telecom Tower (London); and Marylebone Road (London). These data have been analyzed using k-means cluster analysis, deduced to be the preferred cluster analysis technique, selected from an option of four partitional cluster packages, namelythe following: Fuzzy; k-means; k-median; and Model-Based clustering. Using cluster validation indices k-means clustering was shown to produce clusters with the smallest size, furthest separation, and importantly the highest degree of similarity between the elements within each partition. Using k-means clustering, the complexity of the data set is reduced allowing characterization of the data according to the temporal and spatial trends of the clusters. At Harwell, the rural background measurement site, the cluster analysis showed that the spectra may be differentiated by their modal-diameters and average temporal trends showing either high counts during the day-time or night-time hours. Likewise for the urban sites, the cluster analysis differentiated the spectra into a small number of size distributions according their modal-diameter, the location of the measurement site, and time of day. The responsible aerosol emission, formation, and dynamic processes can be inferred according to the cluster characteristics and correlation to concurrently measured meteorological, gas phase, and particle phase measurements.

  3. Cluster analysis of spontaneous preterm birth phenotypes identifies potential associations among preterm birth mechanisms.

    Science.gov (United States)

    Esplin, M Sean; Manuck, Tracy A; Varner, Michael W; Christensen, Bryce; Biggio, Joseph; Bukowski, Radek; Parry, Samuel; Zhang, Heping; Huang, Hao; Andrews, William; Saade, George; Sadovsky, Yoel; Reddy, Uma M; Ilekis, John

    2015-09-01

    We sought to use an innovative tool that is based on common biologic pathways to identify specific phenotypes among women with spontaneous preterm birth (SPTB) to enhance investigators' ability to identify and to highlight common mechanisms and underlying genetic factors that are responsible for SPTB. We performed a secondary analysis of a prospective case-control multicenter study of SPTB. All cases delivered a preterm singleton at SPTB ≤34.0 weeks' gestation. Each woman was assessed for the presence of underlying SPTB causes. A hierarchic cluster analysis was used to identify groups of women with homogeneous phenotypic profiles. One of the phenotypic clusters was selected for candidate gene association analysis with the use of VEGAS software. One thousand twenty-eight women with SPTB were assigned phenotypes. Hierarchic clustering of the phenotypes revealed 5 major clusters. Cluster 1 (n = 445) was characterized by maternal stress; cluster 2 (n = 294) was characterized by premature membrane rupture; cluster 3 (n = 120) was characterized by familial factors, and cluster 4 (n = 63) was characterized by maternal comorbidities. Cluster 5 (n = 106) was multifactorial and characterized by infection (INF), decidual hemorrhage (DH), and placental dysfunction (PD). These 3 phenotypes were correlated highly by χ(2) analysis (PD and DH, P cluster 3 of SPTB. We identified 5 major clusters of SPTB based on a phenotype tool and hierarch clustering. There was significant correlation between several of the phenotypes. The INS gene was associated with familial factors that were underlying SPTB. Copyright © 2015 Elsevier Inc. All rights reserved.

  4. A Multi-Science Data Analysis Platform and the GeneROOT Use Case

    CERN Multimedia

    CERN. Geneva; Rademakers, Fons

    2017-01-01

    This talk will cover two areas of current research in the context of knowledge sharing between CERN openlab and the life science communities. The first area covers the development and prototyping of a multi-science data analysis platform build up around CERN developed technologies like, Zenodo, REANA and CVMFS. When finished this platform will support a complete data analysis life-cycle from data discovery, to data access, to data processing to end-user data analysis. The second area covers a specific use case, where HEP specific software like ROOT is used to store and process genomics data sequences. There are a number of handcrafted genomics data formats being used, like FASTQ, SAM, BAM, CRAM, etc. They range from pure ASCII to compressed binary formats. We will compare the features of these formats with the generic capabilities of ROOT’s TTree containers. Also we will show performance numbers of typical analysis scenarios.

  5. Babelomics: an integrative platform for the analysis of transcriptomics, proteomics and genomic data with advanced functional profiling

    Science.gov (United States)

    Medina, Ignacio; Carbonell, José; Pulido, Luis; Madeira, Sara C.; Goetz, Stefan; Conesa, Ana; Tárraga, Joaquín; Pascual-Montano, Alberto; Nogales-Cadenas, Ruben; Santoyo, Javier; García, Francisco; Marbà, Martina; Montaner, David; Dopazo, Joaquín

    2010-01-01

    Babelomics is a response to the growing necessity of integrating and analyzing different types of genomic data in an environment that allows an easy functional interpretation of the results. Babelomics includes a complete suite of methods for the analysis of gene expression data that include normalization (covering most commercial platforms), pre-processing, differential gene expression (case-controls, multiclass, survival or continuous values), predictors, clustering; large-scale genotyping assays (case controls and TDTs, and allows population stratification analysis and correction). All these genomic data analysis facilities are integrated and connected to multiple options for the functional interpretation of the experiments. Different methods of functional enrichment or gene set enrichment can be used to understand the functional basis of the experiment analyzed. Many sources of biological information, which include functional (GO, KEGG, Biocarta, Reactome, etc.), regulatory (Transfac, Jaspar, ORegAnno, miRNAs, etc.), text-mining or protein–protein interaction modules can be used for this purpose. Finally a tool for the de novo functional annotation of sequences has been included in the system. This provides support for the functional analysis of non-model species. Mirrors of Babelomics or command line execution of their individual components are now possible. Babelomics is available at http://www.babelomics.org. PMID:20478823

  6. The relationship between supplier networks and industrial clusters: an analysis based on the cluster mapping method

    Directory of Open Access Journals (Sweden)

    Ichiro IWASAKI

    2010-06-01

    Full Text Available Michael Porter’s concept of competitive advantages emphasizes the importance of regional cooperation of various actors in order to gain competitiveness on globalized markets. Foreign investors may play an important role in forming such cooperation networks. Their local suppliers tend to concentrate regionally. They can form, together with local institutions of education, research, financial and other services, development agencies, the nucleus of cooperative clusters. This paper deals with the relationship between supplier networks and clusters. Two main issues are discussed in more detail: the interest of multinational companies in entering regional clusters and the spillover effects that may stem from their participation. After the discussion on the theoretical background, the paper introduces a relatively new analytical method: “cluster mapping” - a method that can spot regional hot spots of specific economic activities with cluster building potential. Experience with the method was gathered in the US and in the European Union. After the discussion on the existing empirical evidence, the authors introduce their own cluster mapping results, which they obtained by using a refined version of the original methodology.

  7. Higgs Pair Production: Choosing Benchmarks With Cluster Analysis

    CERN Document Server

    Carvalho, Alexandra; Dorigo, Tommaso; Goertz, Florian; Gottardo, Carlo A.; Tosi, Mia

    2016-01-01

    New physics theories often depend on a large number of free parameters. The precise values of those parameters in some cases drastically affect the resulting phenomenology of fundamental physics processes, while in others finite variations can leave it basically invariant at the level of detail experimentally accessible. When designing a strategy for the analysis of experimental data in the search for a signal predicted by a new physics model, it appears advantageous to categorize the parameter space describing the model according to the corresponding kinematical features of the final state. A multi-dimensional test statistic can be used to gauge the degree of similarity in the kinematics of different models; a clustering algorithm using that metric may then allow the division of the space into homogeneous regions, each of which can be successfully represented by a benchmark point. Searches targeting those benchmark points are then guaranteed to be sensitive to a large area of the parameter space. In this doc...

  8. Cross-platform analysis of cancer microarray data improves gene expression based classification of phenotypes

    Directory of Open Access Journals (Sweden)

    Eils Roland

    2005-11-01

    Full Text Available Abstract Background The extensive use of DNA microarray technology in the characterization of the cell transcriptome is leading to an ever increasing amount of microarray data from cancer studies. Although similar questions for the same type of cancer are addressed in these different studies, a comparative analysis of their results is hampered by the use of heterogeneous microarray platforms and analysis methods. Results In contrast to a meta-analysis approach where results of different studies are combined on an interpretative level, we investigate here how to directly integrate raw microarray data from different studies for the purpose of supervised classification analysis. We use median rank scores and quantile discretization to derive numerically comparable measures of gene expression from different platforms. These transformed data are then used for training of classifiers based on support vector machines. We apply this approach to six publicly available cancer microarray gene expression data sets, which consist of three pairs of studies, each examining the same type of cancer, i.e. breast cancer, prostate cancer or acute myeloid leukemia. For each pair, one study was performed by means of cDNA microarrays and the other by means of oligonucleotide microarrays. In each pair, high classification accuracies (> 85% were achieved with training and testing on data instances randomly chosen from both data sets in a cross-validation analysis. To exemplify the potential of this cross-platform classification analysis, we use two leukemia microarray data sets to show that important genes with regard to the biology of leukemia are selected in an integrated analysis, which are missed in either single-set analysis. Conclusion Cross-platform classification of multiple cancer microarray data sets yields discriminative gene expression signatures that are found and validated on a large number of microarray samples, generated by different laboratories and

  9. Performance analysis of clustering techniques over microarray data: A case study

    Science.gov (United States)

    Dash, Rasmita; Misra, Bijan Bihari

    2018-03-01

    Handling big data is one of the major issues in the field of statistical data analysis. In such investigation cluster analysis plays a vital role to deal with the large scale data. There are many clustering techniques with different cluster analysis approach. But which approach suits a particular dataset is difficult to predict. To deal with this problem a grading approach is introduced over many clustering techniques to identify a stable technique. But the grading approach depends on the characteristic of dataset as well as on the validity indices. So a two stage grading approach is implemented. In this study the grading approach is implemented over five clustering techniques like hybrid swarm based clustering (HSC), k-means, partitioning around medoids (PAM), vector quantization (VQ) and agglomerative nesting (AGNES). The experimentation is conducted over five microarray datasets with seven validity indices. The finding of grading approach that a cluster technique is significant is also established by Nemenyi post-hoc hypothetical test.

  10. Advances in the development of the Mexican platform for analysis and design of nuclear reactors: AZTLAN Platform; Avances en el desarrollo de la plataforma mexicana para analisis y diseno de reactores nucleares: AZTLAN Platform

    Energy Technology Data Exchange (ETDEWEB)

    Gomez T, A. M.; Puente E, F. [ININ, Carretera Mexico-Toluca s/n, 52750 Ocoyoacac, Estado de Mexico (Mexico); Del Valle G, E. [IPN, Escuela Superior de Fisica y Matematicas, Av. IPN s/n, 07738 Ciudad de Mexico (Mexico); Francois L, J. L. [UNAM, Facultad de Ingenieria, Departamento de Sistemas Energeticos, Paseo Cuauhnahuac 8532, Col. Progreso, 62550 Jiutepec, Morelos (Mexico); Espinosa P, G., E-mail: armando.gomez@inin.gob.mx [Universidad Autonoma Metropolitana, Unidad Iztapalapa, Av. San Rafael Atlixco 186, Col. Vicentina, 09340 Ciudad de Mexico (Mexico)

    2017-09-15

    The AZTLAN platform project: development of a Mexican platform for the analysis and design of nuclear reactors, financed by the SENER-CONACYT Energy Sustain ability Fund, was approved in early 2014 and formally began at the end of that year. It is a national project led by the Instituto Nacional de Investigaciones Nucleares (ININ) and with the collaboration of Instituto Politecnico Nacional (IPN), the Universidad Autonoma Metropolitana (UAM) and Universidad Nacional Autonoma de Mexico (UNAM) as part of the development team and with the participation of the Laguna Verde Nuclear Power Plant, the National Commission of Nuclear Safety and Safeguards, the Ministry of Energy and the Karlsruhe Institute of Technology (Kit, Germany) as part of the user group. The general objective of the project is to modernize, improve and integrate the neutronic, thermo-hydraulic and thermo-mechanical codes, developed in Mexican institutions, in an integrated platform, developed and maintained by Mexican experts for the benefit of Mexican institutions. Two years into the process, important steps have been taken that have consolidated the platform. The main results of these first two years have been presented in different national and international forums. In this congress, some of the most recent results that have been implemented in the platform codes are shown in more detail. The current status of the platform from a more executive view point is summarized in this paper. (Author)

  11. Depth data research of GIS based on clustering analysis algorithm

    Science.gov (United States)

    Xiong, Yan; Xu, Wenli

    2018-03-01

    The data of GIS have spatial distribution. Geographic data has both spatial characteristics and attribute characteristics, and also changes with time. Therefore, the amount of data is very large. Nowadays, many industries and departments in the society are using GIS. However, without proper data analysis and mining scheme, GIS will not exert its maximum effectiveness and will waste a lot of data. In this paper, we use the geographic information demand of a national security department as the experimental object, combining the characteristics of GIS data, taking into account the characteristics of time, space, attributes and so on, and using cluster analysis algorithm. We further study the mining scheme for depth data, and get the algorithm model. This algorithm can automatically classify sample data, and then carry out exploratory analysis. The research shows that the algorithm model and the information mining scheme can quickly find hidden depth information from the surface data of GIS, thus improving the efficiency of the security department. This algorithm can also be extended to other fields.

  12. Characterizing Heterogeneity within Head and Neck Lesions Using Cluster Analysis of Multi-Parametric MRI Data.

    Directory of Open Access Journals (Sweden)

    Marco Borri

    Full Text Available To describe a methodology, based on cluster analysis, to partition multi-parametric functional imaging data into groups (or clusters of similar functional characteristics, with the aim of characterizing functional heterogeneity within head and neck tumour volumes. To evaluate the performance of the proposed approach on a set of longitudinal MRI data, analysing the evolution of the obtained sub-sets with treatment.The cluster analysis workflow was applied to a combination of dynamic contrast-enhanced and diffusion-weighted imaging MRI data from a cohort of squamous cell carcinoma of the head and neck patients. Cumulative distributions of voxels, containing pre and post-treatment data and including both primary tumours and lymph nodes, were partitioned into k clusters (k = 2, 3 or 4. Principal component analysis and cluster validation were employed to investigate data composition and to independently determine the optimal number of clusters. The evolution of the resulting sub-regions with induction chemotherapy treatment was assessed relative to the number of clusters.The clustering algorithm was able to separate clusters which significantly reduced in voxel number following induction chemotherapy from clusters with a non-significant reduction. Partitioning with the optimal number of clusters (k = 4, determined with cluster validation, produced the best separation between reducing and non-reducing clusters.The proposed methodology was able to identify tumour sub-regions with distinct functional properties, independently separating clusters which were affected differently by treatment. This work demonstrates that unsupervised cluster analysis, with no prior knowledge of the data, can be employed to provide a multi-parametric characterization of functional heterogeneity within tumour volumes.

  13. Analysis of the dynamical cluster approximation for the Hubbard model

    OpenAIRE

    Aryanpour, K.; Hettler, M. H.; Jarrell, M.

    2002-01-01

    We examine a central approximation of the recently introduced Dynamical Cluster Approximation (DCA) by example of the Hubbard model. By both analytical and numerical means we study non-compact and compact contributions to the thermodynamic potential. We show that approximating non-compact diagrams by their cluster analogs results in a larger systematic error as compared to the compact diagrams. Consequently, only the compact contributions should be taken from the cluster, whereas non-compact ...

  14. X-Ray Morphological Analysis of the Planck ESZ Clusters

    Energy Technology Data Exchange (ETDEWEB)

    Lovisari, Lorenzo; Forman, William R.; Jones, Christine; Andrade-Santos, Felipe; Randall, Scott; Kraft, Ralph [Harvard-Smithsonian Center for Astrophysics, 60 Garden Street, Cambridge, MA 02138 (United States); Ettori, Stefano [INAF, Osservatorio Astronomico di Bologna, via Ranzani 1, I-40127 Bologna (Italy); Arnaud, Monique; Démoclès, Jessica; Pratt, Gabriel W. [Laboratoire AIM, IRFU/Service d’Astrophysique—CEA/DRF—CNRS—Université Paris Diderot, Bât. 709, CEA-Saclay, F-91191 Gif-sur-Yvette Cedex (France)

    2017-09-01

    X-ray observations show that galaxy clusters have a very large range of morphologies. The most disturbed systems, which are good to study how clusters form and grow and to test physical models, may potentially complicate cosmological studies because the cluster mass determination becomes more challenging. Thus, we need to understand the cluster properties of our samples to reduce possible biases. This is complicated by the fact that different experiments may detect different cluster populations. For example, Sunyaev–Zeldovich (SZ) selected cluster samples have been found to include a greater fraction of disturbed systems than X-ray selected samples. In this paper we determine eight morphological parameters for the Planck Early Sunyaev–Zeldovich (ESZ) objects observed with XMM-Newton . We found that two parameters, concentration and centroid shift, are the best to distinguish between relaxed and disturbed systems. For each parameter we provide the values that allow selecting the most relaxed or most disturbed objects from a sample. We found that there is no mass dependence on the cluster dynamical state. By comparing our results with what was obtained with REXCESS clusters, we also confirm that the ESZ clusters indeed tend to be more disturbed, as found by previous studies.

  15. Identification and validation of asthma phenotypes in Chinese population using cluster analysis.

    Science.gov (United States)

    Wang, Lei; Liang, Rui; Zhou, Ting; Zheng, Jing; Liang, Bing Miao; Zhang, Hong Ping; Luo, Feng Ming; Gibson, Peter G; Wang, Gang

    2017-10-01

    Asthma is a heterogeneous airway disease, so it is crucial to clearly identify clinical phenotypes to achieve better asthma management. To identify and prospectively validate asthma clusters in a Chinese population. Two hundred eighty-four patients were consecutively recruited and 18 sociodemographic and clinical variables were collected. Hierarchical cluster analysis was performed by the Ward method followed by k-means cluster analysis. Then, a prospective 12-month cohort study was used to validate the identified clusters. Five clusters were successfully identified. Clusters 1 (n = 71) and 3 (n = 81) were mild asthma phenotypes with slight airway obstruction and low exacerbation risk, but with a sex differential. Cluster 2 (n = 65) described an "allergic" phenotype, cluster 4 (n = 33) featured a "fixed airflow limitation" phenotype with smoking, and cluster 5 (n = 34) was a "low socioeconomic status" phenotype. Patients in clusters 2, 4, and 5 had distinctly lower socioeconomic status and more psychological symptoms. Cluster 2 had a significantly increased risk of exacerbations (risk ratio [RR] 1.13, 95% confidence interval [CI] 1.03-1.25), unplanned visits for asthma (RR 1.98, 95% CI 1.07-3.66), and emergency visits for asthma (RR 7.17, 95% CI 1.26-40.80). Cluster 4 had an increased risk of unplanned visits (RR 2.22, 95% CI 1.02-4.81), and cluster 5 had increased emergency visits (RR 12.72, 95% CI 1.95-69.78). Kaplan-Meier analysis confirmed that cluster grouping was predictive of time to the first asthma exacerbation, unplanned visit, emergency visit, and hospital admission (P clusters as "allergic asthma," "fixed airflow limitation," and "low socioeconomic status" phenotypes that are at high risk of severe asthma exacerbations and that have management implications for clinical practice in developing countries. Copyright © 2017 American College of Allergy, Asthma & Immunology. Published by Elsevier Inc. All rights reserved.

  16. Genome-scale analysis of positional clustering of mouse testis-specific genes

    Directory of Open Access Journals (Sweden)

    Lee Bernett TK

    2005-01-01

    Full Text Available Abstract Background Genes are not randomly distributed on a chromosome as they were thought even after removal of tandem repeats. The positional clustering of co-expressed genes is known in prokaryotes and recently reported in several eukaryotic organisms such as Caenorhabditis elegans, Drosophila melanogaster, and Homo sapiens. In order to further investigate the mode of tissue-specific gene clustering in higher eukaryotes, we have performed a genome-scale analysis of positional clustering of the mouse testis-specific genes. Results Our computational analysis shows that a large proportion of testis-specific genes are clustered in groups of 2 to 5 genes in the mouse genome. The number of clusters is much higher than expected by chance even after removal of tandem repeats. Conclusion Our result suggests that testis-specific genes tend to cluster on the mouse chromosomes. This provides another piece of evidence for the hypothesis that clusters of tissue-specific genes do exist.

  17. ANALYSIS OF DEVELOPING BATIK INDUSTRY CLUSTER IN BAKARAN VILLAGE CENTRAL JAVA PROVINCE

    Directory of Open Access Journals (Sweden)

    Hermanto Hermanto

    2017-06-01

    Full Text Available SMEs grow in a cluster in a certain geographical area. The entrepreneurs grow and thrive through the business cluster. Central Java Province has a lot of business clusters in improving the regional economy, one of which is batik industry cluster. Pati Regency is one of regencies / city in Central Java that has the lowest turnover. Batik industy cluster in Pati develops quite well, which can be seen from the increasing number of batik industry incorporated in the cluster. This research examines the strategy of developing the batik industry cluster in Pati Regency. The purpose of this research is to determine the proper strategy for developing the batik industry clusters in Pati. The method of research is quantitative. The analysis tool of this research is the Strengths, Weakness, Opportunity, Threats (SWOT analysis. The result of SWOT analysis in this research shows that the proper strategy for developing the batik industry cluster in Pati is optimizing the management of batik business cluster in Bakaran Village; the local government provides information of the facility of business capital loans; the utilization of labors from Bakaran Village while improving the quality of labors by training, and marketing the Bakaran batik to the broader markets while maintaining the quality of batik. Advice that can be given from this research is that the parties who have a role in batik industry cluster development in Bakaran Village, Pati Regency, such as the Local Government.

  18. Analysis of genetic association using hierarchical clustering and cluster validation indices.

    Science.gov (United States)

    Pagnuco, Inti A; Pastore, Juan I; Abras, Guillermo; Brun, Marcel; Ballarin, Virginia L

    2017-10-01

    It is usually assumed that co-expressed genes suggest co-regulation in the underlying regulatory network. Determining sets of co-expressed genes is an important task, based on some criteria of similarity. This task is usually performed by clustering algorithms, where the genes are clustered into meaningful groups based on their expression values in a set of experiment. In this work, we propose a method to find sets of co-expressed genes, based on cluster validation indices as a measure of similarity for individual gene groups, and a combination of variants of hierarchical clustering to generate the candidate groups. We evaluated its ability to retrieve significant sets on simulated correlated and real genomics data, where the performance is measured based on its detection ability of co-regulated sets against a full search. Additionally, we analyzed the quality of the best ranked groups using an online bioinformatics tool that provides network information for the selected genes. Copyright © 2017 Elsevier Inc. All rights reserved.

  19. Hybrid Microfluidic Platform for Multifactorial Analysis Based on Electrical Impedance, Refractometry, Optical Absorption and Fluorescence

    Directory of Open Access Journals (Sweden)

    Fábio M. Pereira

    2016-10-01

    Full Text Available This paper describes the development of a novel microfluidic platform for multifactorial analysis integrating four label-free detection methods: electrical impedance, refractometry, optical absorption and fluorescence. We present the rationale for the design and the details of the microfabrication of this multifactorial hybrid microfluidic chip. The structure of the platform consists of a three-dimensionally patterned polydimethylsiloxane top part attached to a bottom SU-8 epoxy-based negative photoresist part, where microelectrodes and optical fibers are incorporated to enable impedance and optical analysis. As a proof of concept, the chip functions have been tested and explored, enabling a diversity of applications: (i impedance-based identification of the size of micro beads, as well as counting and distinguishing of erythrocytes by their volume or membrane properties; (ii simultaneous determination of the refractive index and optical absorption properties of solutions; and (iii fluorescence-based bead counting.

  20. Multi-platform ’Omics Analysis of Human Ebola Virus Disease Pathogenesis

    Energy Technology Data Exchange (ETDEWEB)

    Eisfeld, Amie J.; Halfmann, Peter J.; Wendler, Jason P.; Kyle, Jennifer E.; Burnum-Johnson, Kristin E.; Peralta, Zuleyma; Maemura, Tadashi; Walters, Kevin B.; Watanabe, Tokiko; Fukuyama, Satoshi; Yamashita, Makoto; Jacobs, Jon M.; Kim, Young-Mo; Casey, Cameron P.; Stratton, Kelly G.; Webb-Robertson, Bobbie-Jo M.; Gritsenko, Marina A.; Monroe, Matthew E.; Weitz, Karl K.; Shukla, Anil K.; Tian, Mingyuan; Neumann, Gabriele; Reed, Jennifer L.; van Bakel, Harm; Metz, Thomas O.; Smith, Richard D.; Waters, Katrina M.; N' jai, Alhaji; Sahr, Foday; Kawaoka, Yoshihiro

    2017-12-01

    The pathogenesis of human Ebola virus disease (EVD) is complex. EVD is characterized by high levels of virus replication and dissemination, dysregulated immune responses, extensive virus- and host-mediated tissue damage, and disordered coagulation. To clarify how host responses contribute to EVD pathophysiology, we performed multi-platform ’omics analysis of peripheral blood mononuclear cells and plasma from EVD patients. Our results indicate that EVD molecular signatures overlap with those of sepsis, imply that pancreatic enzymes contribute to tissue damage in fatal EVD, and suggest that Ebola virus infection may induce aberrant neutrophils whose activity could explain hallmarks of fatal EVD. Moreover, integrated biomarker prediction identified putative biomarkers from different data platforms that differentiated survivors and fatalities early after infection. This work reveals insight into EVD pathogenesis, suggests an effective approach for biomarker identification, and provides an important community resource for further analysis of human EVD severity.

  1. Cluster analysis of HZE particle tracks as applied to space radiobiology problems

    International Nuclear Information System (INIS)

    Batmunkh, M.; Bayarchimeg, L.; Lkhagva, O.; Belov, O.

    2013-01-01

    A cluster analysis is performed of ionizations in tracks produced by the most abundant nuclei in the charge and energy spectra of the galactic cosmic rays. The frequency distribution of clusters is estimated for cluster sizes comparable to the DNA molecule at different packaging levels. For this purpose, an improved K-mean-based algorithm is suggested. This technique allows processing particle tracks containing a large number of ionization events without setting the number of clusters as an input parameter. Using this method, the ionization distribution pattern is analyzed depending on the cluster size and particle's linear energy transfer

  2. Application of cluster analysis and unsupervised learning to multivariate tissue characterization

    International Nuclear Information System (INIS)

    Momenan, R.; Insana, M.F.; Wagner, R.F.; Garra, B.S.; Loew, M.H.

    1987-01-01

    This paper describes a procedure for classifying tissue types from unlabeled acoustic measurements (data type unknown) using unsupervised cluster analysis. These techniques are being applied to unsupervised ultrasonic image segmentation and tissue characterization. The performance of a new clustering technique is measured and compared with supervised methods, such as a linear Bayes classifier. In these comparisons two objectives are sought: a) How well does the clustering method group the data?; b) Do the clusters correspond to known tissue classes? The first question is investigated by a measure of cluster similarity and dispersion. The second question involves a comparison with a supervised technique using labeled data

  3. Participant intimacy: A cluster analysis of the intranuclear cascade

    International Nuclear Information System (INIS)

    Cugnon, J.; Knoll, J.; Randrup, J.

    1981-01-01

    The intranuclear cascade for relativistic nuclear collisions is analyzed in terms of clusters consisting of groups of nucleons which are dynamically linked to each other by violent interactions. The formation cross sections for the different cluster types as well as their intrinsic dynamics are studied and compared with the predictions of the linear cascade model ( rows-on-rows ). (orig.)

  4. An evaluation of centrality measures used in cluster analysis

    Science.gov (United States)

    Engström, Christopher; Silvestrov, Sergei

    2014-12-01

    Clustering of data into groups of similar objects plays an important part when analysing many types of data, especially when the datasets are large as they often are in for example bioinformatics, social networks and computational linguistics. Many clustering algorithms such as K-means and some types of hierarchical clustering need a number of centroids representing the 'center' of the clusters. The choice of centroids for the initial clusters often plays an important role in the quality of the clusters. Since a data point with a high centrality supposedly lies close to the 'center' of some cluster, this can be used to assign centroids rather than through some other method such as picking them at random. Some work have been done to evaluate the use of centrality measures such as degree, betweenness and eigenvector centrality in clustering algorithms. The aim of this article is to compare and evaluate the usefulness of a number of common centrality measures such as the above mentioned and others such as PageRank and related measures.

  5. Building the Future Air Force: Analysis of Platform versus Weapon Development

    Science.gov (United States)

    2016-05-26

    fight a limited war in Southeast Asia . Despite its relatively poor record in aerial combat, the F-105 conducted most of the difficult work in the...PROGRAM ELEMENT NUMBER 6. AUTHOR(S) Sd. PROJECT NUMBER Maj Ryan W. Ellis Se. TASK NUMBER Sf. WORK UNIT NUMBER 7. PERFORMING ORGANIZATION NAME(S) AND...Specifically, the analysis assesses platform survivability, flexibility , and payload; and weapon yield and precision. The monograph reveals the lag

  6. Security analysis of Aspiro Music Platform, a digital music streaming service

    OpenAIRE

    Kachanovskiy, Roman

    2010-01-01

    The report is mainly based on recommendations given by the National Institute of Standards and Technology in special publication 800-30 ``Risk Management Guide for Information Technology Systems''. The risk analysis presented in this report emphasizes a qualitative approach. Firstly, the security requirements for Aspiro Music Platform were identified and classified by the level of importance. Secondly, potential threats to the system were discussed. In the next step the potential system ...

  7. A cost and operational effectiveness analysis of alternative anti-surface warfare platforms

    OpenAIRE

    Skinner, Walter Mark.

    1993-01-01

    Approved for public release; distribution is unlimited. A Cost and Operational Effectiveness Analysis (COEA) is performed for three alternative anti-surface warfare (ASUW) platforms that will conduct operations in multi-service regional scenarios. Estimated program costs, historical cost variances, and measures of operational effectiveness are determined for each COEA alternative, and service life extension effects are examined. The data is merged in a mixed-integer optimization model, MPA...

  8. E-learning platform for automated testing of electronic circuits using signature analysis method

    Science.gov (United States)

    Gherghina, Cǎtǎlina; Bacivarov, Angelica; Bacivarov, Ioan C.; Petricǎ, Gabriel

    2016-12-01

    Dependability of electronic circuits can be ensured only through testing of circuit modules. This is done by generating test vectors and their application to the circuit. Testability should be viewed as a concerted effort to ensure maximum efficiency throughout the product life cycle, from conception and design stage, through production to repairs during products operating. In this paper, is presented the platform developed by authors for training for testability in electronics, in general and in using signature analysis method, in particular. The platform allows highlighting the two approaches in the field namely analog and digital signature of circuits. As a part of this e-learning platform, it has been developed a database for signatures of different electronic components meant to put into the spotlight different techniques implying fault detection, and from this there were also self-repairing techniques of the systems with this kind of components. An approach for realizing self-testing circuits based on MATLAB environment and using signature analysis method is proposed. This paper analyses the benefits of signature analysis method and simulates signature analyzer performance based on the use of pseudo-random sequences, too.

  9. An ESL Approach for Energy Consumption Analysis of Cache Memories in SoC Platforms

    Directory of Open Access Journals (Sweden)

    Abel G. Silva-Filho

    2011-01-01

    Full Text Available The design of complex circuits as SoCs presents two great challenges to designers. One is the speeding up of system functionality modeling and the second is the implementation of the system in an architecture that meets performance and power consumption requirements. Thus, developing new high-level specification mechanisms for the reduction of the design effort with automatic architecture exploration is a necessity. This paper proposes an Electronic-System-Level (ESL approach for system modeling and cache energy consumption analysis of SoCs called PCacheEnergyAnalyzer. It uses as entry a high-level UML-2.0 profile model of the system and it generates a simulation model of a multicore platform that can be analyzed for cache tuning. PCacheEnergyAnalyzer performs static/dynamic energy consumption analysis of caches on platforms that may have different processors. Architecture exploration is achieved by letting designers choose different processors for platform generation and different mechanisms for cache optimization. PCacheEnergyAnalyzer has been validated with several applications of Mibench, Mediabench, and PowerStone benchmarks, and results show that it provides analysis with reduced simulation effort.

  10. Massively Parallel, Molecular Analysis Platform Developed Using a CMOS Integrated Circuit With Biological Nanopores

    Science.gov (United States)

    Roever, Stefan

    2012-01-01

    A massively parallel, low cost molecular analysis platform will dramatically change the nature of protein, molecular and genomics research, DNA sequencing, and ultimately, molecular diagnostics. An integrated circuit (IC) with 264 sensors was fabricated using standard CMOS semiconductor processing technology. Each of these sensors is individually controlled with precision analog circuitry and is capable of single molecule measurements. Under electronic and software control, the IC was used to demonstrate the feasibility of creating and detecting lipid bilayers and biological nanopores using wild type α-hemolysin. The ability to dynamically create bilayers over each of the sensors will greatly accelerate pore development and pore mutation analysis. In addition, the noise performance of the IC was measured to be 30fA(rms). With this noise performance, single base detection of DNA was demonstrated using α-hemolysin. The data shows that a single molecule, electrical detection platform using biological nanopores can be operationalized and can ultimately scale to millions of sensors. Such a massively parallel platform will revolutionize molecular analysis and will completely change the field of molecular diagnostics in the future.

  11. A comparison of heuristic and model-based clustering methods for dietary pattern analysis.

    Science.gov (United States)

    Greve, Benjamin; Pigeot, Iris; Huybrechts, Inge; Pala, Valeria; Börnhorst, Claudia

    2016-02-01

    Cluster analysis is widely applied to identify dietary patterns. A new method based on Gaussian mixture models (GMM) seems to be more flexible compared with the commonly applied k-means and Ward's method. In the present paper, these clustering approaches are compared to find the most appropriate one for clustering dietary data. The clustering methods were applied to simulated data sets with different cluster structures to compare their performance knowing the true cluster membership of observations. Furthermore, the three methods were applied to FFQ data assessed in 1791 children participating in the IDEFICS (Identification and Prevention of Dietary- and Lifestyle-Induced Health Effects in Children and Infants) Study to explore their performance in practice. The GMM outperformed the other methods in the simulation study in 72 % up to 100 % of cases, depending on the simulated cluster structure. Comparing the computationally less complex k-means and Ward's methods, the performance of k-means was better in 64-100 % of cases. Applied to real data, all methods identified three similar dietary patterns which may be roughly characterized as a 'non-processed' cluster with a high consumption of fruits, vegetables and wholemeal bread, a 'balanced' cluster with only slight preferences of single foods and a 'junk food' cluster. The simulation study suggests that clustering via GMM should be preferred due to its higher flexibility regarding cluster volume, shape and orientation. The k-means seems to be a good alternative, being easier to use while giving similar results when applied to real data.

  12. Common Factor Analysis Versus Principal Component Analysis: Choice for Symptom Cluster Research

    Directory of Open Access Journals (Sweden)

    Hee-Ju Kim, PhD, RN

    2008-03-01

    Conclusion: If the study purpose is to explain correlations among variables and to examine the structure of the data (this is usual for most cases in symptom cluster research, CFA provides a more accurate result. If the purpose of a study is to summarize data with a smaller number of variables, PCA is the choice. PCA can also be used as an initial step in CFA because it provides information regarding the maximum number and nature of factors. In using factor analysis for symptom cluster research, several issues need to be considered, including subjectivity of solution, sample size, symptom selection, and level of measure.

  13. Identifying novel phenotypes of acute heart failure using cluster analysis of clinical variables.

    Science.gov (United States)

    Horiuchi, Yu; Tanimoto, Shuzou; Latif, A H M Mahbub; Urayama, Kevin Y; Aoki, Jiro; Yahagi, Kazuyuki; Okuno, Taishi; Sato, Yu; Tanaka, Tetsu; Koseki, Keita; Komiyama, Kota; Nakajima, Hiroyoshi; Hara, Kazuhiro; Tanabe, Kengo

    2018-07-01

    Acute heart failure (AHF) is a heterogeneous disease caused by various cardiovascular (CV) pathophysiology and multiple non-CV comorbidities. We aimed to identify clinically important subgroups to improve our understanding of the pathophysiology of AHF and inform clinical decision-making. We evaluated detailed clinical data of 345 consecutive AHF patients using non-hierarchical cluster analysis of 77 variables, including age, sex, HF etiology, comorbidities, physical findings, laboratory data, electrocardiogram, echocardiogram and treatment during hospitalization. Cox proportional hazards regression analysis was performed to estimate the association between the clusters and clinical outcomes. Three clusters were identified. Cluster 1 (n=108) represented "vascular failure". This cluster had the highest average systolic blood pressure at admission and lung congestion with type 2 respiratory failure. Cluster 2 (n=89) represented "cardiac and renal failure". They had the lowest ejection fraction (EF) and worst renal function. Cluster 3 (n=148) comprised mostly older patients and had the highest prevalence of atrial fibrillation and preserved EF. Death or HF hospitalization within 12-month occurred in 23% of Cluster 1, 36% of Cluster 2 and 36% of Cluster 3 (p=0.034). Compared with Cluster 1, risk of death or HF hospitalization was 1.74 (95% CI, 1.03-2.95, p=0.037) for Cluster 2 and 1.82 (95% CI, 1.13-2.93, p=0.014) for Cluster 3. Cluster analysis may be effective in producing clinically relevant categories of AHF, and may suggest underlying pathophysiology and potential utility in predicting clinical outcomes. Copyright © 2018 Elsevier B.V. All rights reserved.

  14. A Short Survey on the State of the Art in Architectures and Platforms for Large Scale Data Analysis and Knowledge Discovery from Data

    Energy Technology Data Exchange (ETDEWEB)

    Begoli, Edmon [ORNL

    2012-01-01

    Intended as a survey for practicing architects and researchers seeking an overview of the state-of-the-art architectures for data analysis, this paper provides an overview of the emerg- ing data management and analytic platforms including par- allel databases, Hadoop-based systems, High Performance Computing (HPC) platforms and platforms popularly re- ferred to as NoSQL platforms. Platforms are presented based on their relevance, analysis they support and the data organization model they support.

  15. The Flemish frozen-vegetable industry as an example of cluster analysis : Flanders Vegetable Valley

    NARCIS (Netherlands)

    Vanhaverbeke, W.P.M.; Larosse, J.; Winnen, W.; Hulsink, W.; Dons, J.J.M.

    2008-01-01

    In this contribution we present a strategic analysis of the cluster dynamics in the frozen-vegetable industry in Flanders (Belgium)1. The main purpose of this case is twofold. First, we determine the added value of using data about customer and supplier relationships in cluster analysis. Second, we

  16. Tracking Undergraduate Student Achievement in a First-Year Physiology Course Using a Cluster Analysis Approach

    Science.gov (United States)

    Brown, S. J.; White, S.; Power, N.

    2015-01-01

    A cluster analysis data classification technique was used on assessment scores from 157 undergraduate nursing students who passed 2 successive compulsory courses in human anatomy and physiology. Student scores in five summative assessment tasks, taken in each of the courses, were used as inputs for a cluster analysis procedure. We aimed to group…

  17. Performance Analysis of Cluster Formation in Wireless Sensor Networks

    Directory of Open Access Journals (Sweden)

    Edgar Romo Montiel

    2017-12-01

    Full Text Available Clustered-based wireless sensor networks have been extensively used in the literature in order to achieve considerable energy consumption reductions. However, two aspects of such systems have been largely overlooked. Namely, the transmission probability used during the cluster formation phase and the way in which cluster heads are selected. Both of these issues have an important impact on the performance of the system. For the former, it is common to consider that sensor nodes in a clustered-based Wireless Sensor Network (WSN use a fixed transmission probability to send control data in order to build the clusters. However, due to the highly variable conditions experienced by these networks, a fixed transmission probability may lead to extra energy consumption. In view of this, three different transmission probability strategies are studied: optimal, fixed and adaptive. In this context, we also investigate cluster head selection schemes, specifically, we consider two intelligent schemes based on the fuzzy C-means and k-medoids algorithms and a random selection with no intelligence. We show that the use of intelligent schemes greatly improves the performance of the system, but their use entails higher complexity and selection delay. The main performance metrics considered in this work are energy consumption, successful transmission probability and cluster formation latency. As an additional feature of this work, we study the effect of errors in the wireless channel and the impact on the performance of the system under the different transmission probability schemes.

  18. Performance Analysis of Cluster Formation in Wireless Sensor Networks.

    Science.gov (United States)

    Montiel, Edgar Romo; Rivero-Angeles, Mario E; Rubino, Gerardo; Molina-Lozano, Heron; Menchaca-Mendez, Rolando; Menchaca-Mendez, Ricardo

    2017-12-13

    Clustered-based wireless sensor networks have been extensively used in the literature in order to achieve considerable energy consumption reductions. However, two aspects of such systems have been largely overlooked. Namely, the transmission probability used during the cluster formation phase and the way in which cluster heads are selected. Both of these issues have an important impact on the performance of the system. For the former, it is common to consider that sensor nodes in a clustered-based Wireless Sensor Network (WSN) use a fixed transmission probability to send control data in order to build the clusters. However, due to the highly variable conditions experienced by these networks, a fixed transmission probability may lead to extra energy consumption. In view of this, three different transmission probability strategies are studied: optimal, fixed and adaptive. In this context, we also investigate cluster head selection schemes, specifically, we consider two intelligent schemes based on the fuzzy C-means and k-medoids algorithms and a random selection with no intelligence. We show that the use of intelligent schemes greatly improves the performance of the system, but their use entails higher complexity and selection delay. The main performance metrics considered in this work are energy consumption, successful transmission probability and cluster formation latency. As an additional feature of this work, we study the effect of errors in the wireless channel and the impact on the performance of the system under the different transmission probability schemes.

  19. Higgs pair production: choosing benchmarks with cluster analysis

    Energy Technology Data Exchange (ETDEWEB)

    Carvalho, Alexandra; Dall’Osso, Martino; Dorigo, Tommaso [Dipartimento di Fisica e Astronomia and INFN, Sezione di Padova,Via Marzolo 8, I-35131 Padova (Italy); Goertz, Florian [CERN,1211 Geneva 23 (Switzerland); Gottardo, Carlo A. [Physikalisches Institut, Universität Bonn,Nussallee 12, 53115 Bonn (Germany); Tosi, Mia [CERN,1211 Geneva 23 (Switzerland)

    2016-04-20

    New physics theories often depend on a large number of free parameters. The phenomenology they predict for fundamental physics processes is in some cases drastically affected by the precise value of those free parameters, while in other cases is left basically invariant at the level of detail experimentally accessible. When designing a strategy for the analysis of experimental data in the search for a signal predicted by a new physics model, it appears advantageous to categorize the parameter space describing the model according to the corresponding kinematical features of the final state. A multi-dimensional test statistic can be used to gauge the degree of similarity in the kinematics predicted by different models; a clustering algorithm using that metric may allow the division of the space into homogeneous regions, each of which can be successfully represented by a benchmark point. Searches targeting those benchmarks are then guaranteed to be sensitive to a large area of the parameter space. In this document we show a practical implementation of the above strategy for the study of non-resonant production of Higgs boson pairs in the context of extensions of the standard model with anomalous couplings of the Higgs bosons. A non-standard value of those couplings may significantly enhance the Higgs boson pair-production cross section, such that the process could be detectable with the data that the LHC will collect in Run 2.

  20. Clusters of galaxies as tools in observational cosmology : results from x-ray analysis

    International Nuclear Information System (INIS)

    Weratschnig, J.M.

    2009-01-01

    Clusters of galaxies are the largest gravitationally bound structures in the universe. They can be used as ideal tools to study large scale structure formation (e.g. when studying merger clusters) and provide highly interesting environments to analyse several characteristic interaction processes (like ram pressure stripping of galaxies, magnetic fields). In this dissertation thesis, we have studied several clusters of galaxies using X-ray observations. To obtain scientific results, we have applied different data reduction and analysis methods. With a combination of morphological and spectral analysis, the merger cluster Abell 514 was studied in much detail. It has a highly interesting morphology and shows signs for an ongoing merger as well as a shock. using a new method to detect substructure, we have analysed several clusters to determine whether any substructure is present in the X-ray image. This hints towards a real structure in the distribution of the intra-cluster medium (ICM) and is evidence for ongoing mergers. The results from this analysis are extensively used with the cluster of galaxies Abell S1136. Here, we study the ICM distribution and compare its structure with the spatial distribution of star forming galaxies. Cluster magnetic fields are another important topic of my thesis. They can be studied in Radio observations, which can be put into relation with results from X-ray observations. using observational data from several clusters, we could support the theory that cluster magnetic fields are frozen into the ICM. (author)

  1. Interactive K-Means Clustering Method Based on User Behavior for Different Analysis Target in Medicine.

    Science.gov (United States)

    Lei, Yang; Yu, Dai; Bin, Zhang; Yang, Yang

    2017-01-01

    Clustering algorithm as a basis of data analysis is widely used in analysis systems. However, as for the high dimensions of the data, the clustering algorithm may overlook the business relation between these dimensions especially in the medical fields. As a result, usually the clustering result may not meet the business goals of the users. Then, in the clustering process, if it can combine the knowledge of the users, that is, the doctor's knowledge or the analysis intent, the clustering result can be more satisfied. In this paper, we propose an interactive K -means clustering method to improve the user's satisfactions towards the result. The core of this method is to get the user's feedback of the clustering result, to optimize the clustering result. Then, a particle swarm optimization algorithm is used in the method to optimize the parameters, especially the weight settings in the clustering algorithm to make it reflect the user's business preference as possible. After that, based on the parameter optimization and adjustment, the clustering result can be closer to the user's requirement. Finally, we take an example in the breast cancer, to testify our method. The experiments show the better performance of our algorithm.

  2. Phenotypic clustering: a novel method for microglial morphology analysis.

    Science.gov (United States)

    Verdonk, Franck; Roux, Pascal; Flamant, Patricia; Fiette, Laurence; Bozza, Fernando A; Simard, Sébastien; Lemaire, Marc; Plaud, Benoit; Shorte, Spencer L; Sharshar, Tarek; Chrétien, Fabrice; Danckaert, Anne

    2016-06-17

    Microglial cells are tissue-resident macrophages of the central nervous system. They are extremely dynamic, sensitive to their microenvironment and present a characteristic complex and heterogeneous morphology and distribution within the brain tissue. Many experimental clues highlight a strong link between their morphology and their function in response to aggression. However, due to their complex "dendritic-like" aspect that constitutes the major pool of murine microglial cells and their dense network, precise and powerful morphological studies are not easy to realize and complicate correlation with molecular or clinical parameters. Using the knock-in mouse model CX3CR1(GFP/+), we developed a 3D automated confocal tissue imaging system coupled with morphological modelling of many thousands of microglial cells revealing precise and quantitative assessment of major cell features: cell density, cell body area, cytoplasm area and number of primary, secondary and tertiary processes. We determined two morphological criteria that are the complexity index (CI) and the covered environment area (CEA) allowing an innovative approach lying in (i) an accurate and objective study of morphological changes in healthy or pathological condition, (ii) an in situ mapping of the microglial distribution in different neuroanatomical regions and (iii) a study of the clustering of numerous cells, allowing us to discriminate different sub-populations. Our results on more than 20,000 cells by condition confirm at baseline a regional heterogeneity of the microglial distribution and phenotype that persists after induction of neuroinflammation by systemic injection of lipopolysaccharide (LPS). Using clustering analysis, we highlight that, at resting state, microglial cells are distributed in four microglial sub-populations defined by their CI and CEA with a regional pattern and a specific behaviour after challenge. Our results counteract the classical view of a homogenous regional resting

  3. Cluster Computing For Real Time Seismic Array Analysis.

    Science.gov (United States)

    Martini, M.; Giudicepietro, F.

    A seismic array is an instrument composed by a dense distribution of seismic sen- sors that allow to measure the directional properties of the wavefield (slowness or wavenumber vector) radiated by a seismic source. Over the last years arrays have been widely used in different fields of seismological researches. In particular they are applied in the investigation of seismic sources on volcanoes where they can be suc- cessfully used for studying the volcanic microtremor and long period events which are critical for getting information on the volcanic systems evolution. For this reason arrays could be usefully employed for the volcanoes monitoring, however the huge amount of data produced by this type of instruments and the processing techniques which are quite time consuming limited their potentiality for this application. In order to favor a direct application of arrays techniques to continuous volcano monitoring we designed and built a small PC cluster able to near real time computing the kinematics properties of the wavefield (slowness or wavenumber vector) produced by local seis- mic source. The cluster is composed of 8 Intel Pentium-III bi-processors PC working at 550 MHz, and has 4 Gigabytes of RAM memory. It runs under Linux operating system. The developed analysis software package is based on the Multiple SIgnal Classification (MUSIC) algorithm and is written in Fortran. The message-passing part is based upon the LAM programming environment package, an open-source imple- mentation of the Message Passing Interface (MPI). The developed software system includes modules devote to receiving date by internet and graphical applications for the continuous displaying of the processing results. The system has been tested with a data set collected during a seismic experiment conducted on Etna in 1999 when two dense seismic arrays have been deployed on the northeast and the southeast flanks of this volcano. A real time continuous acquisition system has been simulated by

  4. Global classification of human facial healthy skin using PLS discriminant analysis and clustering analysis.

    Science.gov (United States)

    Guinot, C; Latreille, J; Tenenhaus, M; Malvy, D J

    2001-04-01

    Today's classifications of healthy skin are predominantly based on a very limited number of skin characteristics, such as skin oiliness or susceptibility to sun exposure. The aim of the present analysis was to set up a global classification of healthy facial skin, using mathematical models. This classification is based on clinical, biophysical skin characteristics and self-reported information related to the skin, as well as the results of a theoretical skin classification assessed separately for the frontal and the malar zones of the face. In order to maximize the predictive power of the models with a minimum of variables, the Partial Least Square (PLS) discriminant analysis method was used. The resulting PLS components were subjected to clustering analyses to identify the plausible number of clusters and to group the individuals according to their proximities. Using this approach, four PLS components could be constructed and six clusters were found relevant. So, from the 36 hypothetical combinations of the theoretical skin types classification, we tended to a strengthened six classes proposal. Our data suggest that the association of the PLS discriminant analysis and the clustering methods leads to a valid and simple way to classify healthy human skin and represents a potentially useful tool for cosmetic and dermatological research.

  5. Performance Analysis of Multiple Wave Energy Converters Placed on a Floating Platform in the Frequency Domain

    Directory of Open Access Journals (Sweden)

    Hyebin Lee

    2018-02-01

    Full Text Available Wind-wave hybrid power generation systems have the potential to become a significant source of affordable renewable energy. However, their strong interactions with both wind- and wave-induced forces raise a number of technical challenges for modelling. The present study undertakes a numerical investigation on multi-body hydrodynamic interaction between a wind-wave hybrid floating platform and multiple wave energy converters (WECs in a frequency domain. In addition to the exact responses of the platform and the WECs, the power take-off (PTO mechanism was taken into account for analysis. The coupled hydrodynamic coefficients and wave exciting forces were obtained from WAMIT, the 3D diffraction/radiation solver based on the boundary element method. The overall performance of the multiple WECs is presented and compared with the performance of a single isolated WEC. The analysis showed significant differences in the dynamic responses of the WECs when the multi-body interaction was considered. In addition, the PTO damping effect made a considerable difference to the responses of the WECs. However, the platform response was only minimally affected by PTO damping. With regard to energy capture, the interaction effect of the designed multiple WEC array layout is evaluated. The WEC array configuration showed both constructive and destructive effects in accordance with the incident wave frequency and direction.

  6. Application and Mechanics Analysis of Multi-Function Construction Platforms in Prefabricated-Concrete Construction

    Science.gov (United States)

    Wang, Meihua; Li, Rongshuai; Zhang, Wenze

    2017-11-01

    Multi-function construction platforms (MCPs) as an “old construction technology, new application” of the building facade construction equipment, its efforts to reduce labour intensity, improve labour productivity, ensure construction safety, shorten the duration of construction and other aspects of the effect are significant. In this study, the functional analysis of the multi-function construction platforms is carried out in the construction of the assembly building. Based on the general finite element software ANSYS, the static calculation and dynamic characteristics analysis of the MCPs structure are analysed, the simplified finite element model is constructed, and the selection of the unit, the processing and solution of boundary are under discussion and research. The maximum deformation value, the maximum stress value and the structural dynamic characteristic model are obtained. The dangerous parts of the platform structure are analysed, too. Multiple types of MCPs under engineering construction conditions are calculated, so as to put forward the rationalization suggestions for engineering application of the MCPs.

  7. Comparative analysis of clustering methods for gene expression time course data

    Directory of Open Access Journals (Sweden)

    Ivan G. Costa

    2004-01-01

    Full Text Available This work performs a data driven comparative study of clustering methods used in the analysis of gene expression time courses (or time series. Five clustering methods found in the literature of gene expression analysis are compared: agglomerative hierarchical clustering, CLICK, dynamical clustering, k-means and self-organizing maps. In order to evaluate the methods, a k-fold cross-validation procedure adapted to unsupervised methods is applied. The accuracy of the results is assessed by the comparison of the partitions obtained in these experiments with gene annotation, such as protein function and series classification.

  8. Description and analysis of design and intended use for Epidemiologic Dynamic Data Collection Platform in China.

    Science.gov (United States)

    Qi, Xiaopeng; Egana, Nilva; Meng, Yujie; Chen, Qianqian; Peng, Zhiyong; Ma, Jiaqi

    2014-01-01

    Disease surveillance systems can be extremely valuable tools and a critical step in system implementation is data collection. In order to obtain quality data efficiently and align the public health business process, Epidemiologic Dynamic Data Collection platform (EDDC) was developed and applied in China. We describe the design of EDDC and assess the platform from six dimensions (service, system, information, use, users and benefit) under the DeLone and McLean Information System Success Model. Objective indicators were extracted from each dimension with the aim of describing the system in detail. The characteristics of functions, performances, usages and benefits of EDDC were reflected under the analysis framework. The limitations and future directions of EDDC are offered for wide use in public health data collection.

  9. Long Term Analysis of Adaptive Low-Power Instrument Platform Power and Battery Performance

    Science.gov (United States)

    Edwards, T.; Bowman, J. R.; Clauer, C. R.

    2017-12-01

    Operation of the Autonomous Adaptive Low-Power Instrument Platform (AAL-PIP) by the Magnetosphere-Ionosphere Science Team (MIST) at Virginia Tech has been ongoing for about 10 years. These instrument platforms are deployed on the East Antarctic Plateau in remote locations that are difficult to access regularly. The systems have been designed to operate unattended for at least 5 years. During the Austral summer, the systems charge batteries using solar panels and power is provided by the batteries during the winter months. If the voltage goes below a critical level, the systems go into hibernation and wait for voltage from the solar panels to initiate a restart sequence to begin operation and battery charging. Our first system was deployed on the East Antarctic Plateau in 2008 and we report here on an analysis of the power and battery performance over multiple years and provide an estimate for how long these systems can operate before major battery maintenance must be performed.

  10. Kameleon Live: An Interactive Cloud Based Analysis and Visualization Platform for Space Weather Researchers

    Science.gov (United States)

    Pembroke, A. D.; Colbert, J. A.

    2015-12-01

    The Community Coordinated Modeling Center (CCMC) provides hosting for many of the simulations used by the space weather community of scientists, educators, and forecasters. CCMC users may submit model runs through the Runs on Request system, which produces static visualizations of model output in the browser, while further analysis may be performed off-line via Kameleon, CCMC's cross-language access and interpolation library. Off-line analysis may be suitable for power-users, but storage and coding requirements present a barrier to entry for non-experts. Moreover, a lack of a consistent framework for analysis hinders reproducibility of scientific findings. To that end, we have developed Kameleon Live, a cloud based interactive analysis and visualization platform. Kameleon Live allows users to create scientific studies built around selected runs from the Runs on Request database, perform analysis on those runs, collaborate with other users, and disseminate their findings among the space weather community. In addition to showcasing these novel collaborative analysis features, we invite feedback from CCMC users as we seek to advance and improve on the new platform.

  11. Hierarchical cluster analysis of progression patterns in open-angle glaucoma patients with medical treatment.

    Science.gov (United States)

    Bae, Hyoung Won; Rho, Seungsoo; Lee, Hye Sun; Lee, Naeun; Hong, Samin; Seong, Gong Je; Sung, Kyung Rim; Kim, Chan Yun

    2014-04-29

    To classify medically treated open-angle glaucoma (OAG) by the pattern of progression using hierarchical cluster analysis, and to determine OAG progression characteristics by comparing clusters. Ninety-five eyes of 95 OAG patients who received medical treatment, and who had undergone visual field (VF) testing at least once per year for 5 or more years. OAG was classified into subgroups using hierarchical cluster analysis based on the following five variables: baseline mean deviation (MD), baseline visual field index (VFI), MD slope, VFI slope, and Glaucoma Progression Analysis (GPA) printout. After that, other parameters were compared between clusters. Two clusters were made after a hierarchical cluster analysis. Cluster 1 showed -4.06 ± 2.43 dB baseline MD, 92.58% ± 6.27% baseline VFI, -0.28 ± 0.38 dB per year MD slope, -0.52% ± 0.81% per year VFI slope, and all "no progression" cases in GPA printout, whereas cluster 2 showed -8.68 ± 3.81 baseline MD, 77.54 ± 12.98 baseline VFI, -0.72 ± 0.55 MD slope, -2.22 ± 1.89 VFI slope, and seven "possible" and four "likely" progression cases in GPA printout. There were no significant differences in age, sex, mean IOP, central corneal thickness, and axial length between clusters. However, cluster 2 included more high-tension glaucoma patients and used a greater number of antiglaucoma eye drops significantly compared with cluster 1. Hierarchical cluster analysis of progression patterns divided OAG into slow and fast progression groups, evidenced by assessing the parameters of glaucomatous progression in VF testing. In the fast progression group, the prevalence of high-tension glaucoma was greater and the number of antiglaucoma medications administered was increased versus the slow progression group. Copyright 2014 The Association for Research in Vision and Ophthalmology, Inc.

  12. Following User Pathways: Cross Platform and Mixed Methods Analysis in Social Media Studies

    DEFF Research Database (Denmark)

    Hall, Margeret; Mazarakis, Athanasios; Peters, Isabella

    2016-01-01

    is the mixed method approach (e.g. qualitative and quantitative methods) in order to better understand how users and society interacts online. The workshop 'Following User Pathways' brings together a community of researchers and professionals to address methodological, analytical, conceptual, and technological......Social media and the resulting tidal wave of available data have changed the ways and methods researchers analyze communities at scale. But the full potential for social scientists (and others) is not yet achieved. Despite the popularity of social media analysis in the past decade, few researchers...... challenges and opportunities of cross-platform, mixed method analysis in social media ecosystems....

  13. Near Real-time Scientific Data Analysis and Visualization with the ArcGIS Platform

    Science.gov (United States)

    Shrestha, S. R.; Viswambharan, V.; Doshi, A.

    2017-12-01

    Scientific multidimensional data are generated from a variety of sources and platforms. These datasets are mostly produced by earth observation and/or modeling systems. Agencies like NASA, NOAA, USGS, and ESA produce large volumes of near real-time observation, forecast, and historical data that drives fundamental research and its applications in larger aspects of humanity from basic decision making to disaster response. A common big data challenge for organizations working with multidimensional scientific data and imagery collections is the time and resources required to manage and process such large volumes and varieties of data. The challenge of adopting data driven real-time visualization and analysis, as well as the need to share these large datasets, workflows, and information products to wider and more diverse communities, brings an opportunity to use the ArcGIS platform to handle such demand. In recent years, a significant effort has put in expanding the capabilities of ArcGIS to support multidimensional scientific data across the platform. New capabilities in ArcGIS to support scientific data management, processing, and analysis as well as creating information products from large volumes of data using the image server technology are becoming widely used in earth science and across other domains. We will discuss and share the challenges associated with big data by the geospatial science community and how we have addressed these challenges in the ArcGIS platform. We will share few use cases, such as NOAA High Resolution Refresh Radar (HRRR) data, that demonstrate how we access large collections of near real-time data (that are stored on-premise or on the cloud), disseminate them dynamically, process and analyze them on-the-fly, and serve them to a variety of geospatial applications. We will also share how on-the-fly processing using raster functions capabilities, can be extended to create persisted data and information products using raster analytics

  14. OMERACT-based fibromyalgia symptom subgroups: an exploratory cluster analysis.

    Science.gov (United States)

    Vincent, Ann; Hoskin, Tanya L; Whipple, Mary O; Clauw, Daniel J; Barton, Debra L; Benzo, Roberto P; Williams, David A

    2014-10-16

    The aim of this study was to identify subsets of patients with fibromyalgia with similar symptom profiles using the Outcome Measures in Rheumatology (OMERACT) core symptom domains. Female patients with a diagnosis of fibromyalgia and currently meeting fibromyalgia research survey criteria completed the Brief Pain Inventory, the 30-item Profile of Mood States, the Medical Outcomes Sleep Scale, the Multidimensional Fatigue Inventory, the Multiple Ability Self-Report Questionnaire, the Fibromyalgia Impact Questionnaire-Revised (FIQ-R) and the Short Form-36 between 1 June 2011 and 31 October 2011. Hierarchical agglomerative clustering was used to identify subgroups of patients with similar symptom profiles. To validate the results from this sample, hierarchical agglomerative clustering was repeated in an external sample of female patients with fibromyalgia with similar inclusion criteria. A total of 581 females with a mean age of 55.1 (range, 20.1 to 90.2) years were included. A four-cluster solution best fit the data, and each clustering variable differed significantly (P FIQ-R total scores (P = 0.0004)). In our study, we incorporated core OMERACT symptom domains, which allowed for clustering based on a comprehensive symptom profile. Although our exploratory cluster solution needs confirmation in a longitudinal study, this approach could provide a rationale to support the study of individualized clinical evaluation and intervention.

  15. Comparison of Outputs for Variable Combinations Used in Cluster Analysis on Polarmetric Imagery

    National Research Council Canada - National Science Library

    Petre, Melinda

    2008-01-01

    .... More specifically, two techniques, Cluster Analysis (CA) and Principle Component Analysis (PCA) can be combined to process Stoke s imagery by distinguishing between pixels, and producing groups of pixels with similar characteristics...

  16. Symptom Clusters in People Living with HIV Attending Five Palliative Care Facilities in Two Sub-Saharan African Countries: A Hierarchical Cluster Analysis.

    Science.gov (United States)

    Moens, Katrien; Siegert, Richard J; Taylor, Steve; Namisango, Eve; Harding, Richard

    2015-01-01

    Symptom research across conditions has historically focused on single symptoms, and the burden of multiple symptoms and their interactions has been relatively neglected especially in people living with HIV. Symptom cluster studies are required to set priorities in treatment planning, and to lessen the total symptom burden. This study aimed to identify and compare symptom clusters among people living with HIV attending five palliative care facilities in two sub-Saharan African countries. Data from cross-sectional self-report of seven-day symptom prevalence on the 32-item Memorial Symptom Assessment Scale-Short Form were used. A hierarchical cluster analysis was conducted using Ward's method applying squared Euclidean Distance as the similarity measure to determine the clusters. Contingency tables, X2 tests and ANOVA were used to compare the clusters by patient specific characteristics and distress scores. Among the sample (N=217) the mean age was 36.5 (SD 9.0), 73.2% were female, and 49.1% were on antiretroviral therapy (ART). The cluster analysis produced five symptom clusters identified as: 1) dermatological; 2) generalised anxiety and elimination; 3) social and image; 4) persistently present; and 5) a gastrointestinal-related symptom cluster. The patients in the first three symptom clusters reported the highest physical and psychological distress scores. Patient characteristics varied significantly across the five clusters by functional status (worst functional physical status in cluster one, ppeople living with HIV with longitudinally collected symptom data to test cluster stability and identify common symptom trajectories is recommended.

  17. The quantitative analysis of silicon carbide surface smoothing by Ar and Xe cluster ions

    Science.gov (United States)

    Ieshkin, A. E.; Kireev, D. S.; Ermakov, Yu. A.; Trifonov, A. S.; Presnov, D. E.; Garshev, A. V.; Anufriev, Yu. V.; Prokhorova, I. G.; Krupenin, V. A.; Chernysh, V. S.

    2018-04-01

    The gas cluster ion beam technique was used for the silicon carbide crystal surface smoothing. The effect of processing by two inert cluster ions, argon and xenon, was quantitatively compared. While argon is a standard element for GCIB, results for xenon clusters were not reported yet. Scanning probe microscopy and high resolution transmission electron microscopy techniques were used for the analysis of the surface roughness and surface crystal layer quality. The gas cluster ion beam processing results in surface relief smoothing down to average roughness about 1 nm for both elements. It was shown that xenon as the working gas is more effective: sputtering rate for xenon clusters is 2.5 times higher than for argon at the same beam energy. High resolution transmission electron microscopy analysis of the surface defect layer gives values of 7 ± 2 nm and 8 ± 2 nm for treatment with argon and xenon clusters.

  18. [Principal component analysis and cluster analysis of inorganic elements in sea cucumber Apostichopus japonicus].

    Science.gov (United States)

    Liu, Xiao-Fang; Xue, Chang-Hu; Wang, Yu-Ming; Li, Zhao-Jie; Xue, Yong; Xu, Jie

    2011-11-01

    The present study is to investigate the feasibility of multi-elements analysis in determination of the geographical origin of sea cucumber Apostichopus japonicus, and to make choice of the effective tracers in sea cucumber Apostichopus japonicus geographical origin assessment. The content of the elements such as Al, V, Cr, Mn, Fe, Co, Ni, Cu, Zn, As, Se, Mo, Cd, Hg and Pb in sea cucumber Apostichopus japonicus samples from seven places of geographical origin were determined by means of ICP-MS. The results were used for the development of elements database. Cluster analysis(CA) and principal component analysis (PCA) were applied to differentiate the sea cucumber Apostichopus japonicus geographical origin. Three principal components which accounted for over 89% of the total variance were extracted from the standardized data. The results of Q-type cluster analysis showed that the 26 samples could be clustered reasonably into five groups, the classification results were significantly associated with the marine distribution of the sea cucumber Apostichopus japonicus samples. The CA and PCA were the effective methods for elements analysis of sea cucumber Apostichopus japonicus samples. The content of the mineral elements in sea cucumber Apostichopus japonicus samples was good chemical descriptors for differentiating their geographical origins.

  19. Global myeloma research clusters, output, and citations: a bibliometric mapping and clustering analysis.

    Directory of Open Access Journals (Sweden)

    Jens Peter Andersen

    Full Text Available International collaborative research is a mechanism for improving the development of disease-specific therapies and for improving health at the population level. However, limited data are available to assess the trends in research output related to orphan diseases.We used bibliometric mapping and clustering methods to illustrate the level of fragmentation in myeloma research and the development of collaborative efforts. Publication data from Thomson Reuters Web of Science were retrieved for 2005-2009 and followed until 2013. We created a database of multiple myeloma publications, and we analysed impact and co-authorship density to identify scientific collaborations, developments, and international key players over time. The global annual publication volume for studies on multiple myeloma increased from 1,144 in 2005 to 1,628 in 2009, which represents a 43% increase. This increase is high compared to the 24% and 14% increases observed for lymphoma and leukaemia. The major proportion (>90% of publications was from the US and EU over the study period. The output and impact in terms of citations, identified several successful groups with a large number of intra-cluster collaborations in the US and EU. The US-based myeloma clusters clearly stand out as the most productive and highly cited, and the European Myeloma Network members exhibited a doubling of collaborative publications from 2005 to 2009, still increasing up to 2013.Multiple myeloma research output has increased substantially in the past decade. The fragmented European myeloma research activities based on national or regional groups are progressing, but they require a broad range of targeted research investments to improve multiple myeloma health care.

  20. An automated robotic platform for rapid profiling oligosaccharide analysis of monoclonal antibodies directly from cell culture.

    Science.gov (United States)

    Doherty, Margaret; Bones, Jonathan; McLoughlin, Niaobh; Telford, Jayne E; Harmon, Bryan; DeFelippis, Michael R; Rudd, Pauline M

    2013-11-01

    Oligosaccharides attached to Asn297 in each of the CH2 domains of monoclonal antibodies play an important role in antibody effector functions by modulating the affinity of interaction with Fc receptors displayed on cells of the innate immune system. Rapid, detailed, and quantitative N-glycan analysis is required at all stages of bioprocess development to ensure the safety and efficacy of the therapeutic. The high sample numbers generated during quality by design (QbD) and process analytical technology (PAT) create a demand for high-performance, high-throughput analytical technologies for comprehensive oligosaccharide analysis. We have developed an automated 96-well plate-based sample preparation platform for high-throughput N-glycan analysis using a liquid handling robotic system. Complete process automation includes monoclonal antibody (mAb) purification directly from bioreactor media, glycan release, fluorescent labeling, purification, and subsequent ultra-performance liquid chromatography (UPLC) analysis. The entire sample preparation and commencement of analysis is achieved within a 5-h timeframe. The automated sample preparation platform can easily be interfaced with other downstream analytical technologies, including mass spectrometry (MS) and capillary electrophoresis (CE), for rapid characterization of oligosaccharides present on therapeutic antibodies. Copyright © 2013 Elsevier Inc. All rights reserved.

  1. Product competitiveness analysis for e-commerce platform of special agricultural products

    Science.gov (United States)

    Wan, Fucheng; Ma, Ning; Yang, Dongwei; Xiong, Zhangyuan

    2017-09-01

    On the basis of analyzing the influence factors of the product competitiveness of the e-commerce platform of the special agricultural products and the characteristics of the analytical methods for the competitiveness of the special agricultural products, the price, the sales volume, the postage included service, the store reputation, the popularity, etc. were selected in this paper as the dimensionality for analyzing the competitiveness of the agricultural products, and the principal component factor analysis was taken as the competitiveness analysis method. Specifically, the web crawler was adopted to capture the information of various special agricultural products in the e-commerce platform ---- chi.taobao.com. Then, the original data captured thereby were preprocessed and MYSQL database was adopted to establish the information library for the special agricultural products. Then, the principal component factor analysis method was adopted to establish the analysis model for the competitiveness of the special agricultural products, and SPSS was adopted in the principal component factor analysis process to obtain the competitiveness evaluation factor system (support degree factor, price factor, service factor and evaluation factor) of the special agricultural products. Then, the linear regression method was adopted to establish the competitiveness index equation of the special agricultural products for estimating the competitiveness of the special agricultural products.

  2. Topic modeling for cluster analysis of large biological and medical datasets.

    Science.gov (United States)

    Zhao, Weizhong; Zou, Wen; Chen, James J

    2014-01-01

    The big data moniker is nowhere better deserved than to describe the ever-increasing prodigiousness and complexity of biological and medical datasets. New methods are needed to generate and test hypotheses, foster biological interpretation, and build validated predictors. Although multivariate techniques such as cluster analysis may allow researchers to identify groups, or clusters, of related variables, the accuracies and effectiveness of traditional clustering methods diminish for large and hyper dimensional datasets. Topic modeling is an active research field in machine learning and has been mainly used as an analytical tool to structure large textual corpora for data mining. Its ability to reduce high dimensionality to a small number of latent variables makes it suitable as a means for clustering or overcoming clustering difficulties in large biological and medical datasets. In this study, three topic model-derived clustering methods, highest probable topic assignment, feature selection and feature extraction, are proposed and tested on the cluster analysis of three large datasets: Salmonella pulsed-field gel electrophoresis (PFGE) dataset, lung cancer dataset, and breast cancer dataset, which represent various types of large biological or medical datasets. All three various methods are shown to improve the efficacy/effectiveness of clustering results on the three datasets in comparison to traditional methods. A preferable cluster analysis method emerged for each of the three datasets on the basis of replicating known biological truths. Topic modeling could be advantageously applied to the large datasets of biological or medical research. The three proposed topic model-derived clustering methods, highest probable topic assignment, feature selection and feature extraction, yield clustering improvements for the three different data types. Clusters more efficaciously represent truthful groupings and subgroupings in the data than traditional methods, suggesting

  3. Clinical Characteristics of Exacerbation-Prone Adult Asthmatics Identified by Cluster Analysis.

    Science.gov (United States)

    Kim, Mi Ae; Shin, Seung Woo; Park, Jong Sook; Uh, Soo Taek; Chang, Hun Soo; Bae, Da Jeong; Cho, You Sook; Park, Hae Sim; Yoon, Ho Joo; Choi, Byoung Whui; Kim, Yong Hoon; Park, Choon Sik

    2017-11-01

    Asthma is a heterogeneous disease characterized by various types of airway inflammation and obstruction. Therefore, it is classified into several subphenotypes, such as early-onset atopic, obese non-eosinophilic, benign, and eosinophilic asthma, using cluster analysis. A number of asthmatics frequently experience exacerbation over a long-term follow-up period, but the exacerbation-prone subphenotype has rarely been evaluated by cluster analysis. This prompted us to identify clusters reflecting asthma exacerbation. A uniform cluster analysis method was applied to 259 adult asthmatics who were regularly followed-up for over 1 year using 12 variables, selected on the basis of their contribution to asthma phenotypes. After clustering, clinical profiles and exacerbation rates during follow-up were compared among the clusters. Four subphenotypes were identified: cluster 1 was comprised of patients with early-onset atopic asthma with preserved lung function, cluster 2 late-onset non-atopic asthma with impaired lung function, cluster 3 early-onset atopic asthma with severely impaired lung function, and cluster 4 late-onset non-atopic asthma with well-preserved lung function. The patients in clusters 2 and 3 were identified as exacerbation-prone asthmatics, showing a higher risk of asthma exacerbation. Two different phenotypes of exacerbation-prone asthma were identified among Korean asthmatics using cluster analysis; both were characterized by impaired lung function, but the age at asthma onset and atopic status were different between the two. Copyright © 2017 The Korean Academy of Asthma, Allergy and Clinical Immunology · The Korean Academy of Pediatric Allergy and Respiratory Disease

  4. Using MATLAB software with Tomcat server and Java platform for remote image analysis in pathology.

    Science.gov (United States)

    Markiewicz, Tomasz

    2011-03-30

    The Matlab software is a one of the most advanced development tool for application in engineering practice. From our point of view the most important is the image processing toolbox, offering many built-in functions, including mathematical morphology, and implementation of a many artificial neural networks as AI. It is very popular platform for creation of the specialized program for image analysis, also in pathology. Based on the latest version of Matlab Builder Java toolbox, it is possible to create the software, serving as a remote system for image analysis in pathology via internet communication. The internet platform can be realized based on Java Servlet Pages with Tomcat server as servlet container. In presented software implementation we propose remote image analysis realized by Matlab algorithms. These algorithms can be compiled to executable jar file with the help of Matlab Builder Java toolbox. The Matlab function must be declared with the set of input data, output structure with numerical results and Matlab web figure. Any function prepared in that manner can be used as a Java function in Java Servlet Pages (JSP). The graphical user interface providing the input data and displaying the results (also in graphical form) must be implemented in JSP. Additionally the data storage to database can be implemented within algorithm written in Matlab with the help of Matlab Database Toolbox directly with the image processing. The complete JSP page can be run by Tomcat server. The proposed tool for remote image analysis was tested on the Computerized Analysis of Medical Images (CAMI) software developed by author. The user provides image and case information (diagnosis, staining, image parameter etc.). When analysis is initialized, input data with image are sent to servlet on Tomcat. When analysis is done, client obtains the graphical results as an image with marked recognized cells and also the quantitative output. Additionally, the results are stored in a server

  5. Cluster analysis of tropical cyclone tracks in the Southern Hemisphere

    Energy Technology Data Exchange (ETDEWEB)

    Ramsay, Hamish A. [Monash University, Monash Weather and Climate, School of Mathematical Sciences, Clayton, VIC (Australia); Camargo, Suzana J.; Kim, Daehyun [Columbia University, Lamont-Doherty Earth Observatory, Palisades, NY (United States)

    2012-08-15

    A probabilistic clustering method is used to describe various aspects of tropical cyclone (TC) tracks in the Southern Hemisphere, for the period 1969-2008. A total of 7 clusters are examined: three in the South Indian Ocean, three in the Australian Region, and one in the South Pacific Ocean. Large-scale environmental variables related to TC genesis in each cluster are explored, including sea surface temperature, low-level relative vorticity, deep-layer vertical wind shear, outgoing longwave radiation, El Nino-Southern Oscillation (ENSO) and the Madden-Julian Oscillation (MJO). Composite maps, constructed 2 days prior to genesis, show some of these to be significant precursors to TC formation - most prominently, westerly wind anomalies equatorward of the main development regions. Clusters are also evaluated with respect to their genesis location, seasonality, mean peak intensity, track duration, landfall location, and intensity at landfall. ENSO is found to play a significant role in modulating annual frequency and mean genesis location in three of the seven clusters (two in the South Indian Ocean and one in the Pacific). The ENSO-modulating effect on genesis frequency is caused primarily by changes in low-level zonal flow between the equator and 10 S, and associated relative vorticity changes in the main development regions. ENSO also has a significant effect on mean genesis location in three clusters, with TCs forming further equatorward (poleward) during El Nino (La Nina) in addition to large shifts in mean longitude. The MJO has a strong influence on TC genesis in all clusters, though the amount modulation is found to be sensitive to the definition of the MJO. (orig.)

  6. A clustering analysis of lipoprotein diameters in the metabolic syndrome

    Directory of Open Access Journals (Sweden)

    Frazier-Wood Alexis C

    2011-12-01

    Full Text Available Abstract Background The presence of smaller low-density lipoproteins (LDL has been associated with atherosclerosis risk, and the insulin resistance (IR underlying the metabolic syndrome (MetS. In addition, some research has supported the association of very low-, low- and high-density lipoprotein (VLDL HDL particle diameters with components of the metabolic syndrome (MetS, although this has been the focus of less research. We aimed to explore the relationship of VLDL, LDL and HDL diameters to MetS and its features, and by clustering individuals by their diameters of VLDL, LDL and HDL particles, to capture information across all three fractions of lipoprotein into a unified phenotype. Methods We used nuclear magnetic resonance spectroscopy measurements on fasting plasma samples from a general population sample of 1,036 adults (mean ± SD, 48.8 ± 16.2 y of age. Using latent class analysis, the sample was grouped by the diameter of their fasting lipoproteins, and mixed effects models tested whether the distribution of MetS components varied across the groups. Results Eight discrete groups were identified. Two groups (N = 251 were enriched with individuals meeting criteria for the MetS, and were characterized by the smallest LDL/HDL diameters. One of those two groups, one was additionally distinguished by large VLDL, and had significantly higher blood pressure, fasting glucose, triglycerides, and waist circumference (WC; P Conclusions While small LDL diameters remain associated with IR and the MetS, the occurrence of these in conjunction with a shift to overall larger VLDL diameter may identify those with the highest fasting glucose, TG and WC within the MetS. If replicated, the association of this phenotype with more severe IR-features indicated that it may contribute to identifying of those most at risk for incident type II diabetes and cardiometabolic disease.

  7. Methodology сomparative statistical analysis of Russian industry based on cluster analysis

    Directory of Open Access Journals (Sweden)

    Sergey S. Shishulin

    2017-01-01

    Full Text Available The article is devoted to researching of the possibilities of applying multidimensional statistical analysis in the study of industrial production on the basis of comparing its growth rates and structure with other developed and developing countries of the world. The purpose of this article is to determine the optimal set of statistical methods and the results of their application to industrial production data, which would give the best access to the analysis of the result.Data includes such indicators as output, output, gross value added, the number of employed and other indicators of the system of national accounts and operational business statistics. The objects of observation are the industry of the countrys of the Customs Union, the United States, Japan and Erope in 2005-2015. As the research tool used as the simplest methods of transformation, graphical and tabular visualization of data, and methods of statistical analysis. In particular, based on a specialized software package (SPSS, the main components method, discriminant analysis, hierarchical methods of cluster analysis, Ward’s method and k-means were applied.The application of the method of principal components to the initial data makes it possible to substantially and effectively reduce the initial space of industrial production data. Thus, for example, in analyzing the structure of industrial production, the reduction was from fifteen industries to three basic, well-interpreted factors: the relatively extractive industries (with a low degree of processing, high-tech industries and consumer goods (medium-technology sectors. At the same time, as a result of comparison of the results of application of cluster analysis to the initial data and data obtained on the basis of the principal components method, it was established that clustering industrial production data on the basis of new factors significantly improves the results of clustering.As a result of analyzing the parameters of

  8. A hybrid approach to device integration on a genetic analysis platform

    International Nuclear Information System (INIS)

    Brennan, Des; Justice, John; Aherne, Margaret; Galvin, Paul; Jary, Dorothee; Kurg, Ants; Berik, Evgeny; Macek, Milan

    2012-01-01

    Point-of-care (POC) systems require significant component integration to implement biochemical protocols associated with molecular diagnostic assays. Hybrid platforms where discrete components are combined in a single platform are a suitable approach to integration, where combining multiple device fabrication steps on a single substrate is not possible due to incompatible or costly fabrication steps. We integrate three devices each with a specific system functionality: (i) a silicon electro-wetting-on-dielectric (EWOD) device to move and mix sample and reagent droplets in an oil phase, (ii) a polymer microfluidic chip containing channels and reservoirs and (iii) an aqueous phase glass microarray for fluorescence microarray hybridization detection. The EWOD device offers the possibility of fully integrating on-chip sample preparation using nanolitre sample and reagent volumes. A key challenge is sample transfer from the oil phase EWOD device to the aqueous phase microarray for hybridization detection. The EWOD device, waveguide performance and functionality are maintained during the integration process. An on-chip biochemical protocol for arrayed primer extension (APEX) was implemented for single nucleotide polymorphism (SNiP) analysis. The prepared sample is aspirated from the EWOD oil phase to the aqueous phase microarray for hybridization. A bench-top instrumentation system was also developed around the integrated platform to drive the EWOD electrodes, implement APEX sample heating and image the microarray after hybridization. (paper)

  9. Open innovation in health care: analysis of an open health platform.

    Science.gov (United States)

    Bullinger, Angelika C; Rass, Matthias; Adamczyk, Sabrina; Moeslein, Kathrin M; Sohn, Stefan

    2012-05-01

    Today, integration of the public in research and development in health care is seen as essential for the advancement of innovation. This is a paradigmatic shift away from the traditional assumption that solely health care professionals are able to devise, develop, and disseminate novel concepts and solutions in health care. The present study builds on research in the field of open innovation to investigate the adoption of an open health platform by patients, care givers, physicians, family members, and the interested public. Results suggest that open innovation practices in health care lead to interesting innovation outcomes and are well accepted by participants. During the first three months, 803 participants of the open health platform submitted challenges and solutions and intensively communicated by exchanging 1454 personal messages and 366 comments. Analysis of communication content shows that empathic support and exchange of information are important elements of communication on the platform. The study presents first evidence for the suitability of open innovation practices to integrate the general public in health care research in order to foster both innovation outcomes and empathic support. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.

  10. How Digital Are the Digital Humanities? An Analysis of Two Scholarly Blogging Platforms

    Science.gov (United States)

    Puschmann, Cornelius; Bastos, Marco

    2015-01-01

    In this paper we compare two academic networking platforms, HASTAC and Hypotheses, to show the distinct ways in which they serve specific communities in the Digital Humanities (DH) in different national and disciplinary contexts. After providing background information on both platforms, we apply co-word analysis and topic modeling to show thematic similarities and differences between the two sites, focusing particularly on how they frame DH as a new paradigm in humanities research. We encounter a much higher ratio of posts using humanities-related terms compared to their digital counterparts, suggesting a one-way dependency of digital humanities-related terms on the corresponding unprefixed labels. The results also show that the terms digital archive, digital literacy, and digital pedagogy are relatively independent from the respective unprefixed terms, and that digital publishing, digital libraries, and digital media show considerable cross-pollination between the specialization and the general noun. The topic modeling reproduces these findings and reveals further differences between the two platforms. Our findings also indicate local differences in how the emerging field of DH is conceptualized and show dynamic topical shifts inside these respective contexts. PMID:25675441

  11. Proposal for a Mini Wireless Force Platform for Human Gait Analysis

    Directory of Open Access Journals (Sweden)

    Giovani PIFFER

    2011-12-01

    Full Text Available This paper aims to develop a mini wireless force platform placed in the shoe sole for analysis of human gait. The platform consists of a machined aluminum mechanical structure fixed into a sole, whose sensors are electrical resistance strain gages strategically cemented at the points of greatest deformation of the structure. The strain gages are configured as a ½ Wheatstone bridge connected to an amplifier for output signals and filtered by a signal conditioner. The signals are conditioned using a data acquisition board in conjunction with a graphical interface developed in LabVIEW. The static and dynamic behavior of the eight load cells was evaluated. Calibration at static pressures has shown that the eight load cells are linear within the usage range from 0 kgf to 45 kgf. The dynamic response has determined that the first vibration mode is around 1 kHz, indicating that the load cells have no resonance during the test. Three subjects carried out gait tests to examine the range of force platform use, and these tests demonstrated that the signals obtained are consistent with the classical references in this area.

  12. Miniaturization for Point-of-Care Analysis: Platform Technology for Almost Every Biomedical Assay.

    Science.gov (United States)

    Schumacher, Soeren; Sartorius, Dorian; Ehrentreich-Förster, Eva; Bier, Frank F

    2012-10-01

    Platform technologies for the changing need of diagnostics are one of the main challenges in medical device technology. From one point-of-view the demand for new and more versatile diagnostic is increasing due to a deeper knowledge of biomarkers and their combination with diseases. From another point-of-view a decentralization of diagnostics will occur since decisions can be made faster resulting in higher success of therapy. Hence, new types of technologies have to be established which enables a multiparameter analysis at the point-of-care. Within this review-like article a system called Fraunhofer ivD-platform is introduced. It consists of a credit-card sized cartridge with integrated reagents, sensors and pumps and a read-out/processing unit. Within the cartridge the assay runs fully automated within 15-20 minutes. Due to the open design of the platform different analyses such as antibody, serological or DNA-assays can be performed. Specific examples of these three different assay types are given to show the broad applicability of the system.

  13. ACCURACY ANALYSIS OF A LOW-COST PLATFORM FOR POSITIONING AND NAVIGATION

    Directory of Open Access Journals (Sweden)

    S. Hofmann

    2012-07-01

    Full Text Available This paper presents an accuracy analysis of a platform based on low-cost components for landmark-based navigation intended for research and teaching purposes. The proposed platform includes a LEGO MINDSTORMS NXT 2.0 kit, an Android-based Smartphone as well as a compact laser scanner Hokuyo URG-04LX. The robot is used in a small indoor environment, where GNSS is not available. Therefore, a landmark map was produced in advance, with the landmark positions provided to the robot. All steps of procedure to set up the platform are shown. The main focus of this paper is the reachable positioning accuracy, which was analyzed in this type of scenario depending on the accuracy of the reference landmarks and the directional and distance measuring accuracy of the laser scanner. Several experiments were carried out, demonstrating the practically achievable positioning accuracy. To evaluate the accuracy, ground truth was acquired using a total station. These results are compared to the theoretically achievable accuracies and the laser scanner’s characteristics.

  14. Cluster-cluster clustering

    International Nuclear Information System (INIS)

    Barnes, J.; Dekel, A.; Efstathiou, G.; Frenk, C.S.; Yale Univ., New Haven, CT; California Univ., Santa Barbara; Cambridge Univ., England; Sussex Univ., Brighton, England)

    1985-01-01

    The cluster correlation function xi sub c(r) is compared with the particle correlation function, xi(r) in cosmological N-body simulations with a wide range of initial conditions. The experiments include scale-free initial conditions, pancake models with a coherence length in the initial density field, and hybrid models. Three N-body techniques and two cluster-finding algorithms are used. In scale-free models with white noise initial conditions, xi sub c and xi are essentially identical. In scale-free models with more power on large scales, it is found that the amplitude of xi sub c increases with cluster richness; in this case the clusters give a biased estimate of the particle correlations. In the pancake and hybrid models (with n = 0 or 1), xi sub c is steeper than xi, but the cluster correlation length exceeds that of the points by less than a factor of 2, independent of cluster richness. Thus the high amplitude of xi sub c found in studies of rich clusters of galaxies is inconsistent with white noise and pancake models and may indicate a primordial fluctuation spectrum with substantial power on large scales. 30 references

  15. Planetary-Scale Geospatial Data Analysis Techniques in Google's Earth Engine Platform (Invited)

    Science.gov (United States)

    Hancher, M.

    2013-12-01

    Geoscientists have more and more access to new tools for large-scale computing. With any tool, some tasks are easy and other tasks hard. It is natural to look to new computing platforms to increase the scale and efficiency of existing techniques, but there is a more exiting opportunity to discover and develop a new vocabulary of fundamental analysis idioms that are made easy and effective by these new tools. Google's Earth Engine platform is a cloud computing environment for earth data analysis that combines a public data catalog with a large-scale computational facility optimized for parallel processing of geospatial data. The data catalog includes a nearly complete archive of scenes from Landsat 4, 5, 7, and 8 that have been processed by the USGS, as well as a wide variety of other remotely-sensed and ancillary data products. Earth Engine supports a just-in-time computation model that enables real-time preview during algorithm development and debugging as well as during experimental data analysis and open-ended data exploration. Data processing operations are performed in parallel across many computers in Google's datacenters. The platform automatically handles many traditionally-onerous data management tasks, such as data format conversion, reprojection, resampling, and associating image metadata with pixel data. Early applications of Earth Engine have included the development of Google's global cloud-free fifteen-meter base map and global multi-decadal time-lapse animations, as well as numerous large and small experimental analyses by scientists from a range of academic, government, and non-governmental institutions, working in a wide variety of application areas including forestry, agriculture, urban mapping, and species habitat modeling. Patterns in the successes and failures of these early efforts have begun to emerge, sketching the outlines of a new set of simple and effective approaches to geospatial data analysis.

  16. Cluster Analysis of Acute Care Use Yields Insights for Tailored Pediatric Asthma Interventions.

    Science.gov (United States)

    Abir, Mahshid; Truchil, Aaron; Wiest, Dawn; Nelson, Daniel B; Goldstick, Jason E; Koegel, Paul; Lozon, Marie M; Choi, Hwajung; Brenner, Jeffrey

    2017-09-01

    We undertake this study to understand patterns of pediatric asthma-related acute care use to inform interventions aimed at reducing potentially avoidable hospitalizations. Hospital claims data from 3 Camden city facilities for 2010 to 2014 were used to perform cluster analysis classifying patients aged 0 to 17 years according to their asthma-related hospital use. Clusters were based on 2 variables: asthma-related ED visits and hospitalizations. Demographics and a number of sociobehavioral and use characteristics were compared across clusters. Children who met the criteria (3,170) were included in the analysis. An examination of a scree plot showing the decline in within-cluster heterogeneity as the number of clusters increased confirmed that clusters of pediatric asthma patients according to hospital use exist in the data. Five clusters of patients with distinct asthma-related acute care use patterns were observed. Cluster 1 (62% of patients) showed the lowest rates of acute care use. These patients were least likely to have a mental health-related diagnosis, were less likely to have visited multiple facilities, and had no hospitalizations for asthma. Cluster 2 (19% of patients) had a low number of asthma ED visits and onetime hospitalization. Cluster 3 (11% of patients) had a high number of ED visits and low hospitalization rates, and the highest rates of multiple facility use. Cluster 4 (7% of patients) had moderate ED use for both asthma and other illnesses, and high rates of asthma hospitalizations; nearly one quarter received care at all facilities, and 1 in 10 had a mental health diagnosis. Cluster 5 (1% of patients) had extreme rates of acute care use. Differences observed between groups across multiple sociobehavioral factors suggest these clusters may represent children who differ along multiple dimensions, in addition to patterns of service use, with implications for tailored interventions. Copyright © 2017 American College of Emergency Physicians

  17. Assessment of Random Assignment in Training and Test Sets using Generalized Cluster Analysis Technique

    Directory of Open Access Journals (Sweden)

    Sorana D. BOLBOACĂ

    2011-06-01

    Full Text Available Aim: The properness of random assignment of compounds in training and validation sets was assessed using the generalized cluster technique. Material and Method: A quantitative Structure-Activity Relationship model using Molecular Descriptors Family on Vertices was evaluated in terms of assignment of carboquinone derivatives in training and test sets during the leave-many-out analysis. Assignment of compounds was investigated using five variables: observed anticancer activity and four structure descriptors. Generalized cluster analysis with K-means algorithm was applied in order to investigate if the assignment of compounds was or not proper. The Euclidian distance and maximization of the initial distance using a cross-validation with a v-fold of 10 was applied. Results: All five variables included in analysis proved to have statistically significant contribution in identification of clusters. Three clusters were identified, each of them containing both carboquinone derivatives belonging to training as well as to test sets. The observed activity of carboquinone derivatives proved to be normal distributed on every. The presence of training and test sets in all clusters identified using generalized cluster analysis with K-means algorithm and the distribution of observed activity within clusters sustain a proper assignment of compounds in training and test set. Conclusion: Generalized cluster analysis using the K-means algorithm proved to be a valid method in assessment of random assignment of carboquinone derivatives in training and test sets.

  18. Cluster analysis in severe emphysema subjects using phenotype and genotype data: an exploratory investigation

    Directory of Open Access Journals (Sweden)

    Martinez Fernando J

    2010-03-01

    Full Text Available Abstract Background Numerous studies have demonstrated associations between genetic markers and COPD, but results have been inconsistent. One reason may be heterogeneity in disease definition. Unsupervised learning approaches may assist in understanding disease heterogeneity. Methods We selected 31 phenotypic variables and 12 SNPs from five candidate genes in 308 subjects in the National Emphysema Treatment Trial (NETT Genetics Ancillary Study cohort. We used factor analysis to select a subset of phenotypic variables, and then used cluster analysis to identify subtypes of severe emphysema. We examined the phenotypic and genotypic characteristics of each cluster. Results We identified six factors accounting for 75% of the shared variability among our initial phenotypic variables. We selected four phenotypic variables from these factors for cluster analysis: 1 post-bronchodilator FEV1 percent predicted, 2 percent bronchodilator responsiveness, and quantitative CT measurements of 3 apical emphysema and 4 airway wall thickness. K-means cluster analysis revealed four clusters, though separation between clusters was modest: 1 emphysema predominant, 2 bronchodilator responsive, with higher FEV1; 3 discordant, with a lower FEV1 despite less severe emphysema and lower airway wall thickness, and 4 airway predominant. Of the genotypes examined, membership in cluster 1 (emphysema-predominant was associated with TGFB1 SNP rs1800470. Conclusions Cluster analysis may identify meaningful disease subtypes and/or groups of related phenotypic variables even in a highly selected group of severe emphysema subjects, and may be useful for genetic association studies.

  19. Observations of the thermal environment on Red Sea platform reefs: a heat budget analysis

    KAUST Repository

    Davis, K. A.

    2011-03-11

    Hydrographic measurements were collected on nine offshore reef platforms in the eastern Red Sea shelf region, north of Jeddah, Saudi Arabia. The data were analyzed for spatial and temporal patterns of temperature variation, and a simple heat budget analysis was performed with the goal of advancing our understanding of the physical processes that control temperature variability on the reef. In 2009 and 2010, temperature variability on Red Sea reef platforms was dominated by diurnal variability. The daily temperature range on the reefs, at times, exceeded 5°C-as large as the annual range of water temperature on the shelf. Additionally, our observations reveal the proximity of distinct thermal microclimates within the bounds of one reef platform. Circulation on the reef flat is largely wave driven. The greatest diurnal variation in water temperature occurs in the center of larger reef flats and on reefs protected from direct wave forcing, while smaller knolls or sites on the edges of the reef flat tend to experience less diurnal temperature variability. We found that both the temporal and spatial variability in water temperature on the reef platforms is well predicted by a heat budget model that includes the transfer of heat at the air-water interface and the advection of heat by currents flowing over the reef. Using this simple model, we predicted the temperature across three different reefs to within 0.4°C on the outer shelf using only information about bathymetry, surface heat flux, and offshore wave conditions. © 2011 Springer-Verlag.

  20. Development and optimization of SPECT gated blood pool cluster analysis for the prediction of CRT outcome

    Energy Technology Data Exchange (ETDEWEB)

    Lalonde, Michel, E-mail: mlalonde15@rogers.com; Wassenaar, Richard [Department of Physics, Carleton University, Ottawa, Ontario K1S 5B6 (Canada); Wells, R. Glenn; Birnie, David; Ruddy, Terrence D. [Division of Cardiology, University of Ottawa Heart Institute, Ottawa, Ontario K1Y 4W7 (Canada)

    2014-07-15

    Purpose: Phase analysis of single photon emission computed tomography (SPECT) radionuclide angiography (RNA) has been investigated for its potential to predict the outcome of cardiac resynchronization therapy (CRT). However, phase analysis may be limited in its potential at predicting CRT outcome as valuable information may be lost by assuming that time-activity curves (TAC) follow a simple sinusoidal shape. A new method, cluster analysis, is proposed which directly evaluates the TACs and may lead to a better understanding of dyssynchrony patterns and CRT outcome. Cluster analysis algorithms were developed and optimized to maximize their ability to predict CRT response. Methods: About 49 patients (N = 27 ischemic etiology) received a SPECT RNA scan as well as positron emission tomography (PET) perfusion and viability scans prior to undergoing CRT. A semiautomated algorithm sampled the left ventricle wall to produce 568 TACs from SPECT RNA data. The TACs were then subjected to two different cluster analysis techniques, K-means, and normal average, where several input metrics were also varied to determine the optimal settings for the prediction of CRT outcome. Each TAC was assigned to a cluster group based on the comparison criteria and global and segmental cluster size and scores were used as measures of dyssynchrony and used to predict response to CRT. A repeated random twofold cross-validation technique was used to train and validate the cluster algorithm. Receiver operating characteristic (ROC) analysis was used to calculate the area under the curve (AUC) and compare results to those obtained for SPECT RNA phase analysis and PET scar size analysis methods. Results: Using the normal average cluster analysis approach, the septal wall produced statistically significant results for predicting CRT results in the ischemic population (ROC AUC = 0.73;p < 0.05 vs. equal chance ROC AUC = 0.50) with an optimal operating point of 71% sensitivity and 60% specificity. Cluster

  1. Development and optimization of SPECT gated blood pool cluster analysis for the prediction of CRT outcome

    International Nuclear Information System (INIS)

    Lalonde, Michel; Wassenaar, Richard; Wells, R. Glenn; Birnie, David; Ruddy, Terrence D.

    2014-01-01

    Purpose: Phase analysis of single photon emission computed tomography (SPECT) radionuclide angiography (RNA) has been investigated for its potential to predict the outcome of cardiac resynchronization therapy (CRT). However, phase analysis may be limited in its potential at predicting CRT outcome as valuable information may be lost by assuming that time-activity curves (TAC) follow a simple sinusoidal shape. A new method, cluster analysis, is proposed which directly evaluates the TACs and may lead to a better understanding of dyssynchrony patterns and CRT outcome. Cluster analysis algorithms were developed and optimized to maximize their ability to predict CRT response. Methods: About 49 patients (N = 27 ischemic etiology) received a SPECT RNA scan as well as positron emission tomography (PET) perfusion and viability scans prior to undergoing CRT. A semiautomated algorithm sampled the left ventricle wall to produce 568 TACs from SPECT RNA data. The TACs were then subjected to two different cluster analysis techniques, K-means, and normal average, where several input metrics were also varied to determine the optimal settings for the prediction of CRT outcome. Each TAC was assigned to a cluster group based on the comparison criteria and global and segmental cluster size and scores were used as measures of dyssynchrony and used to predict response to CRT. A repeated random twofold cross-validation technique was used to train and validate the cluster algorithm. Receiver operating characteristic (ROC) analysis was used to calculate the area under the curve (AUC) and compare results to those obtained for SPECT RNA phase analysis and PET scar size analysis methods. Results: Using the normal average cluster analysis approach, the septal wall produced statistically significant results for predicting CRT results in the ischemic population (ROC AUC = 0.73;p < 0.05 vs. equal chance ROC AUC = 0.50) with an optimal operating point of 71% sensitivity and 60% specificity. Cluster

  2. Nurses' beliefs about nursing diagnosis: A study with cluster analysis.

    Science.gov (United States)

    D'Agostino, Fabio; Pancani, Luca; Romero-Sánchez, José Manuel; Lumillo-Gutierrez, Iris; Paloma-Castro, Olga; Vellone, Ercole; Alvaro, Rosaria

    2018-06-01

    To identify clusters of nurses in relation to their beliefs about nursing diagnosis among two populations (Italian and Spanish); to investigate differences among clusters of nurses in each population considering the nurses' socio-demographic data, attitudes towards nursing diagnosis, intentions to make nursing diagnosis and actual behaviours in making nursing diagnosis. Nurses' beliefs concerning nursing diagnosis can influence its use in practice but this is still unclear. A cross-sectional design. A convenience sample of nurses in Italy and Spain was enrolled. Data were collected between 2014-2015 using tools, that is, a socio-demographic questionnaire and behavioural, normative and control beliefs, attitudes, intentions and behaviours scales. The sample included 499 nurses (272 Italians & 227 Spanish). Of these, 66.5% of the Italian and 90.7% of the Spanish sample were female. The mean age was 36.5 and 45.2 years old in the Italian and Spanish sample respectively. Six clusters of nurses were identified in Spain and four in Italy. Three clusters were similar among the two populations. Similar significant associations between age, years of work, attitudes towards nursing diagnosis, intentions to make nursing diagnosis and behaviours in making nursing diagnosis and cluster membership in each population were identified. Belief profiles identified unique subsets of nurses that have distinct characteristics. Categorizing nurses by belief patterns may help administrators and educators to tailor interventions aimed at improving nursing diagnosis use in practice. © 2018 John Wiley & Sons Ltd.

  3. Cluster Analysis of Customer Reviews Extracted from Web Pages

    Directory of Open Access Journals (Sweden)

    S. Shivashankar

    2010-01-01

    Full Text Available As e-commerce is gaining popularity day by day, the web has become an excellent source for gathering customer reviews / opinions by the market researchers. The number of customer reviews that a product receives is growing at very fast rate (It could be in hundreds or thousands. Customer reviews posted on the websites vary greatly in quality. The potential customer has to read necessarily all the reviews irrespective of their quality to make a decision on whether to purchase the product or not. In this paper, we make an attempt to assess are view based on its quality, to help the customer make a proper buying decision. The quality of customer review is assessed as most significant, more significant, significant and insignificant.A novel and effective web mining technique is proposed for assessing a customer review of a particular product based on the feature clustering techniques, namely, k-means method and fuzzy c-means method. This is performed in three steps : (1Identify review regions and extract reviews from it, (2 Extract and cluster the features of reviews by a clustering technique and then assign weights to the features belonging to each of the clusters (groups and (3 Assess the review by considering the feature weights and group belongingness. The k-means and fuzzy c-means clustering techniques are implemented and tested on customer reviews extracted from web pages. Performance of these techniques are analyzed.

  4. Cytobank: providing an analytics platform for community cytometry data analysis and collaboration.

    Science.gov (United States)

    Chen, Tiffany J; Kotecha, Nikesh

    2014-01-01

    Cytometry is used extensively in clinical and laboratory settings to diagnose and track cell subsets in blood and tissue. High-throughput, single-cell approaches leveraging cytometry are developed and applied in the computational and systems biology communities by researchers, who seek to improve the diagnosis of human diseases, map the structures of cell signaling networks, and identify new cell types. Data analysis and management present a bottleneck in the flow of knowledge from bench to clinic. Multi-parameter flow and mass cytometry enable identification of signaling profiles of patient cell samples. Currently, this process is manual, requiring hours of work to summarize multi-dimensional data and translate these data for input into other analysis programs. In addition, the increase in the number and size of collaborative cytometry studies as well as the computational complexity of analytical tools require the ability to assemble sufficient and appropriately configured computing capacity on demand. There is a critical need for platforms that can be used by both clinical and basic researchers who routinely rely on cytometry. Recent advances provide a unique opportunity to facilitate collaboration and analysis and management of cytometry data. Specifically, advances in cloud computing and virtualization are enabling efficient use of large computing resources for analysis and backup. An example is Cytobank, a platform that allows researchers to annotate, analyze, and share results along with the underlying single-cell data.

  5. Identification and comparative analysis of the protocadherin cluster in a reptile, the green anole lizard.

    Directory of Open Access Journals (Sweden)

    Xiao-Juan Jiang

    Full Text Available BACKGROUND: The vertebrate protocadherins are a subfamily of cell adhesion molecules that are predominantly expressed in the nervous system and are believed to play an important role in establishing the complex neural network during animal development. Genes encoding these molecules are organized into a cluster in the genome. Comparative analysis of the protocadherin subcluster organization and gene arrangements in different vertebrates has provided interesting insights into the history of vertebrate genome evolution. Among tetrapods, protocadherin clusters have been fully characterized only in mammals. In this study, we report the identification and comparative analysis of the protocadherin cluster in a reptile, the green anole lizard (Anolis carolinensis. METHODOLOGY/PRINCIPAL FINDINGS: We show that the anole protocadherin cluster spans over a megabase and encodes a total of 71 genes. The number of genes in the anole protocadherin cluster is significantly higher than that in the coelacanth (49 genes and mammalian (54-59 genes clusters. The anole protocadherin genes are organized into four subclusters: the delta, alpha, beta and gamma. This subcluster organization is identical to that of the coelacanth protocadherin cluster, but differs from the mammalian clusters which lack the delta subcluster. The gene number expansion in the anole protocadherin cluster is largely due to the extensive gene duplication in the gammab subgroup. Similar to coelacanth and elephant shark protocadherin genes, the anole protocadherin genes have experienced a low frequency of gene conversion. CONCLUSIONS/SIGNIFICANCE: Our results suggest that similar to the protocadherin clusters in other vertebrates, the evolution of anole protocadherin cluster is driven mainly by lineage-specific gene duplications and degeneration. Our analysis also shows that loss of the protocadherin delta subcluster in the mammalian lineage occurred after the divergence of mammals and reptiles

  6. Clustering analysis of malware behavior using Self Organizing Map

    DEFF Research Database (Denmark)

    Pirscoveanu, Radu-Stefan; Stevanovic, Matija; Pedersen, Jens Myrup

    2016-01-01

    For the time being, malware behavioral classification is performed by means of Anti-Virus (AV) generated labels. The paper investigates the inconsistencies associated with current practices by evaluating the identified differences between current vendors. In this paper we rely on Self Organizing...... Map, an unsupervised machine learning algorithm, for generating clusters that capture the similarities between malware behavior. A data set of approximately 270,000 samples was used to generate the behavioral profile of malicious types in order to compare the outcome of the proposed clustering...... approach with the labels collected from 57 Antivirus vendors using VirusTotal. Upon evaluating the results, the paper concludes on shortcomings of relying on AV vendors for labeling malware samples. In order to solve the problem, a cluster-based classification is proposed, which should provide more...

  7. ValWorkBench: an open source Java library for cluster validation, with applications to microarray data analysis.

    Science.gov (United States)

    Giancarlo, R; Scaturro, D; Utro, F

    2015-02-01

    The prediction of the number of clusters in a dataset, in particular microarrays, is a fundamental task in biological data analysis, usually performed via validation measures. Unfortunately, it has received very little attention and in fact there is a growing need for software tools/libraries dedicated to it. Here we present ValWorkBench, a software library consisting of eleven well known validation measures, together with novel heuristic approximations for some of them. The main objective of this paper is to provide the interested researcher with the full software documentation of an open source cluster validation platform having the main features of being easily extendible in a homogeneous way and of offering software components that can be readily re-used. Consequently, the focus of the presentation is on the architecture of the library, since it provides an essential map that can be used to access the full software documentation, which is available at the supplementary material website [1]. The mentioned main features of ValWorkBench are also discussed and exemplified, with emphasis on software abstraction design and re-usability. A comparison with existing cluster validation software libraries, mainly in terms of the mentioned features, is also offered. It suggests that ValWorkBench is a much needed contribution to the microarray software development/algorithm engineering community. For completeness, it is important to mention that previous accurate algorithmic experimental analysis of the relative merits of each of the implemented measures [19,23,25], carried out specifically on microarray data, gives useful insights on the effectiveness of ValWorkBench for cluster validation to researchers in the microarray community interested in its use for the mentioned task. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.

  8. Marketing Mix Formulation for Higher Education: An Integrated Analysis Employing Analytic Hierarchy Process, Cluster Analysis and Correspondence Analysis

    Science.gov (United States)

    Ho, Hsuan-Fu; Hung, Chia-Chi

    2008-01-01

    Purpose: The purpose of this paper is to examine how a graduate institute at National Chiayi University (NCYU), by using a model that integrates analytic hierarchy process, cluster analysis and correspondence analysis, can develop effective marketing strategies. Design/methodology/approach: This is primarily a quantitative study aimed at…

  9. Influence of birth cohort on age of onset cluster analysis in bipolar I disorder

    DEFF Research Database (Denmark)

    Bauer, M; Glenn, T; Alda, M

    2015-01-01

    Purpose: Two common approaches to identify subgroups of patients with bipolar disorder are clustering methodology (mixture analysis) based on the age of onset, and a birth cohort analysis. This study investigates if a birth cohort effect will influence the results of clustering on the age of onset...... cohort. Model-based clustering (mixture analysis) was then performed on the age of onset data using the residuals. Clinical variables in subgroups were compared. Results: There was a strong birth cohort effect. Without adjusting for the birth cohort, three subgroups were found by clustering. After...... on the age of onset, and that there is a birth cohort effect. Including the birth cohort adjustment altered the number and characteristics of subgroups detected when clustering by age of onset. Further investigation is needed to determine if combining both approaches will identify subgroups that are more...

  10. Parkinson's Disease Subtypes Identified from Cluster Analysis of Motor and Non-motor Symptoms.

    Science.gov (United States)

    Mu, Jesse; Chaudhuri, Kallol R; Bielza, Concha; de Pedro-Cuesta, Jesus; Larrañaga, Pedro; Martinez-Martin, Pablo

    2017-01-01

    Parkinson's disease is now considered a complex, multi-peptide, central, and peripheral nervous system disorder with considerable clinical heterogeneity. Non-motor symptoms play a key role in the trajectory of Parkinson's disease, from prodromal premotor to end stages. To understand the clinical heterogeneity of Parkinson's disease, this study used cluster analysis to search for subtypes from a large, multi-center, international, and well-characterized cohort of Parkinson's disease patients across all motor stages, using a combination of cardinal motor features (bradykinesia, rigidity, tremor, axial signs) and, for the first time, specific validated rater-based non-motor symptom scales. Two independent international cohort studies were used: (a) the validation study of the Non-Motor Symptoms Scale ( n = 411) and (b) baseline data from the global Non-Motor International Longitudinal Study ( n = 540). k -means cluster analyses were performed on the non-motor and motor domains (domains clustering) and the 30 individual non-motor symptoms alone (symptoms clustering), and hierarchical agglomerative clustering was performed to group symptoms together. Four clusters are identified from the domains clustering supporting previous studies: mild, non-motor dominant, motor-dominant, and severe. In addition, six new smaller clusters are identified from the symptoms clustering, each characterized by clinically-relevant non-motor symptoms. The clusters identified in this study present statistical confirmation of the increasingly important role of non-motor symptoms (NMS) in Parkinson's disease heterogeneity and take steps toward subtype-specific treatment packages.

  11. iterClust: a statistical framework for iterative clustering analysis.

    Science.gov (United States)

    Ding, Hongxu; Wang, Wanxin; Califano, Andrea

    2018-03-22

    In a scenario where populations A, B1 and B2 (subpopulations of B) exist, pronounced differences between A and B may mask subtle differences between B1 and B2. Here we present iterClust, an iterative clustering framework, which can separate more pronounced differences (e.g. A and B) in starting iterations, followed by relatively subtle differences (e.g. B1 and B2), providing a comprehensive clustering trajectory. iterClust is implemented as a Bioconductor R package. andrea.califano@columbia.edu, hd2326@columbia.edu. Supplementary information is available at Bioinformatics online.

  12. Dynamic analysis of clustered building structures using substructures methods

    International Nuclear Information System (INIS)

    Leimbach, K.R.; Krutzik, N.J.

    1989-01-01

    The dynamic substructure approach to the building cluster on a common base mat starts with the generation of Ritz-vectors for each building on a rigid foundation. The base mat plus the foundation soil is subjected to kinematic constraint modes, for example constant, linear, quadratic or cubic constraints. These constraint modes are also imposed on the buildings. By enforcing kinematic compatibility of the complete structural system on the basis of the constraint modes a reduced Ritz model of the complete cluster is obtained. This reduced model can now be analyzed by modal time history or response spectrum methods

  13. Applying Clustering to Statistical Analysis of Student Reasoning about Two-Dimensional Kinematics

    Science.gov (United States)

    Springuel, R. Padraic; Wittman, Michael C.; Thompson, John R.

    2007-01-01

    We use clustering, an analysis method not presently common to the physics education research community, to group and characterize student responses to written questions about two-dimensional kinematics. Previously, clustering has been used to analyze multiple-choice data; we analyze free-response data that includes both sketches of vectors and…

  14. Differences Between Ward's and UPGMA Methods of Cluster Analysis: Implications for School Psychology.

    Science.gov (United States)

    Hale, Robert L.; Dougherty, Donna

    1988-01-01

    Compared the efficacy of two methods of cluster analysis, the unweighted pair-groups method using arithmetic averages (UPGMA) and Ward's method, for students grouped on intelligence, achievement, and social adjustment by both clustering methods. Found UPGMA more efficacious based on output, on cophenetic correlation coefficients generated by each…

  15. The use of a cluster analysis in across herd genetic evaluation for ...

    African Journals Online (AJOL)

    To investigate the possibility of a genotype x environment interaction in Bonsmara cattle, a cluster analysis was performed on weaning weight records of 72 811 Bonsmara calves, the progeny of 1 434 sires and 24 186 dams in 35 herds. The following environmental factors were used to classify herds into clusters: solution ...

  16. The reflection of hierarchical cluster analysis of co-occurrence matrices in SPSS

    NARCIS (Netherlands)

    Zhou, Q.; Leng, F.; Leydesdorff, L.

    2015-01-01

    Purpose: To discuss the problems arising from hierarchical cluster analysis of co-occurrence matrices in SPSS, and the corresponding solutions. Design/methodology/approach: We design different methods of using the SPSS hierarchical clustering module for co-occurrence matrices in order to compare

  17. Identifying At-Risk Students in General Chemistry via Cluster Analysis of Affective Characteristics

    Science.gov (United States)

    Chan, Julia Y. K.; Bauer, Christopher F.

    2014-01-01

    The purpose of this study is to identify academically at-risk students in first-semester general chemistry using affective characteristics via cluster analysis. Through the clustering of six preselected affective variables, three distinct affective groups were identified: low (at-risk), medium, and high. Students in the low affective group…

  18. Social Learning Network Analysis Model to Identify Learning Patterns Using Ontology Clustering Techniques and Meaningful Learning

    Science.gov (United States)

    Firdausiah Mansur, Andi Besse; Yusof, Norazah

    2013-01-01

    Clustering on Social Learning Network still not explored widely, especially when the network focuses on e-learning system. Any conventional methods are not really suitable for the e-learning data. SNA requires content analysis, which involves human intervention and need to be carried out manually. Some of the previous clustering techniques need…

  19. Symptom Cluster Research With Biomarkers and Genetics Using Latent Class Analysis.

    Science.gov (United States)

    Conley, Samantha

    2017-12-01

    The purpose of this article is to provide an overview of latent class analysis (LCA) and examples from symptom cluster research that includes biomarkers and genetics. A review of LCA with genetics and biomarkers was conducted using Medline, Embase, PubMed, and Google Scholar. LCA is a robust latent variable model used to cluster categorical data and allows for the determination of empirically determined symptom clusters. Researchers should consider using LCA to link empirically determined symptom clusters to biomarkers and genetics to better understand the underlying etiology of symptom clusters. The full potential of LCA in symptom cluster research has not yet been realized because it has been used in limited populations, and researchers have explored limited biologic pathways.

  20. Fiji: an open-source platform for biological-image analysis.

    Science.gov (United States)

    Schindelin, Johannes; Arganda-Carreras, Ignacio; Frise, Erwin; Kaynig, Verena; Longair, Mark; Pietzsch, Tobias; Preibisch, Stephan; Rueden, Curtis; Saalfeld, Stephan; Schmid, Benjamin; Tinevez, Jean-Yves; White, Daniel James; Hartenstein, Volker; Eliceiri, Kevin; Tomancak, Pavel; Cardona, Albert

    2012-06-28

    Fiji is a distribution of the popular open-source software ImageJ focused on biological-image analysis. Fiji uses modern software engineering practices to combine powerful software libraries with a broad range of scripting languages to enable rapid prototyping of image-processing algorithms. Fiji facilitates the transformation of new algorithms into ImageJ plugins that can be shared with end users through an integrated update system. We propose Fiji as a platform for productive collaboration between computer science and biology research communities.

  1. 3D FEM Analysis of a Pile-Supported Riverine Platform under Environmental Loads Incorporating Soil-Pile Interaction

    Directory of Open Access Journals (Sweden)

    Denise-Penelope N. Kontoni

    2018-01-01

    Full Text Available An existing riverine platform in Egypt, together with its pile group foundation, is analyzed under environmental loads using 3D FEM structural analysis software incorporating soil-pile interaction. The interaction between the transfer plate and the piles supporting the platform is investigated. Two connection conditions were studied assuming fixed or hinged connection between the piles and the reinforced concrete platform for the purpose of comparison of the structural behavior. The analysis showed that the fixed or hinged connection condition between the piles and the platform altered the values and distribution of displacements, normal force, bending moments, and shear forces along the length of each pile. The distribution of piles in the pile group affects the stress distribution on both the soil and platform. The piles were found to suffer from displacement failure rather than force failure. Moreover, the resulting bending stresses on the reinforced concrete plate in the case of a fixed connection between the piles and the platform were almost doubled and much higher than the allowable reinforced concrete stress and even exceeded the ultimate design strength and thus the environmental loads acting on a pile-supported riverine offshore platform may cause collapse if they are not properly considered in the structural analysis and design.

  2. Clusters of Insomnia Disorder: An Exploratory Cluster Analysis of Objective Sleep Parameters Reveals Differences in Neurocognitive Functioning, Quantitative EEG, and Heart Rate Variability.

    Science.gov (United States)

    Miller, Christopher B; Bartlett, Delwyn J; Mullins, Anna E; Dodds, Kirsty L; Gordon, Christopher J; Kyle, Simon D; Kim, Jong Won; D'Rozario, Angela L; Lee, Rico S C; Comas, Maria; Marshall, Nathaniel S; Yee, Brendon J; Espie, Colin A; Grunstein, Ronald R

    2016-11-01

    To empirically derive and evaluate potential clusters of Insomnia Disorder through cluster analysis from polysomnography (PSG). We hypothesized that clusters would differ on neurocognitive performance, sleep-onset measures of quantitative ( q )-EEG and heart rate variability (HRV). Research volunteers with Insomnia Disorder (DSM-5) completed a neurocognitive assessment and overnight PSG measures of total sleep time (TST), wake time after sleep onset (WASO), and sleep onset latency (SOL) were used to determine clusters. From 96 volunteers with Insomnia Disorder, cluster analysis derived at least two clusters from objective sleep parameters: Insomnia with normal objective sleep duration (I-NSD: n = 53) and Insomnia with short sleep duration (I-SSD: n = 43). At sleep onset, differences in HRV between I-NSD and I-SSD clusters suggest attenuated parasympathetic activity in I-SSD (P insomnia clusters derived from cluster analysis differ in sleep onset HRV. Preliminary data suggest evidence for three clusters in insomnia with differences for sustained attention and sleep-onset q -EEG. Insomnia 100 sleep study: Australia New Zealand Clinical Trials Registry (ANZCTR) identification number 12612000049875. URL: https://www.anzctr.org.au/Trial/Registration/TrialReview.aspx?id=347742. © 2016 Associated Professional Sleep Societies, LLC.

  3. The identification of credit card encoders by hierarchical cluster analysis of the jitters of magnetic stripes.

    Science.gov (United States)

    Leung, S C; Fung, W K; Wong, K H

    1999-01-01

    The relative bit density variation graphs of 207 specimen credit cards processed by 12 encoding machines were examined first visually, and then classified by means of hierarchical cluster analysis. Twenty-nine credit cards being treated as 'questioned' samples were tested by way of cluster analysis against 'controls' derived from known encoders. It was found that hierarchical cluster analysis provided a high accuracy of identification with all 29 'questioned' samples classified correctly. On the other hand, although visual comparison of jitter graphs was less discriminating, it was nevertheless capable of giving a reasonably accurate result.

  4. Clustering Analysis for Credit Default Probabilities in a Retail Bank Portfolio

    Directory of Open Access Journals (Sweden)

    Elena ANDREI (DRAGOMIR

    2012-08-01

    Full Text Available Methods underlying cluster analysis are very useful in data analysis, especially when the processed volume of data is very large, so that it becomes impossible to extract essential information, unless specific instruments are used to summarize and structure the gross information. In this context, cluster analysis techniques are used particularly, for systematic information analysis. The aim of this article is to build an useful model for banking field, based on data mining techniques, by dividing the groups of borrowers into clusters, in order to obtain a profile of the customers (debtors and good payers. We assume that a class is appropriate if it contains members that have a high degree of similarity and the standard method for measuring the similarity within a group shows the lowest variance. After clustering, data mining techniques are implemented on the cluster with bad debtors, reaching a very high accuracy after implementation. The paper is structured as follows: Section 2 describes the model for data analysis based on a specific scoring model that we proposed. In section 3, we present a cluster analysis using K-means algorithm and the DM models are applied on a specific cluster. Section 4 shows the conclusions.

  5. expVIP: a Customizable RNA-seq Data Analysis and Visualization Platform.

    Science.gov (United States)

    Borrill, Philippa; Ramirez-Gonzalez, Ricardo; Uauy, Cristobal

    2016-04-01

    The majority of transcriptome sequencing (RNA-seq) expression studies in plants remain underutilized and inaccessible due to the use of disparate transcriptome references and the lack of skills and resources to analyze and visualize these data. We have developed expVIP, an expression visualization and integration platform, which allows easy analysis of RNA-seq data combined with an intuitive and interactive interface. Users can analyze public and user-specified data sets with minimal bioinformatics knowledge using the expVIP virtual machine. This generates a custom Web browser to visualize, sort, and filter the RNA-seq data and provides outputs for differential gene expression analysis. We demonstrate expVIP's suitability for polyploid crops and evaluate its performance across a range of biologically relevant scenarios. To exemplify its use in crop research, we developed a flexible wheat (Triticum aestivum) expression browser (www.wheat-expression.com) that can be expanded with user-generated data in a local virtual machine environment. The open-access expVIP platform will facilitate the analysis of gene expression data from a wide variety of species by enabling the easy integration, visualization, and comparison of RNA-seq data across experiments. © 2016 American Society of Plant Biologists. All Rights Reserved.

  6. Profiling physical activity motivation based on self-determination theory: a cluster analysis approach.

    Science.gov (United States)

    Friederichs, Stijn Ah; Bolman, Catherine; Oenema, Anke; Lechner, Lilian

    2015-01-01

    In order to promote physical activity uptake and maintenance in individuals who do not comply with physical activity guidelines, it is important to increase our understanding of physical activity motivation among this group. The present study aimed to examine motivational profiles in a large sample of adults who do not comply with physical activity guidelines. The sample for this study consisted of 2473 individuals (31.4% male; age 44.6 ± 12.9). In order to generate motivational profiles based on motivational regulation, a cluster analysis was conducted. One-way analyses of variance were then used to compare the clusters in terms of demographics, physical activity level, motivation to be active and subjective experience while being active. Three motivational clusters were derived based on motivational regulation scores: a low motivation cluster, a controlled motivation cluster and an autonomous motivation cluster. These clusters differed significantly from each other with respect to physical activity behavior, motivation to be active and subjective experience while being active. Overall, the autonomous motivation cluster displayed more favorable characteristics compared to the other two clusters. The results of this study provide additional support for the importance of autonomous motivation in the context of physical activity behavior. The three derived clusters may be relevant in the context of physical activity interventions as individuals within the different clusters might benefit most from different intervention approaches. In addition, this study shows that cluster analysis is a useful method for differentiating between motivational profiles in large groups of individuals who do not comply with physical activity guidelines.

  7. A Dimensionality Reduction-Based Multi-Step Clustering Method for Robust Vessel Trajectory Analysis

    Directory of Open Access Journals (Sweden)

    Huanhuan Li

    2017-08-01

    Full Text Available The Shipboard Automatic Identification System (AIS is crucial for navigation safety and maritime surveillance, data mining and pattern analysis of AIS information have attracted considerable attention in terms of both basic research and practical applications. Clustering of spatio-temporal AIS trajectories can be used to identify abnormal patterns and mine customary route data for transportation safety. Thus, the capacities of navigation safety and maritime traffic monitoring could be enhanced correspondingly. However, trajectory clustering is often sensitive to undesirable outliers and is essentially more complex compared with traditional point clustering. To overcome this limitation, a multi-step trajectory clustering method is proposed in this paper for robust AIS trajectory clustering. In particular, the Dynamic Time Warping (DTW, a similarity measurement method, is introduced in the first step to measure the distances between different trajectories. The calculated distances, inversely proportional to the similarities, constitute a distance matrix in the second step. Furthermore, as a widely-used dimensional reduction method, Principal Component Analysis (PCA is exploited to decompose the obtained distance matrix. In particular, the top k principal components with above 95% accumulative contribution rate are extracted by PCA, and the number of the centers k is chosen. The k centers are found by the improved center automatically selection algorithm. In the last step, the improved center clustering algorithm with k clusters is implemented on the distance matrix to achieve the final AIS trajectory clustering results. In order to improve the accuracy of the proposed multi-step clustering algorithm, an automatic algorithm for choosing the k clusters is developed according to the similarity distance. Numerous experiments on realistic AIS trajectory datasets in the bridge area waterway and Mississippi River have been implemented to compare our

  8. Deconstructing Bipolar Disorder and Schizophrenia: A cross-diagnostic cluster analysis of cognitive phenotypes.

    Science.gov (United States)

    Lee, Junghee; Rizzo, Shemra; Altshuler, Lori; Glahn, David C; Miklowitz, David J; Sugar, Catherine A; Wynn, Jonathan K; Green, Michael F

    2017-02-01

    Bipolar disorder (BD) and schizophrenia (SZ) show substantial overlap. It has been suggested that a subgroup of patients might contribute to these overlapping features. This study employed a cross-diagnostic cluster analysis to identify subgroups of individuals with shared cognitive phenotypes. 143 participants (68 BD patients, 39 SZ patients and 36 healthy controls) completed a battery of EEG and performance assessments on perception, nonsocial cognition and social cognition. A K-means cluster analysis was conducted with all participants across diagnostic groups. Clinical symptoms, functional capacity, and functional outcome were assessed in patients. A two-cluster solution across 3 groups was the most stable. One cluster including 44 BD patients, 31 controls and 5 SZ patients showed better cognition (High cluster) than the other cluster with 24 BD patients, 35 SZ patients and 5 controls (Low cluster). BD patients in the High cluster performed better than BD patients in the Low cluster across cognitive domains. Within each cluster, participants with different clinical diagnoses showed different profiles across cognitive domains. All patients are in the chronic phase and out of mood episode at the time of assessment and most of the assessment were behavioral measures. This study identified two clusters with shared cognitive phenotype profiles that were not proxies for clinical diagnoses. The finding of better social cognitive performance of BD patients than SZ patients in the Lowe cluster suggest that relatively preserved social cognition may be important to identify disease process distinct to each disorder. Copyright © 2016 Elsevier B.V. All rights reserved.

  9. A Dimensionality Reduction-Based Multi-Step Clustering Method for Robust Vessel Trajectory Analysis.

    Science.gov (United States)

    Li, Huanhuan; Liu, Jingxian; Liu, Ryan Wen; Xiong, Naixue; Wu, Kefeng; Kim, Tai-Hoon

    2017-08-04

    The Shipboard Automatic Identification System (AIS) is crucial for navigation safety and maritime surveillance, data mining and pattern analysis of AIS information have attracted considerable attention in terms of both basic research and practical applications. Clustering of spatio-temporal AIS trajectories can be used to identify abnormal patterns and mine customary route data for transportation safety. Thus, the capacities of navigation safety and maritime traffic monitoring could be enhanced correspondingly. However, trajectory clustering is often sensitive to undesirable outliers and is essentially more complex compared with traditional point clustering. To overcome this limitation, a multi-step trajectory clustering method is proposed in this paper for robust AIS trajectory clustering. In particular, the Dynamic Time Warping (DTW), a similarity measurement method, is introduced in the first step to measure the distances between different trajectories. The calculated distances, inversely proportional to the similarities, constitute a distance matrix in the second step. Furthermore, as a widely-used dimensional reduction method, Principal Component Analysis (PCA) is exploited to decompose the obtained distance matrix. In particular, the top k principal components with above 95% accumulative contribution rate are extracted by PCA, and the number of the centers k is chosen. The k centers are found by the improved center automatically selection algorithm. In the last step, the improved center clustering algorithm with k clusters is implemented on the distance matrix to achieve the final AIS trajectory clustering results. In order to improve the accuracy of the proposed multi-step clustering algorithm, an automatic algorithm for choosing the k clusters is developed according to the similarity distance. Numerous experiments on realistic AIS trajectory datasets in the bridge area waterway and Mississippi River have been implemented to compare our proposed method with

  10. Molecular-dynamics analysis of mobile helium cluster reactions near surfaces of plasma-exposed tungsten

    Energy Technology Data Exchange (ETDEWEB)

    Hu, Lin; Maroudas, Dimitrios, E-mail: maroudas@ecs.umass.edu [Department of Chemical Engineering, University of Massachusetts, Amherst, Massachusetts 01003-9303 (United States); Hammond, Karl D. [Department of Chemical Engineering, University of Missouri, Columbia, Missouri 65211 (United States); Wirth, Brian D. [Department of Nuclear Engineering, University of Tennessee, Knoxville, Tennessee 37996 (United States)

    2015-10-28

    We report the results of a systematic atomic-scale analysis of the reactions of small mobile helium clusters (He{sub n}, 4 ≤ n ≤ 7) near low-Miller-index tungsten (W) surfaces, aiming at a fundamental understanding of the near-surface dynamics of helium-carrying species in plasma-exposed tungsten. These small mobile helium clusters are attracted to the surface and migrate to the surface by Fickian diffusion and drift due to the thermodynamic driving force for surface segregation. As the clusters migrate toward the surface, trap mutation (TM) and cluster dissociation reactions are activated at rates higher than in the bulk. TM produces W adatoms and immobile complexes of helium clusters surrounding W vacancies located within the lattice planes at a short distance from the surface. These reactions are identified and characterized in detail based on the analysis of a large number of molecular-dynamics trajectories for each such mobile cluster near W(100), W(110), and W(111) surfaces. TM is found to be the dominant cluster reaction for all cluster and surface combinations, except for the He{sub 4} and He{sub 5} clusters near W(100) where cluster partial dissociation following TM dominates. We find that there exists a critical cluster size, n = 4 near W(100) and W(111) and n = 5 near W(110), beyond which the formation of multiple W adatoms and vacancies in the TM reactions is observed. The identified cluster reactions are responsible for important structural, morphological, and compositional features in the plasma-exposed tungsten, including surface adatom populations, near-surface immobile helium-vacancy complexes, and retained helium content, which are expected to influence the amount of hydrogen re-cycling and tritium retention in fusion tokamaks.

  11. Crowd Analysis by Using Optical Flow and Density Based Clustering

    DEFF Research Database (Denmark)

    Santoro, Francesco; Pedro, Sergio; Tan, Zheng-Hua

    2010-01-01

    In this paper, we present a system to detect and track crowds in a video sequence captured by a camera. In a first step, we compute optical flows by means of pyramidal Lucas-Kanade feature tracking. Afterwards, a density based clustering is used to group similar vectors. In the last step...

  12. VETRA - offline analysis and monitoring software platform for the LHCb Vertex Locator

    International Nuclear Information System (INIS)

    Szumlak, Tomasz

    2010-01-01

    The LHCb experiment is dedicated to studying CP violation and rare decay phenomena. In order to achieve these physics goals precise tracking and vertexing around the interaction point is crucial. This is provided by the VELO (VErtex LOcator) silicon detector. After digitization, FPGAs are employed to run several algorithms to suppress noise and reconstruct clusters. This is performed by an FPGA based processing board. An off-line software project, VETRA, has been developed which performs a bit perfect emulation of this complex processing in the FPGAs. This is a novel development as this hardware emulation is not standalone but rather is fully integrated into the LHCb software to allow the reconstruction of full data from the detector. This software platform facilitates the development and understanding of the behaviour of the processing algorithms, the optimization of the parameters of the algorithms that will be loaded into the FPGA and monitoring of the detector performance. This framework has also been adopted by the Silicon Tracker detector of LHCb. This processing framework was successfully used with the first 1500 tracks of data in the VELO obtained from the first LHC beam in September 2008. The software architecture and utilisation of the VETRA project will be discussed in detail.

  13. Weighted Clustering

    DEFF Research Database (Denmark)

    Ackerman, Margareta; Ben-David, Shai; Branzei, Simina

    2012-01-01

    We investigate a natural generalization of the classical clustering problem, considering clustering tasks in which different instances may have different weights.We conduct the first extensive theoretical analysis on the influence of weighted data on standard clustering algorithms in both...... the partitional and hierarchical settings, characterizing the conditions under which algorithms react to weights. Extending a recent framework for clustering algorithm selection, we propose intuitive properties that would allow users to choose between clustering algorithms in the weighted setting and classify...

  14. Performance comparison analysis library communication cluster system using merge sort

    Science.gov (United States)

    Wulandari, D. A. R.; Ramadhan, M. E.

    2018-04-01

    Begins by using a single processor, to increase the speed of computing time, the use of multi-processor was introduced. The second paradigm is known as parallel computing, example cluster. The cluster must have the communication potocol for processing, one of it is message passing Interface (MPI). MPI have many library, both of them OPENMPI and MPICH2. Performance of the cluster machine depend on suitable between performance characters of library communication and characters of the problem so this study aims to analyze the comparative performances libraries in handling parallel computing process. The case study in this research are MPICH2 and OpenMPI. This case research execute sorting’s problem to know the performance of cluster system. The sorting problem use mergesort method. The research method is by implementing OpenMPI and MPICH2 on a Linux-based cluster by using five computer virtual then analyze the performance of the system by different scenario tests and three parameters for to know the performance of MPICH2 and OpenMPI. These performances are execution time, speedup and efficiency. The results of this study showed that the addition of each data size makes OpenMPI and MPICH2 have an average speed-up and efficiency tend to increase but at a large data size decreases. increased data size doesn’t necessarily increased speed up and efficiency but only execution time example in 100000 data size. OpenMPI has a execution time greater than MPICH2 example in 1000 data size average execution time with MPICH2 is 0,009721 and OpenMPI is 0,003895 OpenMPI can customize communication needs.

  15. Intensive time series data exploitation: the Multi-sensor Evolution Analysis (MEA) platform

    Science.gov (United States)

    Mantovani, Simone; Natali, Stefano; Folegani, Marco; Scremin, Alessandro

    2014-05-01

    The monitoring of the temporal evolution of natural phenomena must be performed in order to ensure their correct description and to allow improvements in modelling and forecast capabilities. This assumption, that is obvious for ground-based measurements, has not always been true for data collected through space-based platforms: except for geostationary satellites and sensors, that allow providing a very effective monitoring of phenomena with geometric scale from regional to global; smaller phenomena (with characteristic dimension lower than few kilometres) have been monitored with instruments that could collect data only with a time interval in the order of several days; bi-temporal techniques have been the most used ones for years, in order to characterise temporal changes and try identifying specific phenomena. The more the number of flying sensor has grown and their performance improved, the more their capability of monitoring natural phenomena at a smaller geographic scale has grown: we can now count on tenth of years of remotely sensed data, collected by hundreds of sensors that are now accessible from a wide users' community, and the techniques for data processing have to be adapted to move toward a data intensive exploitation. Starting from 2008, the European Space Agency has initiated the development of the Multi-sensor Evolution Analysis (MEA) platform (https://mea.eo.esa.int), whose first aim was to permit the access and exploitation of long term remotely sensed satellite data from different platforms: 15 years of global (A)ATSR data together with 5 years of regional AVNIR-2 data were loaded into the system and were used, through a web-based graphic user interface, for land cover change analysis. The MEA data availability has grown during years integrating multi-disciplinary data that feature spatial and temporal dimensions: so far tenths of Terabytes of data in the land and atmosphere domains are available and can be visualized and exploited, keeping the

  16. Phenotypes of asthma in low-income children and adolescents: cluster analysis

    Directory of Open Access Journals (Sweden)

    Anna Lucia Barros Cabral

    Full Text Available ABSTRACT Objective: Studies characterizing asthma phenotypes have predominantly included adults or have involved children and adolescents in developed countries. Therefore, their applicability in other populations, such as those of developing countries, remains indeterminate. Our objective was to determine how low-income children and adolescents with asthma in Brazil are distributed across a cluster analysis. Methods: We included 306 children and adolescents (6-18 years of age with a clinical diagnosis of asthma and under medical treatment for at least one year of follow-up. At enrollment, all the patients were clinically stable. For the cluster analysis, we selected 20 variables commonly measured in clinical practice and considered important in defining asthma phenotypes. Variables with high multicollinearity were excluded. A cluster analysis was applied using a twostep agglomerative test and log-likelihood distance measure. Results: Three clusters were defined for our population. Cluster 1 (n = 94 included subjects with normal pulmonary function, mild eosinophil inflammation, few exacerbations, later age at asthma onset, and mild atopy. Cluster 2 (n = 87 included those with normal pulmonary function, a moderate number of exacerbations, early age at asthma onset, more severe eosinophil inflammation, and moderate atopy. Cluster 3 (n = 108 included those with poor pulmonary function, frequent exacerbations, severe eosinophil inflammation, and severe atopy. Conclusions: Asthma was characterized by the presence of atopy, number of exacerbations, and lung function in low-income children and adolescents in Brazil. The many similarities with previous cluster analyses of phenotypes indicate that this approach shows good generalizability.

  17. CPSS: a computational platform for the analysis of small RNA deep sequencing data.

    Science.gov (United States)

    Zhang, Yuanwei; Xu, Bo; Yang, Yifan; Ban, Rongjun; Zhang, Huan; Jiang, Xiaohua; Cooke, Howard J; Xue, Yu; Shi, Qinghua

    2012-07-15

    Next generation sequencing (NGS) techniques have been widely used to document the small ribonucleic acids (RNAs) implicated in a variety of biological, physiological and pathological processes. An integrated computational tool is needed for handling and analysing the enormous datasets from small RNA deep sequencing approach. Herein, we present a novel web server, CPSS (a computational platform for the analysis of small RNA deep sequencing data), designed to completely annotate and functionally analyse microRNAs (miRNAs) from NGS data on one platform with a single data submission. Small RNA NGS data can be submitted to this server with analysis results being returned in two parts: (i) annotation analysis, which provides the most comprehensive analysis for small RNA transcriptome, including length distribution and genome mapping of sequencing reads, small RNA quantification, prediction of novel miRNAs, identification of differentially expressed miRNAs, piwi-interacting RNAs and other non-coding small RNAs between paired samples and detection of miRNA editing and modifications and (ii) functional analysis, including prediction of miRNA targeted genes by multiple tools, enrichment of gene ontology terms, signalling pathway involvement and protein-protein interaction analysis for the predicted genes. CPSS, a ready-to-use web server that integrates most functions of currently available bioinformatics tools, provides all the information wanted by the majority of users from small RNA deep sequencing datasets. CPSS is implemented in PHP/PERL+MySQL+R and can be freely accessed at http://mcg.ustc.edu.cn/db/cpss/index.html or http://mcg.ustc.edu.cn/sdap1/cpss/index.html.

  18. Measurement Model and Precision Analysis of Accelerometers for Maglev Vibration Isolation Platforms

    Directory of Open Access Journals (Sweden)

    Qianqian Wu

    2015-08-01

    Full Text Available High precision measurement of acceleration levels is required to allow active control for vibration isolation platforms. It is necessary to propose an accelerometer configuration measurement model that yields such a high measuring precision. In this paper, an accelerometer configuration to improve measurement accuracy is proposed. The corresponding calculation formulas of the angular acceleration were derived through theoretical analysis. A method is presented to minimize angular acceleration noise based on analysis of the root mean square noise of the angular acceleration. Moreover, the influence of installation position errors and accelerometer orientation errors on the calculation precision of the angular acceleration is studied. Comparisons of the output differences between the proposed configuration and the previous planar triangle configuration under the same installation errors are conducted by simulation. The simulation results show that installation errors have a relatively small impact on the calculation accuracy of the proposed configuration. To further verify the high calculation precision of the proposed configuration, experiments are carried out for both the proposed configuration and the planar triangle configuration. On the basis of the results of simulations and experiments, it can be concluded that the proposed configuration has higher angular acceleration calculation precision and can be applied to different platforms.

  19. Measurement Model and Precision Analysis of Accelerometers for Maglev Vibration Isolation Platforms.

    Science.gov (United States)

    Wu, Qianqian; Yue, Honghao; Liu, Rongqiang; Zhang, Xiaoyou; Ding, Liang; Liang, Tian; Deng, Zongquan

    2015-08-14

    High precision measurement of acceleration levels is required to allow active control for vibration isolation platforms. It is necessary to propose an accelerometer configuration measurement model that yields such a high measuring precision. In this paper, an accelerometer configuration to improve measurement accuracy is proposed. The corresponding calculation formulas of the angular acceleration were derived through theoretical analysis. A method is presented to minimize angular acceleration noise based on analysis of the root mean square noise of the angular acceleration. Moreover, the influence of installation position errors and accelerometer orientation errors on the calculation precision of the angular acceleration is studied. Comparisons of the output differences between the proposed configuration and the previous planar triangle configuration under the same installation errors are conducted by simulation. The simulation results show that installation errors have a relatively small impact on the calculation accuracy of the proposed configuration. To further verify the high calculation precision of the proposed configuration, experiments are carried out for both the proposed configuration and the planar triangle configuration. On the basis of the results of simulations and experiments, it can be concluded that the proposed configuration has higher angular acceleration calculation precision and can be applied to different platforms.

  20. Patterns of Brucellosis Infection Symptoms in Azerbaijan: A Latent Class Cluster Analysis

    Directory of Open Access Journals (Sweden)

    Rita Ismayilova

    2014-01-01

    Full Text Available Brucellosis infection is a multisystem disease, with a broad spectrum of symptoms. We investigated the existence of clusters of infected patients according to their clinical presentation. Using national surveillance data from the Electronic-Integrated Disease Surveillance System, we applied a latent class cluster (LCC analysis on symptoms to determine clusters of brucellosis cases. A total of 454 cases reported between July 2011 and July 2013 were analyzed. LCC identified a two-cluster model and the Vuong-Lo-Mendell-Rubin likelihood ratio supported the cluster model. Brucellosis cases in the second cluster (19% reported higher percentages of poly-lymphadenopathy, hepatomegaly, arthritis, myositis, and neuritis and changes in liver function tests compared to cases of the first cluster. Patients in the second cluster had a severe brucellosis disease course and were associated with longer delay in seeking medical attention. Moreover, most of them were from Beylagan, a region focused on sheep and goat livestock production in south-central Azerbaijan. Patients in cluster 2 accounted for one-quarter of brucellosis cases and had a more severe clinical presentation. Delay in seeking medical care may explain severe illness. Future work needs to determine the factors that influence brucellosis case seeking and identify brucellosis species, particularly among cases from Beylagan.

  1. Functional Analysis of OMICs Data and Small Molecule Compounds in an Integrated "Knowledge-Based" Platform.

    Science.gov (United States)

    Dubovenko, Alexey; Nikolsky, Yuri; Rakhmatulin, Eugene; Nikolskaya, Tatiana

    2017-01-01

    Analysis of NGS and other sequencing data, gene variants, gene expression, proteomics, and other high-throughput (OMICs) data is challenging because of its biological complexity and high level of technical and biological noise. One way to deal with both problems is to perform analysis with a high fidelity annotated knowledgebase of protein interactions, pathways, and functional ontologies. This knowledgebase has to be structured in a computer-readable format and must include software tools for managing experimental data, analysis, and reporting. Here, we present MetaCore™ and Key Pathway Advisor (KPA), an integrated platform for functional data analysis. On the content side, MetaCore and KPA encompass a comprehensive database of molecular interactions of different types, pathways, network models, and ten functional ontologies covering human, mouse, and rat genes. The analytical toolkit includes tools for gene/protein list enrichment analysis, statistical "interactome" tool for the identification of over- and under-connected proteins in the dataset, and a biological network analysis module made up of network generation algorithms and filters. The suite also features Advanced Search, an application for combinatorial search of the database content, as well as a Java-based tool called Pathway Map Creator for drawing and editing custom pathway maps. Applications of MetaCore and KPA include molecular mode of action of disease research, identification of potential biomarkers and drug targets, pathway hypothesis generation, analysis of biological effects for novel small molecule compounds and clinical applications (analysis of large cohorts of patients, and translational and personalized medicine).

  2. Clusters of Insomnia Disorder: An Exploratory Cluster Analysis of Objective Sleep Parameters Reveals Differences in Neurocognitive Functioning, Quantitative EEG, and Heart Rate Variability

    Science.gov (United States)

    Miller, Christopher B.; Bartlett, Delwyn J.; Mullins, Anna E.; Dodds, Kirsty L.; Gordon, Christopher J.; Kyle, Simon D.; Kim, Jong Won; D'Rozario, Angela L.; Lee, Rico S.C.; Comas, Maria; Marshall, Nathaniel S.; Yee, Brendon J.; Espie, Colin A.; Grunstein, Ronald R.

    2016-01-01

    Study Objectives: To empirically derive and evaluate potential clusters of Insomnia Disorder through cluster analysis from polysomnography (PSG). We hypothesized that clusters would differ on neurocognitive performance, sleep-onset measures of quantitative (q)-EEG and heart rate variability (HRV). Methods: Research volunteers with Insomnia Disorder (DSM-5) completed a neurocognitive assessment and overnight PSG measures of total sleep time (TST), wake time after sleep onset (WASO), and sleep onset latency (SOL) were used to determine clusters. Results: From 96 volunteers with Insomnia Disorder, cluster analysis derived at least two clusters from objective sleep parameters: Insomnia with normal objective sleep duration (I-NSD: n = 53) and Insomnia with short sleep duration (I-SSD: n = 43). At sleep onset, differences in HRV between I-NSD and I-SSD clusters suggest attenuated parasympathetic activity in I-SSD (P insomnia clusters derived from cluster analysis differ in sleep onset HRV. Preliminary data suggest evidence for three clusters in insomnia with differences for sustained attention and sleep-onset q-EEG. Clinical Trial Registration: Insomnia 100 sleep study: Australia New Zealand Clinical Trials Registry (ANZCTR) identification number 12612000049875. URL: https://www.anzctr.org.au/Trial/Registration/TrialReview.aspx?id=347742. Citation: Miller CB, Bartlett DJ, Mullins AE, Dodds KL, Gordon CJ, Kyle SD, Kim JW, D'Rozario AL, Lee RS, Comas M, Marshall NS, Yee BJ, Espie CA, Grunstein RR. Clusters of Insomnia Disorder: an exploratory cluster analysis of objective sleep parameters reveals differences in neurocognitive functioning, quantitative EEG, and heart rate variability. SLEEP 2016;39(11):1993–2004. PMID:27568796

  3. Cluster analysis and ecology of living benthonic foraminiferids from inner shelf off Ratnagiri, West Coast, India

    Digital Repository Service at National Institute of Oceanography (India)

    Nigam, R.; Sarupria, J.S.

    Q-mode cluster analysis explains the spatial distribution data of living benthonic foraminiferids from the inner shelf off Ratnagiri. Two main biotopes and two sub-biotopes are revognised within the study area; biotope A, characterised by @i...

  4. Statistical Techniques Applied to Aerial Radiometric Surveys (STAARS): cluster analysis. National Uranium Resource Evaluation

    International Nuclear Information System (INIS)

    Pirkle, F.L.; Stablein, N.K.; Howell, J.A.; Wecksung, G.W.; Duran, B.S.

    1982-11-01

    One objective of the aerial radiometric surveys flown as part of the US Department of Energy's National Uranium Resource Evaluation (NURE) program was to ascertain the regional distribution of near-surface radioelement abundances. Some method for identifying groups of observations with similar radioelement values was therefore required. It is shown in this report that cluster analysis can identify such groups even when no a priori knowledge of the geology of an area exists. A method of convergent k-means cluster analysis coupled with a hierarchical cluster analysis is used to classify 6991 observations (three radiometric variables at each observation location) from the Precambrian rocks of the Copper Mountain, Wyoming, area. Another method, one that combines a principal components analysis with a convergent k-means analysis, is applied to the same data. These two methods are compared with a convergent k-means analysis that utilizes available geologic knowledge. All three methods identify four clusters. Three of the clusters represent background values for the Precambrian rocks of the area, and one represents outliers (anomalously high 214 Bi). A segmentation of the data corresponding to geologic reality as discovered by other methods has been achieved based solely on analysis of aerial radiometric data. The techniques employed are composites of classical clustering methods designed to handle the special problems presented by large data sets. 20 figures, 7 tables

  5. Analysis of candidates for interacting galaxy clusters. I. A1204 and A2029/A2033

    Science.gov (United States)

    Gonzalez, Elizabeth Johana; de los Rios, Martín; Oio, Gabriel A.; Lang, Daniel Hernández; Tagliaferro, Tania Aguirre; Domínguez R., Mariano J.; Castellón, José Luis Nilo; Cuevas L., Héctor; Valotto, Carlos A.

    2018-04-01

    Context. Merging galaxy clusters allow for the study of different mass components, dark and baryonic, separately. Also, their occurrence enables to test the ΛCDM scenario, which can be used to put constraints on the self-interacting cross-section of the dark-matter particle. Aim. It is necessary to perform a homogeneous analysis of these systems. Hence, based on a recently presented sample of candidates for interacting galaxy clusters, we present the analysis of two of these cataloged systems. Methods: In this work, the first of a series devoted to characterizing galaxy clusters in merger processes, we perform a weak lensing analysis of clusters A1204 and A2029/A2033 to derive the total masses of each identified interacting structure together with a dynamical study based on a two-body model. We also describe the gas and the mass distributions in the field through a lensing and an X-ray analysis. This is the first of a series of works which will analyze these type of system in order to characterize them. Results: Neither merging cluster candidate shows evidence of having had a recent merger event. Nevertheless, there is dynamical evidence that these systems could be interacting or could interact in the future. Conclusions: It is necessary to include more constraints in order to improve the methodology of classifying merging galaxy clusters. Characterization of these clusters is important in order to properly understand the nature of these systems and their connection with dynamical studies.

  6. Subtypes of autism by cluster analysis based on structural MRI data.

    Science.gov (United States)

    Hrdlicka, Michal; Dudova, Iva; Beranova, Irena; Lisy, Jiri; Belsan, Tomas; Neuwirth, Jiri; Komarek, Vladimir; Faladova, Ludvika; Havlovicova, Marketa; Sedlacek, Zdenek; Blatny, Marek; Urbanek, Tomas

    2005-05-01

    The aim of our study was to subcategorize Autistic Spectrum Disorders (ASD) using a multidisciplinary approach. Sixty four autistic patients (mean age 9.4+/-5.6 years) were entered into a cluster analysis. The clustering analysis was based on MRI data. The clusters obtained did not differ significantly in the overall severity of autistic symptomatology as measured by the total score on the Childhood Autism Rating Scale (CARS). The clusters could be characterized as showing significant differences: Cluster 1: showed the largest sizes of the genu and splenium of the corpus callosum (CC), the lowest pregnancy order and the lowest frequency of facial dysmorphic features. Cluster 2: showed the largest sizes of the amygdala and hippocampus (HPC), the least abnormal visual response on the CARS, the lowest frequency of epilepsy and the least frequent abnormal psychomotor development during the first year of life. Cluster 3: showed the largest sizes of the caput of the nucleus caudatus (NC), the smallest sizes of the HPC and facial dysmorphic features were always present. Cluster 4: showed the smallest sizes of the genu and splenium of the CC, as well as the amygdala, and caput of the NC, the most abnormal visual response on the CARS, the highest frequency of epilepsy, the highest pregnancy order, abnormal psychomotor development during the first year of life was always present and facial dysmorphic features were always present. This multidisciplinary approach seems to be a promising method for subtyping autism.

  7. Schedulability Analysis and Optimization for the Synthesis of Multi-Cluster Distributed Embedded Systems

    DEFF Research Database (Denmark)

    Pop, Paul; Eles, Petru; Peng, Zebo

    2003-01-01

    We present an approach to schedulability analysis for the synthesis of multi-cluster distributed embedded systems consisting of time-triggered and event-triggered clusters, interconnected via gateways. We have also proposed a buffer size and worst case queuing delay analysis for the gateways......, responsible for routing inter-cluster traffic. Optimization heuristics for the priority assignment and synthesis of bus access parameters aimed at producing a schedulable system with minimal buffer needs have been proposed. Extensive experiments and a real-life example show the efficiency of our approaches....

  8. Schedulability Analysis and Optimization for the Synthesis of Multi-Cluster Distributed Embedded Systems

    DEFF Research Database (Denmark)

    Pop, Paul; Eles, Petru; Peng, Zebo

    2003-01-01

    An approach to schedulability analysis for the synthesis of multi-cluster distributed embedded systems consisting of time-triggered and event-triggered clusters, interconnected via gateways, is presented. A buffer size and worst case queuing delay analysis for the gateways, responsible for routing...... inter-cluster traffic, is also proposed. Optimisation heuristics for the priority assignment and synthesis of bus access parameters aimed at producing a schedulable system with minimal buffer needs have been proposed. Extensive experiments and a real-life example show the efficiency of the approaches....

  9. Clustering applications in financial and economic analysis of the crop production in the Russian regions

    Directory of Open Access Journals (Sweden)

    Gromov Vladislav Vladimirovich

    2013-08-01

    Full Text Available We used the complex mathematical modeling, multivariate statistical-analysis, fuzzy sets to analyze the financial and economic state of the crop production in Russian regions. We developed a system of indicators, detecting the state agricultural sector in the region, based on the results of correlation, factor, cluster analysis and statistics of the Federal State Statistics Service. We performed clustering analyses to divide regions of Russia on selected factors into five groups. A qualitative and quantitative characteristics of each cluster was received.

  10. Multi-scale analysis of nuclear reactor thermal-hydraulics-first applications using the NEPTUNE platform

    International Nuclear Information System (INIS)

    Guelfi, A.; Boucker, M.; Mimouni, S.; Bestion, D.; Boudier, P.

    2005-01-01

    The NEPTUNE project aims at building a new two-phase flow thermal-hydraulics platform for nuclear reactor simulation. EDF (Electricite de France) and CEA (Commissariat a l'Energie Atomique) with the co-sponsorship of IRSN (Institut de Radioprotection et Surete Nucleaire) and FRAMATOME-ANP, are jointly developing the NEPTUNE multi-scale platform that includes new physical models and numerical methods for each of the computing scales. One usually distinguishes three different scales for industrial simulations: the 'system' scale, the 'component' scale (subchannel analysis) and CFD (Computational Fluid Dynamics). In addition DNS (Direct Numerical Simulation) can provide information at a smaller scale that can be useful for the development of the averaged scales. The NEPTUNE project also includes work on software architecture and research on new numerical methods for coupling codes since both are required to improve industrial calculations. All these R and D challenges have been defined in order to meet industrial needs and the underlying stakes (mainly the competitiveness and the safety of Nuclear Power Plants). This paper focuses on three high priority needs: DNB (Departure from Nucleate Boiling) prediction, directly linked to fuel performance; PTS (Pressurized Thermal Shock), a key issue when studying the lifespan of critical components and LBLOCA (Large Break Loss of Coolant Accident), a reference accident for safety studies. For each of these industrial applications, we provide a review of the last developments within the NEPTUNE platform and we present the first results. A particular attention is also given to physical validation and the needs for further experimental data. (authors)

  11. FLOCK cluster analysis of mast cell event clustering by high-sensitivity flow cytometry predicts systemic mastocytosis.

    Science.gov (United States)

    Dorfman, David M; LaPlante, Charlotte D; Pozdnyakova, Olga; Li, Betty

    2015-11-01

    In our high-sensitivity flow cytometric approach for systemic mastocytosis (SM), we identified mast cell event clustering as a new diagnostic criterion for the disease. To objectively characterize mast cell gated event distributions, we performed cluster analysis using FLOCK, a computational approach to identify cell subsets in multidimensional flow cytometry data in an unbiased, automated fashion. FLOCK identified discrete mast cell populations in most cases of SM (56/75 [75%]) but only a minority of non-SM cases (17/124 [14%]). FLOCK-identified mast cell populations accounted for 2.46% of total cells on average in SM cases and 0.09% of total cells on average in non-SM cases (P < .0001) and were predictive of SM, with a sensitivity of 75%, a specificity of 86%, a positive predictive value of 76%, and a negative predictive value of 85%. FLOCK analysis provides useful diagnostic information for evaluating patients with suspected SM, and may be useful for the analysis of other hematopoietic neoplasms. Copyright© by the American Society for Clinical Pathology.

  12. SU-E-J-98: Radiogenomics: Correspondence Between Imaging and Genetic Features Based On Clustering Analysis

    International Nuclear Information System (INIS)

    Harmon, S; Wendelberger, B; Jeraj, R

    2014-01-01

    Purpose: Radiogenomics aims to establish relationships between patient genotypes and imaging phenotypes. An open question remains on how best to integrate information from these distinct datasets. This work investigates if similarities in genetic features across patients correspond to similarities in PET-imaging features, assessed with various clustering algorithms. Methods: [ 18 F]FDG PET data was obtained for 26 NSCLC patients from a public database (TCIA). Tumors were contoured using an in-house segmentation algorithm combining gradient and region-growing techniques; resulting ROIs were used to extract 54 PET-based features. Corresponding genetic microarray data containing 48,778 elements were also obtained for each tumor. Given mismatch in feature sizes, two dimension reduction techniques were also applied to the genetic data: principle component analysis (PCA) and selective filtering of 25 NSCLC-associated genes-ofinterest (GOI). Gene datasets (full, PCA, and GOI) and PET feature datasets were independently clustered using K-means and hierarchical clustering using variable number of clusters (K). Jaccard Index (JI) was used to score similarity of cluster assignments across different datasets. Results: Patient clusters from imaging data showed poor similarity to clusters from gene datasets, regardless of clustering algorithms or number of clusters (JI mean = 0.3429±0.1623). Notably, we found clustering algorithms had different sensitivities to data reduction techniques. Using hierarchical clustering, the PCA dataset showed perfect cluster agreement to the full-gene set (JI =1) for all values of K, and the agreement between the GOI set and the full-gene set decreased as number of clusters increased (JI=0.9231 and 0.5769 for K=2 and 5, respectively). K-means clustering assignments were highly sensitive to data reduction and showed poor stability for different values of K (JI range : 0.2301–1). Conclusion: Using commonly-used clustering algorithms, we found

  13. SU-E-J-98: Radiogenomics: Correspondence Between Imaging and Genetic Features Based On Clustering Analysis

    Energy Technology Data Exchange (ETDEWEB)

    Harmon, S; Wendelberger, B [University of Wisconsin-Madison, Madison, WI (United States); Jeraj, R [University of Wisconsin-Madison, Madison, WI (United States); University of Ljubljana (Slovenia)

    2014-06-01

    Purpose: Radiogenomics aims to establish relationships between patient genotypes and imaging phenotypes. An open question remains on how best to integrate information from these distinct datasets. This work investigates if similarities in genetic features across patients correspond to similarities in PET-imaging features, assessed with various clustering algorithms. Methods: [{sup 18}F]FDG PET data was obtained for 26 NSCLC patients from a public database (TCIA). Tumors were contoured using an in-house segmentation algorithm combining gradient and region-growing techniques; resulting ROIs were used to extract 54 PET-based features. Corresponding genetic microarray data containing 48,778 elements were also obtained for each tumor. Given mismatch in feature sizes, two dimension reduction techniques were also applied to the genetic data: principle component analysis (PCA) and selective filtering of 25 NSCLC-associated genes-ofinterest (GOI). Gene datasets (full, PCA, and GOI) and PET feature datasets were independently clustered using K-means and hierarchical clustering using variable number of clusters (K). Jaccard Index (JI) was used to score similarity of cluster assignments across different datasets. Results: Patient clusters from imaging data showed poor similarity to clusters from gene datasets, regardless of clustering algorithms or number of clusters (JI{sub mean}= 0.3429±0.1623). Notably, we found clustering algorithms had different sensitivities to data reduction techniques. Using hierarchical clustering, the PCA dataset showed perfect cluster agreement to the full-gene set (JI =1) for all values of K, and the agreement between the GOI set and the full-gene set decreased as number of clusters increased (JI=0.9231 and 0.5769 for K=2 and 5, respectively). K-means clustering assignments were highly sensitive to data reduction and showed poor stability for different values of K (JI{sub range}: 0.2301–1). Conclusion: Using commonly-used clustering algorithms

  14. Design, Analysis, Hybrid Testing and Orientation Control of a Floating Platform with Counter-Rotating Vertical-Axis Wind Turbines

    Science.gov (United States)

    Kanner, Samuel Adam Chinman

    The design and operation of two counter-rotating vertical-axis wind turbines on a floating, semi-submersible platform is studied. The technology, called the Multiple Integrated and Synchronized Turbines (MIST) platform has the potential to reduce the cost of offshore wind energy per unit of installed capacity. Attached to the platform are closely-spaced, counter-rotating turbines, which can achieve a higher power density per planform area because of synergistic interaction effects. The purpose of the research is to control the orientation of the platform and rotational speeds of the turbines by modifying the energy absorbed by each of the generators of the turbines. To analyze the various aspects of the platform and wind turbines, the analysis is drawn from the fields of hydrodynamics, electromagnetics, aerodynamics and control theory. To study the hydrodynamics of the floating platform in incident monochromatic waves, potential theory is utilized, taking into account the slow-drift yaw motion of the platform. Steady, second-order moments that are spatially dependent (i.e., dependent on the platform's yaw orientation relative to the incident waves) are given special attention since there are no natural restoring yaw moment. The aerodynamics of the counter-rotating turbines are studied in collaboration with researchers at the UC Berkeley Mathematics Department using a high-order, implicit, large-eddy simulation. An element flipping technique is utilized to extend the method to a domain with counter-rotating turbines and the effects from the closely-spaced turbines is compared with existing experimental data. Hybrid testing techniques on a model platform are utilized to prove the controllability of the platform in lieu of a wind-wave tank. A 1:82 model-scale floating platform is fabricated and tested at the UC Berkeley Physical-Model Testing Facility. The vertical-axis wind turbines are simulated by spinning, controllable actuators that can be updated in real-time of

  15. Integrating Data Clustering and Visualization for the Analysis of 3D Gene Expression Data

    Energy Technology Data Exchange (ETDEWEB)

    Data Analysis and Visualization (IDAV) and the Department of Computer Science, University of California, Davis, One Shields Avenue, Davis CA 95616, USA,; nternational Research Training Group ``Visualization of Large and Unstructured Data Sets,' ' University of Kaiserslautern, Germany; Computational Research Division, Lawrence Berkeley National Laboratory, One Cyclotron Road, Berkeley, CA 94720, USA; Genomics Division, Lawrence Berkeley National Laboratory, One Cyclotron Road, Berkeley CA 94720, USA; Life Sciences Division, Lawrence Berkeley National Laboratory, One Cyclotron Road, Berkeley CA 94720, USA,; Computer Science Division,University of California, Berkeley, CA, USA,; Computer Science Department, University of California, Irvine, CA, USA,; All authors are with the Berkeley Drosophila Transcription Network Project, Lawrence Berkeley National Laboratory,; Rubel, Oliver; Weber, Gunther H.; Huang, Min-Yu; Bethel, E. Wes; Biggin, Mark D.; Fowlkes, Charless C.; Hendriks, Cris L. Luengo; Keranen, Soile V. E.; Eisen, Michael B.; Knowles, David W.; Malik, Jitendra; Hagen, Hans; Hamann, Bernd

    2008-05-12

    The recent development of methods for extracting precise measurements of spatial gene expression patterns from three-dimensional (3D) image data opens the way for new analyses of the complex gene regulatory networks controlling animal development. We present an integrated visualization and analysis framework that supports user-guided data clustering to aid exploration of these new complex datasets. The interplay of data visualization and clustering-based data classification leads to improved visualization and enables a more detailed analysis than previously possible. We discuss (i) integration of data clustering and visualization into one framework; (ii) application of data clustering to 3D gene expression data; (iii) evaluation of the number of clusters k in the context of 3D gene expression clustering; and (iv) improvement of overall analysis quality via dedicated post-processing of clustering results based on visualization. We discuss the use of this framework to objectively define spatial pattern boundaries and temporal profiles of genes and to analyze how mRNA patterns are controlled by their regulatory transcription factors.

  16. Tunable Platform Tolerant Antenna Design for RFID and IoT Applications Using Characteristic Mode Analysis

    Directory of Open Access Journals (Sweden)

    Abubakar Sharif

    2018-01-01

    Full Text Available Radio frequency identification (RFID is a key technology to realize IoT (Internet of Things dreams. RFID technology has been emerging in sensing, identification, tracking, and localization of goods. In order to tag a huge number of things, it is cost-effective to use one RFID antenna for tagging different things. Therefore, in this paper a platform tolerant RFID tag antenna with tunable capability is proposed. The proposed tag antenna is designed and optimized using characteristic mode analysis (CMA. Moreover, this tag antenna consists of a folded patch wrapped around FR 4 substrate and a feeding loop element printed on a paper substrate. The inductive feeding loop is stacked over folded patch and it provides impedance match with RFID chip. Because of separate radiating and feeding element, this tag antenna has a versatility of impedance matching with any RFID chip. Furthermore, this tag is able to cover American RFID band (902–928 MHz and can be tuned to European RFID band (865–868 MHz by adding tunable strips. In order to demonstrate platform tolerant operation, the read range of RFID tag is measured by mounting it on different materials. The maximum read range of RFID tag is 4.5 m in free space or on dielectrics and 6.5 m above 200 × 200 mm2 metal plate, respectively.

  17. A cluster randomized trial to determine the effectiveness of a novel, digital pendant and voice reminder platform on increasing infant immunization adherence in rural Udaipur, India.

    Science.gov (United States)

    Nagar, Ruchit; Venkat, Preethi; Stone, Logan D; Engel, Kyle A; Sadda, Praneeth; Shahnawaz, Mohammed

    2017-11-18

    Five hundred thousand children under the age of 5 die from vaccine preventable diseases in India every year. More than just improving coverage, increasing timeliness of immunizations is critical to ensuring infant health in the first year of life. Novel, culturally appropriate community engagement strategies are worth exploring to close the immunization gap. In our study, a digital NFC (Near Field Communication) pendant worn on black thread and voice call reminder system was tested for the effectiveness in improving DTP3 adherence within 2 monthly camps from DTP1 administration. A cluster randomized controlled trial was conducted in which 96 village health camps were randomized to 3 arms: NFC sticker, NFC pendant, and NFC pendant with voice call reminder in local dialect. Randomization was done across 5 blocks in the Udaipur District serviced by Seva Mandir from August 2015 to April 2016. In terms of our three primary outcomes related to DTP3 adherence, point estimates show conflicting results. Two outcomes presented adherence in the control. DTP3 completion within two camps after DTP1 showed higher adherence in the Control (Sticker) (74.2%) arm compared to the Pendant (67.2%) and Pendant and Voice arms (69.3%). Likewise, the estimate for DTP3 completion within 180 days of birth in the Control (Sticker) (69.4%) arm was higher than estimates in the Pendant (57.4%) and Pendant and Voice arms (58.7%). However, one outcome displayed higher adherence in the intervention. DTP3 completion within two months from the time of registration was higher in the Pendant (37.7%) and Pendant and Voice arms (38.7%) compared to the Control (Sticker) arm (27.4%). In all primary outcomes, differences in adherence were statistically insignificant both before and after controlling for confounding factors. In terms of secondary outcomes, our results suggest that providing a necklace generated significant community discussion (H = 8.8796, df = 2, p = .0118), had strong

  18. The dynamics of cyclone clustering in re-analysis and a high-resolution climate model

    Science.gov (United States)

    Priestley, Matthew; Pinto, Joaquim; Dacre, Helen; Shaffrey, Len

    2017-04-01

    Extratropical cyclones have a tendency to occur in groups (clusters) in the exit of the North Atlantic storm track during wintertime, potentially leading to widespread socioeconomic impacts. The Winter of 2013/14 was the stormiest on record for the UK and was characterised by the recurrent clustering of intense extratropical cyclones. This clustering was associated with a strong, straight and persistent North Atlantic 250 hPa jet with Rossby wave-breaking (RWB) on both flanks, pinning the jet in place. Here, we provide for the first time an analysis of all clustered events in 36 years of the ERA-Interim Re-analysis at three latitudes (45˚ N, 55˚ N, 65˚ N) encompassing various regions of Western Europe. The relationship between the occurrence of RWB and cyclone clustering is studied in detail. Clustering at 55˚ N is associated with an extended and anomalously strong jet flanked on both sides by RWB. However, clustering at 65(45)˚ N is associated with RWB to the south (north) of the jet, deflecting the jet northwards (southwards). A positive correlation was found between the intensity of the clustering and RWB occurrence to the north and south of the jet. However, there is considerable spread in these relationships. Finally, analysis has shown that the relationships identified in the re-analysis are also present in a high-resolution coupled global climate model (HiGEM). In particular, clustering is associated with the same dynamical conditions at each of our three latitudes in spite of the identified biases in frequency and intensity of RWB.

  19. Positron Emission Tomography Particle tracking using cluster analysis

    International Nuclear Information System (INIS)

    Gundogdu, O.

    2004-01-01

    Positron Emission Particle Tracking was successfully used in a wide range of industrial applications. This technique primarily uses a single positron emitting tracer particle. However, using multiple particles would provide more comparative information about the physical processes taking place in a system such as mixing or fluidised beds. In this paper, a unique method that enables us to track more than one particle is presented. This method is based on the midpoint of the closest distance between two trajectories or coincidence vectors. The technique presented in this paper employs a clustering method

  20. Positron Emission Tomography Particle tracking using cluster analysis

    Energy Technology Data Exchange (ETDEWEB)

    Gundogdu, O. [University of Birmingham, School of Physics and Astronomy, Birmingham, B15 2TT (United Kingdom)]. E-mail: o.gundogdu@surrey.ac.uk

    2004-12-01

    Positron Emission Particle Tracking was successfully used in a wide range of industrial applications. This technique primarily uses a single positron emitting tracer particle. However, using multiple particles would provide more comparative information about the physical processes taking place in a system such as mixing or fluidised beds. In this paper, a unique method that enables us to track more than one particle is presented. This method is based on the midpoint of the closest distance between two trajectories or coincidence vectors. The technique presented in this paper employs a clustering method.

  1. Communication Base Station Log Analysis Based on Hierarchical Clustering

    Directory of Open Access Journals (Sweden)

    Zhang Shao-Hua

    2017-01-01

    Full Text Available Communication base stations generate massive data every day, these base station logs play an important value in mining of the business circles. This paper use data mining technology and hierarchical clustering algorithm to group the scope of business circle for the base station by recording the data of these base stations.Through analyzing the data of different business circle based on feature extraction and comparing different business circle category characteristics, which can choose a suitable area for operators of commercial marketing.

  2. Application of cluster analysis to geochemical compositional data for identifying ore-related geochemical anomalies

    Science.gov (United States)

    Zhou, Shuguang; Zhou, Kefa; Wang, Jinlin; Yang, Genfang; Wang, Shanshan

    2017-12-01

    Cluster analysis is a well-known technique that is used to analyze various types of data. In this study, cluster analysis is applied to geochemical data that describe 1444 stream sediment samples collected in northwestern Xinjiang with a sample spacing of approximately 2 km. Three algorithms (the hierarchical, k-means, and fuzzy c-means algorithms) and six data transformation methods (the z-score standardization, ZST; the logarithmic transformation, LT; the additive log-ratio transformation, ALT; the centered log-ratio transformation, CLT; the isometric log-ratio transformation, ILT; and no transformation, NT) are compared in terms of their effects on the cluster analysis of the geochemical compositional data. The study shows that, on the one hand, the ZST does not affect the results of column- or variable-based (R-type) cluster analysis, whereas the other methods, including the LT, the ALT, and the CLT, have substantial effects on the results. On the other hand, the results of the row- or observation-based (Q-type) cluster analysis obtained from the geochemical data after applying NT and the ZST are relatively poor. However, we derive some improved results from the geochemical data after applying the CLT, the ILT, the LT, and the ALT. Moreover, the k-means and fuzzy c-means clustering algorithms are more reliable than the hierarchical algorithm when they are used to cluster the geochemical data. We apply cluster analysis to the geochemical data to explore for Au deposits within the study area, and we obtain a good correlation between the results retrieved by combining the CLT or the ILT with the k-means or fuzzy c-means algorithms and the potential zones of Au mineralization. Therefore, we suggest that the combination of the CLT or the ILT with the k-means or fuzzy c-means algorithms is an effective tool to identify potential zones of mineralization from geochemical data.

  3. Cluster Cooperation in Wireless-Powered Sensor Networks: Modeling and Performance Analysis

    Directory of Open Access Journals (Sweden)

    Chao Zhang

    2017-09-01

    Full Text Available A wireless-powered sensor network (WPSN consisting of one hybrid access point (HAP, a near cluster and the corresponding far cluster is investigated in this paper. These sensors are wireless-powered and they transmit information by consuming the harvested energy from signal ejected by the HAP. Sensors are able to harvest energy as well as store the harvested energy. We propose that if sensors in near cluster do not have their own information to transmit, acting as relays, they can help the sensors in a far cluster to forward information to the HAP in an amplify-and-forward (AF manner. We use a finite Markov chain to model the dynamic variation process of the relay battery, and give a general analyzing model for WPSN with cluster cooperation. Though the model, we deduce the closed-form expression for the outage probability as the metric of this network. Finally, simulation results validate the start point of designing this paper and correctness of theoretical analysis and show how parameters have an effect on system performance. Moreover, it is also known that the outage probability of sensors in far cluster can be drastically reduced without sacrificing the performance of sensors in near cluster if the transmit power of HAP is fairly high. Furthermore, in the aspect of outage performance of far cluster, the proposed scheme significantly outperforms the direct transmission scheme without cooperation.

  4. Cluster Cooperation in Wireless-Powered Sensor Networks: Modeling and Performance Analysis.

    Science.gov (United States)

    Zhang, Chao; Zhang, Pengcheng; Zhang, Weizhan

    2017-09-27

    A wireless-powered sensor network (WPSN) consisting of one hybrid access point (HAP), a near cluster and the corresponding far cluster is investigated in this paper. These sensors are wireless-powered and they transmit information by consuming the harvested energy from signal ejected by the HAP. Sensors are able to harvest energy as well as store the harvested energy. We propose that if sensors in near cluster do not have their own information to transmit, acting as relays, they can help the sensors in a far cluster to forward information to the HAP in an amplify-and-forward (AF) manner. We use a finite Markov chain to model the dynamic variation process of the relay battery, and give a general analyzing model for WPSN with cluster cooperation. Though the model, we deduce the closed-form expression for the outage probability as the metric of this network. Finally, simulation results validate the start point of designing this paper and correctness of theoretical analysis and show how parameters have an effect on system performance. Moreover, it is also known that the outage probability of sensors in far cluster can be drastically reduced without sacrificing the performance of sensors in near cluster if the transmit power of HAP is fairly high. Furthermore, in the aspect of outage performance of far cluster, the proposed scheme significantly outperforms the direct transmission scheme without cooperation.

  5. Point Cluster Analysis Using a 3D Voronoi Diagram with Applications in Point Cloud Segmentation

    Directory of Open Access Journals (Sweden)

    Shen Ying

    2015-08-01

    Full Text Available Three-dimensional (3D point analysis and visualization is one of the most effective methods of point cluster detection and segmentation in geospatial datasets. However, serious scattering and clotting characteristics interfere with the visual detection of 3D point clusters. To overcome this problem, this study proposes the use of 3D Voronoi diagrams to analyze and visualize 3D points instead of the original data item. The proposed algorithm computes the cluster of 3D points by applying a set of 3D Voronoi cells to describe and quantify 3D points. The decompositions of point cloud of 3D models are guided by the 3D Voronoi cell parameters. The parameter values are mapped from the Voronoi cells to 3D points to show the spatial pattern and relationships; thus, a 3D point cluster pattern can be highlighted and easily recognized. To capture different cluster patterns, continuous progressive clusters and segmentations are tested. The 3D spatial relationship is shown to facilitate cluster detection. Furthermore, the generated segmentations of real 3D data cases are exploited to demonstrate the feasibility of our approach in detecting different spatial clusters for continuous point cloud segmentation.

  6. Detection of secondary structure elements in proteins by hydrophobic cluster analysis.

    Science.gov (United States)

    Woodcock, S; Mornon, J P; Henrissat, B

    1992-10-01

    Hydrophobic cluster analysis (HCA) is a protein sequence comparison method based on alpha-helical representations of the sequences where the size, shape and orientation of the clusters of hydrophobic residues are primarily compared. The effectiveness of HCA has been suggested to originate from its potential ability to focus on the residues forming the hydrophobic core of globular proteins. We have addressed the robustness of the bidimensional representation used for HCA in its ability to detect the regular secondary structure elements of proteins. Various parameters have been studied such as those governing cluster size and limits, the hydrophobic residues constituting the clusters as well as the potential shift of the cluster positions with respect to the position of the regular secondary structure elements. The following results have been found to support the alpha-helical bidimensional representation used in HCA: (i) there is a positive correlation (clearly above background noise) between the hydrophobic clusters and the regular secondary structure elements in proteins; (ii) the hydrophobic clusters are centred on the regular secondary structure elements; (iii) the pitch of the helical representation which gives the best correspondence is that of an alpha-helix. The correspondence between hydrophobic clusters and regular secondary structure elements suggests a way to implement variable gap penalties during the automatic alignment of protein sequences.

  7. Cluster and principal component analysis based on SSR markers of Amomum tsao-ko in Jinping County of Yunnan Province

    Science.gov (United States)

    Ma, Mengli; Lei, En; Meng, Hengling; Wang, Tiantao; Xie, Linyan; Shen, Dong; Xianwang, Zhou; Lu, Bingyue

    2017-08-01

    Amomum tsao-ko is a commercial plant that used for various purposes in medicinal and food industries. For the present investigation, 44 germplasm samples were collected from Jinping County of Yunnan Province. Clusters analysis and 2-dimensional principal component analysis (PCA) was used to represent the genetic relations among Amomum tsao-ko by using simple sequence repeat (SSR) markers. Clustering analysis clearly distinguished the samples groups. Two major clusters were formed; first (Cluster I) consisted of 34 individuals, the second (Cluster II) consisted of 10 individuals, Cluster I as the main group contained multiple sub-clusters. PCA also showed 2 groups: PCA Group 1 included 29 individuals, PCA Group 2 included 12 individuals, consistent with the results of cluster analysis. The purpose of the present investigation was to provide information on genetic relationship of Amomum tsao-ko germplasm resources in main producing areas, also provide a theoretical basis for the protection and utilization of Amomum tsao-ko resources.

  8. Using the Cluster Analysis and the Principal Component Analysis in Evaluating the Quality of a Destination

    Directory of Open Access Journals (Sweden)

    Ida Vajčnerová

    2016-01-01

    Full Text Available The objective of the paper is to explore possibilities of evaluating the quality of a tourist destination by means of the principal components analysis (PCA and the cluster analysis. In the paper both types of analysis are compared on the basis of the results they provide. The aim is to identify advantage and limits of both methods and provide methodological suggestion for their further use in the tourism research. The analyses is based on the primary data from the customers’ satisfaction survey with the key quality factors of a destination. As output of the two statistical methods is creation of groups or cluster of quality factors that are similar in terms of respondents’ evaluations, in order to facilitate the evaluation of the quality of tourist destinations. Results shows the possibility to use both tested methods. The paper is elaborated in the frame of wider research project aimed to develop a methodology for the quality evaluation of tourist destinations, especially in the context of customer satisfaction and loyalty.

  9. The CERN analysis facility-a PROOF cluster for day-one physics analysis

    International Nuclear Information System (INIS)

    Grosse-Oetringhaus, J F

    2008-01-01

    ALICE (A Large Ion Collider Experiment) at the LHC plans to use a PROOF cluster at CERN (CAF - CERN Analysis Facility) for analysis. The system is especially aimed at the prototyping phase of analyses that need a high number of development iterations and thus require a short response time. Typical examples are the tuning of cuts during the development of an analysis as well as calibration and alignment. Furthermore, the use of an interactive system with very fast response will allow ALICE to extract physics observables out of first data quickly. An additional use case is fast event simulation and reconstruction. A test setup consisting of 40 machines is used for evaluation since May 2006. The PROOF system enables the parallel processing and xrootd the access to files distributed on the test cluster. An automatic staging system for files either catalogued in the ALICE file catalog or stored in the CASTOR mass storage system has been developed. The current setup and ongoing development towards disk quotas and CPU fairshare are described. Furthermore, the integration of PROOF into ALICE's software framework (AliRoot) is discussed

  10. The Perseus computational platform for comprehensive analysis of (prote)omics data.

    Science.gov (United States)

    Tyanova, Stefka; Temu, Tikira; Sinitcyn, Pavel; Carlson, Arthur; Hein, Marco Y; Geiger, Tamar; Mann, Matthias; Cox, Jürgen

    2016-09-01

    A main bottleneck in proteomics is the downstream biological analysis of highly multivariate quantitative protein abundance data generated using mass-spectrometry-based analysis. We developed the Perseus software platform (http://www.perseus-framework.org) to support biological and biomedical researchers in interpreting protein quantification, interaction and post-translational modification data. Perseus contains a comprehensive portfolio of statistical tools for high-dimensional omics data analysis covering normalization, pattern recognition, time-series analysis, cross-omics comparisons and multiple-hypothesis testing. A machine learning module supports the classification and validation of patient groups for diagnosis and prognosis, and it also detects predictive protein signatures. Central to Perseus is a user-friendly, interactive workflow environment that provides complete documentation of computational methods used in a publication. All activities in Perseus are realized as plugins, and users can extend the software by programming their own, which can be shared through a plugin store. We anticipate that Perseus's arsenal of algorithms and its intuitive usability will empower interdisciplinary analysis of complex large data sets.

  11. Performance Analysis of Memory Transfers and GEMM Subroutines on NVIDIA Tesla GPU Cluster

    Energy Technology Data Exchange (ETDEWEB)

    Allada, Veerendra, Benjegerdes, Troy; Bode, Brett

    2009-08-31

    Commodity clusters augmented with application accelerators are evolving as competitive high performance computing systems. The Graphical Processing Unit (GPU) with a very high arithmetic density and performance per price ratio is a good platform for the scientific application acceleration. In addition to the interconnect bottlenecks among the cluster compute nodes, the cost of memory copies between the host and the GPU device have to be carefully amortized to improve the overall efficiency of the application. Scientific applications also rely on efficient implementation of the BAsic Linear Algebra Subroutines (BLAS), among which the General Matrix Multiply (GEMM) is considered as the workhorse subroutine. In this paper, they study the performance of the memory copies and GEMM subroutines that are critical to port the computational chemistry algorithms to the GPU clusters. To that end, a benchmark based on the NetPIPE framework is developed to evaluate the latency and bandwidth of the memory copies between the host and the GPU device. The performance of the single and double precision GEMM subroutines from the NVIDIA CUBLAS 2.0 library are studied. The results have been compared with that of the BLAS routines from the Intel Math Kernel Library (MKL) to understand the computational trade-offs. The test bed is a Intel Xeon cluster equipped with NVIDIA Tesla GPUs.

  12. Performance Analysis of Memory Transfers and GEMM Subroutines on NVIDIA Tesla GPU Cluster

    International Nuclear Information System (INIS)

    Allada, Veerendra; Benjegerdes, Troy; Bode, Brett

    2009-01-01

    Commodity clusters augmented with application accelerators are evolving as competitive high performance computing systems. The Graphical Processing Unit (GPU) with a very high arithmetic density and performance per price ratio is a good platform for the scientific application acceleration. In addition to the interconnect bottlenecks among the cluster compute nodes, the cost of memory copies between the host and the GPU device have to be carefully amortized to improve the overall efficiency of the application. Scientific applications also rely on efficient implementation of the BAsic Linear Algebra Subroutines (BLAS), among which the General Matrix Multiply (GEMM) is considered as the workhorse subroutine. In this paper, they study the performance of the memory copies and GEMM subroutines that are critical to port the computational chemistry algorithms to the GPU clusters. To that end, a benchmark based on the NetPIPE framework is developed to evaluate the latency and bandwidth of the memory copies between the host and the GPU device. The performance of the single and double precision GEMM subroutines from the NVIDIA CUBLAS 2.0 library are studied. The results have been compared with that of the BLAS routines from the Intel Math Kernel Library (MKL) to understand the computational trade-offs. The test bed is a Intel Xeon cluster equipped with NVIDIA Tesla GPUs.

  13. Clustering-based analysis for residential district heating data

    DEFF Research Database (Denmark)

    Gianniou, Panagiota; Liu, Xiufeng; Heller, Alfred

    2018-01-01

    The wide use of smart meters enables collection of a large amount of fine-granular time series, which can be used to improve the understanding of consumption behavior and used for consumption optimization. This paper presents a clustering-based knowledge discovery in databases method to analyze r....... These findings will be valuable for district heating utilities and energy planners to optimize their operations, design demand-side management strategies, and develop targeting energy-efficiency programs or policies.......The wide use of smart meters enables collection of a large amount of fine-granular time series, which can be used to improve the understanding of consumption behavior and used for consumption optimization. This paper presents a clustering-based knowledge discovery in databases method to analyze...... residential heating consumption data and evaluate information included in national building databases. The proposed method uses the K-means algorithm to segment consumption groups based on consumption intensity and representative patterns and ranks the groups according to daily consumption. This paper also...

  14. Cluster cosmological analysis with X ray instrumental observables: introduction and testing of AsPIX method

    International Nuclear Information System (INIS)

    Valotti, Andrea

    2016-01-01

    Cosmology is one of the fundamental pillars of astrophysics, as such it contains many unsolved puzzles. To investigate some of those puzzles, we analyze X-ray surveys of galaxy clusters. These surveys are possible thanks to the bremsstrahlung emission of the intra-cluster medium. The simultaneous fit of cluster counts as a function of mass and distance provides an independent measure of cosmological parameters such as Ω m , σ s , and the dark energy equation of state w0. A novel approach to cosmological analysis using galaxy cluster data, called top-down, was developed in N. Clerc et al. (2012). This top-down approach is based purely on instrumental observables that are considered in a two-dimensional X-ray color-magnitude diagram. The method self-consistently includes selection effects and scaling relationships. It also provides a means of bypassing the computation of individual cluster masses. My work presents an extension of the top-down method by introducing the apparent size of the cluster, creating a three-dimensional X-ray cluster diagram. The size of a cluster is sensitive to both the cluster mass and its angular diameter, so it must also be included in the assessment of selection effects. The performance of this new method is investigated using a Fisher analysis. In parallel, I have studied the effects of the intrinsic scatter in the cluster size scaling relation on the sample selection as well as on the obtained cosmological parameters. To validate the method, I estimate uncertainties of cosmological parameters with MCMC method Amoeba minimization routine and using two simulated XMM surveys that have an increasing level of complexity. The first simulated survey is a set of toy catalogues of 100 and 10000 deg 2 , whereas the second is a 1000 deg 2 catalogue that was generated using an Aardvark semi-analytical N-body simulation. This comparison corroborates the conclusions of the Fisher analysis. In conclusion, I find that a cluster diagram that accounts

  15. Behavioral Health Risk Profiles of Undergraduate University Students in England, Wales, and Northern Ireland: A Cluster Analysis.

    Science.gov (United States)

    El Ansari, Walid; Ssewanyana, Derrick; Stock, Christiane

    2018-01-01

    Limited research has explored clustering of lifestyle behavioral risk factors (BRFs) among university students. This study aimed to explore clustering of BRFs, composition of clusters, and the association of the clusters with self-rated health and perceived academic performance. We assessed (BRFs), namely tobacco smoking, physical inactivity, alcohol consumption, illicit drug use, unhealthy nutrition, and inadequate sleep, using a self-administered general Student Health Survey among 3,706 undergraduates at seven UK universities. A two-step cluster analysis generated: Cluster 1 (the high physically active and health conscious) with very high health awareness/consciousness, good nutrition, and physical activity (PA), and relatively low alcohol, tobacco, and other drug (ATOD) use. Cluster 2 (the abstinent) had very low ATOD use, high health awareness, good nutrition, and medium high PA. Cluster 3 (the moderately health conscious) included the highest regard for healthy eating, second highest fruit/vegetable consumption, and moderately high ATOD use. Cluster 4 (the risk taking) showed the highest ATOD use, were the least health conscious, least fruit consuming, and attached the least importance on eating healthy. Compared to the healthy cluster (Cluster 1), students in other clusters had lower self-rated health, and particularly, students in the risk taking cluster (Cluster 4) reported lower academic performance. These associations were stronger for men than for women. Of the four clusters, Cluster 4 had the youngest students. Our results suggested that prevention among university students should address multiple BRFs simultaneously, with particular focus on the younger students.

  16. Application of Cluster Analysis in Assessment of Dietary Habits of Secondary School Students

    Directory of Open Access Journals (Sweden)

    Zalewska Magdalena

    2014-12-01

    Full Text Available Maintenance of proper health and prevention of diseases of civilization are now significant public health problems. Nutrition is an important factor in the development of youth, as well as the current and future state of health. The aim of the study was to show the benefits of the application of cluster analysis to assess the dietary habits of high school students. The survey was carried out on 1,631 eighteen-year-old students in seven randomly selected secondary schools in Bialystok using a self-prepared anonymous questionnaire. An evaluation of the time of day meals were eaten and the number of meals consumed was made for the surveyed students. The cluster analysis allowed distinguishing characteristic structures of dietary habits in the observed population. Four clusters were identified, which were characterized by relative internal homogeneity and substantial variation in terms of the number of meals during the day and the time of their consumption. The most important characteristics of cluster 1 were cumulated food ration in 2 or 3 meals and long intervals between meals. Cluster 2 was characterized by eating the recommended number of 4 or 5 meals a day. In the 3rd cluster, students ate 3 meals a day with large intervals between them, and in the 4th they had four meals a day while maintaining proper intervals between them. In all clusters dietary mistakes occurred, but most of them were related to clusters 1 and 3. Cluster analysis allowed for the identification of major flaws in nutrition, which may include irregular eating and skipping meals, and indicated possible connections between eating patterns and disturbances of body weight in the examined population.

  17. Genome-scale cluster analysis of replicated microarrays using shrinkage correlation coefficient.

    Science.gov (United States)

    Yao, Jianchao; Chang, Chunqi; Salmi, Mari L; Hung, Yeung Sam; Loraine, Ann; Roux, Stanley J

    2008-06-18

    Currently, clustering with some form of correlation coefficient as the gene similarity metric has become a popular method for profiling genomic data. The Pearson correlation coefficient and the standard deviation (SD)-weighted correlation coefficient are the two most widely-used correlations as the similarity metrics in clustering microarray data. However, these two correlations are not optimal for analyzing replicated microarray data generated by most laboratories. An effective correlation coefficient is needed to provide statistically sufficient analysis of replicated microarray data. In this study, we describe a novel correlation coefficient, shrinkage correlation coefficient (SCC), that fully exploits the similarity between the replicated microarray experimental samples. The methodology considers both the number of replicates and the variance within each experimental group in clustering expression data, and provides a robust statistical estimation of the error of replicated microarray data. The value of SCC is revealed by its comparison with two other correlation coefficients that are currently the most widely-used (Pearson correlation coefficient and SD-weighted correlation coefficient) using statistical measures on both synthetic expression data as well as real gene expression data from Saccharomyces cerevisiae. Two leading clustering methods, hierarchical and k-means clustering were applied for the comparison. The comparison indicated that using SCC achieves better clustering performance. Applying SCC-based hierarchical clustering to the replicated microarray data obtained from germinating spores of the fern Ceratopteris richardii, we discovered two clusters of genes with shared expression patterns during spore germination. Functional analysis suggested that some of the genetic mechanisms that control germination in such diverse plant lineages as mosses and angiosperms are also conserved among ferns. This study shows that SCC is an alternative to the Pearson

  18. MMPI-2: Cluster Analysis of Personality Profiles in Perinatal Depression—Preliminary Evidence

    Directory of Open Access Journals (Sweden)

    Valentina Meuti

    2014-01-01

    Full Text Available Background. To assess personality characteristics of women who develop perinatal depression. Methods. The study started with a screening of a sample of 453 women in their third trimester of pregnancy, to which was administered a survey data form, the Edinburgh Postnatal Depression Scale (EPDS and the Minnesota Multiphasic Personality Inventory 2 (MMPI-2. A clinical group of subjects with perinatal depression (PND, 55 subjects was selected; clinical and validity scales of MMPI-2 were used as predictors in hierarchical cluster analysis carried out. Results. The analysis identified three clusters of personality profile: two “clinical” clusters (1 and 3 and an “apparently common” one (cluster 2. The first cluster (39.5% collects structures of personality with prevalent obsessive or dependent functioning tending to develop a “psychasthenic” depression; the third cluster (13.95% includes women with prevalent borderline functioning tending to develop “dysphoric” depression; the second cluster (46.5% shows a normal profile with a “defensive” attitude, probably due to the presence of defense mechanisms or to the fear of stigma. Conclusion. Characteristics of personality have a key role in clinical manifestations of perinatal depression; it is important to detect them to identify mothers at risk and to plan targeted therapeutic interventions.

  19. MMPI-2: Cluster Analysis of Personality Profiles in Perinatal Depression—Preliminary Evidence

    Science.gov (United States)

    Grillo, Alessandra; Lauriola, Marco; Giacchetti, Nicoletta

    2014-01-01

    Background. To assess personality characteristics of women who develop perinatal depression. Methods. The study started with a screening of a sample of 453 women in their third trimester of pregnancy, to which was administered a survey data form, the Edinburgh Postnatal Depression Scale (EPDS) and the Minnesota Multiphasic Personality Inventory 2 (MMPI-2). A clinical group of subjects with perinatal depression (PND, 55 subjects) was selected; clinical and validity scales of MMPI-2 were used as predictors in hierarchical cluster analysis carried out. Results. The analysis identified three clusters of personality profile: two “clinical” clusters (1 and 3) and an “apparently common” one (cluster 2). The first cluster (39.5%) collects structures of personality with prevalent obsessive or dependent functioning tending to develop a “psychasthenic” depression; the third cluster (13.95%) includes women with prevalent borderline functioning tending to develop “dysphoric” depression; the second cluster (46.5%) shows a normal profile with a “defensive” attitude, probably due to the presence of defense mechanisms or to the fear of stigma. Conclusion. Characteristics of personality have a key role in clinical manifestations of perinatal depression; it is important to detect them to identify mothers at risk and to plan targeted therapeutic interventions. PMID:25574499

  20. Somatosensory nociceptive characteristics differentiate subgroups in people with chronic low back pain: a cluster analysis.

    Science.gov (United States)

    Rabey, Martin; Slater, Helen; OʼSullivan, Peter; Beales, Darren; Smith, Anne

    2015-10-01

    The objectives of this study were to explore the existence of subgroups in a cohort with chronic low back pain (n = 294) based on the results of multimodal sensory testing and profile subgroups on demographic, psychological, lifestyle, and general health factors. Bedside (2-point discrimination, brush, vibration and pinprick perception, temporal summation on repeated monofilament stimulation) and laboratory (mechanical detection threshold, pressure, heat and cold pain thresholds, conditioned pain modulation) sensory testing were examined at wrist and lumbar sites. Data were entered into principal component analysis, and 5 component scores were entered into latent class analysis. Three clusters, with different sensory characteristics, were derived. Cluster 1 (31.9%) was characterised by average to high temperature and pressure pain sensitivity. Cluster 2 (52.0%) was characterised by average to high pressure pain sensitivity. Cluster 3 (16.0%) was characterised by low temperature and pressure pain sensitivity. Temporal summation occurred significantly more frequently in cluster 1. Subgroups were profiled on pain intensity, disability, depression, anxiety, stress, life events, fear avoidance, catastrophizing, perception of the low back region, comorbidities, body mass index, multiple pain sites, sleep, and activity levels. Clusters 1 and 2 had a significantly greater proportion of female participants and higher depression and sleep disturbance scores than cluster 3. The proportion of participants undertaking Low back pain, therefore, does not appear to be homogeneous. Pain mechanisms relating to presentations of each subgroup were postulated. Future research may investigate prognoses and interventions tailored towards these subgroups.

  1. Hadoop and friends - first experience at CERN with a new platform for high throughput analysis steps

    Science.gov (United States)

    Duellmann, D.; Surdy, K.; Menichetti, L.; Toebbicke, R.

    2017-10-01

    The statistical analysis of infrastructure metrics comes with several specific challenges, including the fairly large volume of unstructured metrics from a large set of independent data sources. Hadoop and Spark provide an ideal environment in particular for the first steps of skimming rapidly through hundreds of TB of low relevance data to find and extract the much smaller data volume that is relevant for statistical analysis and modelling. This presentation will describe the new Hadoop service at CERN and the use of several of its components for high throughput data aggregation and ad-hoc pattern searches. We will describe the hardware setup used, the service structure with a small set of decoupled clusters and the first experience with co-hosting different applications and performing software upgrades. We will further detail the common infrastructure used for data extraction and preparation from continuous monitoring and database input sources.

  2. Based on the rainfall system platform raindrops research and analysis of pressure loss

    Science.gov (United States)

    Cao, Gang; Sun, Jian

    2018-01-01

    With the rapid development of China’s military career, land, sea and air force all services and equipment of modern equipment need to be in the rain test, and verify its might suffer during transportation, storage or use a different environment temperature lower water or use underwater, the water is derived from the heavy rain, the wind and rain, sprinkler system, splash water, water wheel, a violent shock waves or use underwater, etcTest the product performance and quality, under the condition of rainfall system platform in the process of development, how to control the raindrops pressure loss becomes the key to whether the system can simulate the real rainfall [1], this paper is according to the rainfall intensity, nozzle flow resistance, meet water flow of rain pressure loss calculation and analysis, and system arrangement of the optimal solution of rainfall is obtained [2].

  3. CoreFlow: A computational platform for integration, analysis and modeling of complex biological data

    DEFF Research Database (Denmark)

    Pasculescu, Adrian; Schoof, Erwin; Creixell, Pau

    2014-01-01

    between data generation, analysis and manuscript writing. CoreFlow is being released to the scientific community as an open-sourced software package complete with proteomics-specific examples, which include corrections for incomplete isotopic labeling of peptides (SILAC) or arginine-to-proline conversion......A major challenge in mass spectrometry and other large-scale applications is how to handle, integrate, and model the data that is produced. Given the speed at which technology advances and the need to keep pace with biological experiments, we designed a computational platform, CoreFlow, which...... provides programmers with a framework to manage data in real-time. It allows users to upload data into a relational database (MySQL), and to create custom scripts in high-level languages such as R, Python, or Perl for processing, correcting and modeling this data. CoreFlow organizes these scripts...

  4. Efficiency of Airborne Sample Analysis Platform (ASAP Bioaerosol Sampler for Pathogen Detection

    Directory of Open Access Journals (Sweden)

    Anurag eSharma

    2015-05-01

    Full Text Available The threat of bioterrorism and pandemics has highlighted the urgency for rapid and reliable bioaerosol detection in different environments. Safeguarding against such threats requires continuous sampling of the ambient air for pathogen detection. In this study we investigated the efficacy of the Airborne Sample Analysis Platform (ASAP 2800 bioaerosol sampler to collect representative samples of air and identify specific viruses suspended as bioaerosols. To test this concept, we aerosolized an innocuous replication-defective bovine adenovirus serotype 3 (BAdV3 in a controlled laboratory environment. The ASAP efficiently trapped the surrogate virus at 5×10E3 plaque-forming units (p.f.u. [2×10E5 genome copy equivalent] concentrations or more resulting in the successful detection of the virus using quantitative PCR. These results support the further development of ASAP for bioaerosol pathogen detection.

  5. Genomic analysis of thermophilic Bacillus coagulans strains: efficient producers for platform bio-chemicals.

    Science.gov (United States)

    Su, Fei; Xu, Ping

    2014-01-29

    Microbial strains with high substrate efficiency and excellent environmental tolerance are urgently needed for the production of platform bio-chemicals. Bacillus coagulans has these merits; however, little genetic information is available about this species. Here, we determined the genome sequences of five B. coagulans strains, and used a comparative genomic approach to reconstruct the central carbon metabolism of this species to explain their fermentation features. A novel xylose isomerase in the xylose utilization pathway was identified in these strains. Based on a genome-wide positive selection scan, the selection pressure on amino acid metabolism may have played a significant role in the thermal adaptation. We also researched the immune systems of B. coagulans strains, which provide them with acquired resistance to phages and mobile genetic elements. Our genomic analysis provides comprehensive insights into the genetic characteristics of B. coagulans and paves the way for improving and extending the uses of this species.

  6. Cluster: A New Application for Spatial Analysis of Pixelated Data for Epiphytotics.

    Science.gov (United States)

    Nelson, Scot C; Corcoja, Iulian; Pethybridge, Sarah J

    2017-12-01

    Spatial analysis of epiphytotics is essential to develop and test hypotheses about pathogen ecology, disease dynamics, and to optimize plant disease management strategies. Data collection for spatial analysis requires substantial investment in time to depict patterns in various frames and hierarchies. We developed a new approach for spatial analysis of pixelated data in digital imagery and incorporated the method in a stand-alone desktop application called Cluster. The user isolates target entities (clusters) by designating up to 24 pixel colors as nontargets and moves a threshold slider to visualize the targets. The app calculates the percent area occupied by targeted pixels, identifies the centroids of targeted clusters, and computes the relative compass angle of orientation for each cluster. Users can deselect anomalous clusters manually and/or automatically by specifying a size threshold value to exclude smaller targets from the analysis. Up to 1,000 stochastic simulations randomly place the centroids of each cluster in ranked order of size (largest to smallest) within each matrix while preserving their calculated angles of orientation for the long axes. A two-tailed probability t test compares the mean inter-cluster distances for the observed versus the values derived from randomly simulated maps. This is the basis for statistical testing of the null hypothesis that the clusters are randomly distributed within the frame of interest. These frames can assume any shape, from natural (e.g., leaf) to arbitrary (e.g., a rectangular or polygonal field). Cluster summarizes normalized attributes of clusters, including pixel number, axis length, axis width, compass orientation, and the length/width ratio, available to the user as a downloadable spreadsheet. Each simulated map may be saved as an image and inspected. Provided examples demonstrate the utility of Cluster to analyze patterns at various spatial scales in plant pathology and ecology and highlight the

  7. Semiparametric Bayesian analysis of accelerated failure time models with cluster structures.

    Science.gov (United States)

    Li, Zhaonan; Xu, Xinyi; Shen, Junshan

    2017-11-10

    In this paper, we develop a Bayesian semiparametric accelerated failure time model for survival data with cluster structures. Our model allows distributional heterogeneity across clusters and accommodates their relationships through a density ratio approach. Moreover, a nonparametric mixture of Dirichlet processes prior is placed on the baseline distribution to yield full distributional flexibility. We illustrate through simulations that our model can greatly improve estimation accuracy by effectively pooling information from multiple clusters, while taking into account the heterogeneity in their random error distributions. We also demonstrate the implementation of our method using analysis of Mayo Clinic Trial in Primary Biliary Cirrhosis. Copyright © 2017 John Wiley & Sons, Ltd.

  8. A formal concept analysis approach to consensus clustering of multi-experiment expression data

    Science.gov (United States)

    2014-01-01

    Background Presently, with the increasing number and complexity of available gene expression datasets, the combination of data from multiple microarray studies addressing a similar biological question is gaining importance. The analysis and integration of multiple datasets are expected to yield more reliable and robust results since they are based on a larger number of samples and the effects of the individual study-specific biases are diminished. This is supported by recent studies suggesting that important biological signals are often preserved or enhanced by multiple experiments. An approach to combining data from different experiments is the aggregation of their clusterings into a consensus or representative clustering solution which increases the confidence in the common features of all the datasets and reveals the important differences among them. Results We propose a novel generic consensus clustering technique that applies Formal Concept Analysis (FCA) approach for the consolidation and analysis of clustering solutions derived from several microarray datasets. These datasets are initially divided into groups of related experiments with respect to a predefined criterion. Subsequently, a consensus clustering algorithm is applied to each group resulting in a clustering solution per group. These solutions are pooled together and further analysed by employing FCA which allows extracting valuable insights from the data and generating a gene partition over all the experiments. In order to validate the FCA-enhanced approach two consensus clustering algorithms are adapted to incorporate the FCA analysis. Their performance is evaluated on gene expression data from multi-experiment study examining the global cell-cycle control of fission yeast. The FCA results derived from both methods demonstrate that, although both algorithms optimize different clustering characteristics, FCA is able to overcome and diminish these differences and preserve some relevant biological

  9. A PLC platform-independent structural analysis on FBD programs for digital reactor protection systems

    International Nuclear Information System (INIS)

    Jung, Sejin; Yoo, Junbeom; Lee, Young-Jun

    2017-01-01

    Highlights: • FBD has been widely used to implement safety-critical software for PLC-based systems. • The safety-critical software should be developed strictly with safety programming guidelines. • There are no argued rules that have specific links to higher guidelines NUREG/CR-6463 PLC platform-independently. • This paper proposes a set of rules on the structure of FBD programs with providing specific links to higher guidelines. • This paper also provides CASE tool ‘FBD Checker’ for analyzing the structure of FBD. - Abstract: FBD (function block diagram) has been widely used to implement safety-critical software for PLC (programmable logic controller)-based digital nuclear reactor protection systems. The software should be developed strictly in accordance with safety programming guidelines such as NUREG/CR-6463. Software engineering tools of PLC vendors enable us to present structural analyses using FBD programs, but specific rules pertaining to the guidelines are enclosed within the commercial tools, and specific links to the guidelines are not clearly communicated. This paper proposes a set of rules on the structure of FBD programs in accordance with guidelines, and we develop an automatic analysis tool for FBD programs written in the PLCopen TC6 format. With the proposed tool, any FBD program that is transformed into an open format can be analyzed the PLC platform-independently. We consider a case study on FBD programs obtained from a preliminary version of a Korean nuclear power plant, and we demonstrate the effectiveness and potential of the proposed rules and analysis tool.

  10. StreptoBase: An Oral Streptococcus mitis Group Genomic Resource and Analysis Platform.

    Directory of Open Access Journals (Sweden)

    Wenning Zheng

    Full Text Available The oral streptococci are spherical Gram-positive bacteria categorized under the phylum Firmicutes which are among the most common causative agents of bacterial infective endocarditis (IE and are also important agents in septicaemia in neutropenic patients. The Streptococcus mitis group is comprised of 13 species including some of the most common human oral colonizers such as S. mitis, S. oralis, S. sanguinis and S. gordonii as well as species such as S. tigurinus, S. oligofermentans and S. australis that have only recently been classified and are poorly understood at present. We present StreptoBase, which provides a specialized free resource focusing on the genomic analyses of oral species from the mitis group. It currently hosts 104 S. mitis group genomes including 27 novel mitis group strains that we sequenced using the high throughput Illumina HiSeq technology platform, and provides a comprehensive set of genome sequences for analyses, particularly comparative analyses and visualization of both cross-species and cross-strain characteristics of S. mitis group bacteria. StreptoBase incorporates sophisticated in-house designed bioinformatics web tools such as Pairwise Genome Comparison (PGC tool and Pathogenomic Profiling Tool (PathoProT, which facilitate comparative pathogenomics analysis of Streptococcus strains. Examples are provided to demonstrate how StreptoBase can be employed to compare genome structure of different S. mitis group bacteria and putative virulence genes profile across multiple streptococcal strains. In conclusion, StreptoBase offers access to a range of streptococci genomic resources as well as analysis tools and will be an invaluable platform to accelerate research in streptococci. Database URL: http://streptococcus.um.edu.my.

  11. Analysis of mice tumor models using dynamic MRI data and a dedicated software platform

    Energy Technology Data Exchange (ETDEWEB)

    Alfke, H.; Maurer, E.; Klose, K.J. [Philipps Univ. Marburg (Germany). Dept. of Radiology; Kohle, S.; Rascher-Friesenhausen, R.; Behrens, S.; Peitgen, H.O. [MeVis - Center for Medical Diagnostic Systems and Visualization, Bremen (Germany); Celik, I. [Philipps Univ. Marburg (Germany). Inst. for Theoretical Surgery; Heverhagen, J.T. [Philipps Univ. Marburg (Germany). Dept. of Radiology; Ohio State Univ., Columbus (United States). Dept. of Radiology

    2004-09-01

    Purpose: To implement a software platform (DynaVision) dedicated to analyze data from functional imaging of tumors with different mathematical approaches, and to test the software platform in pancreatic carcinoma xenografts in mice with severe combined immunodeficiency disease (SCID). Materials and Methods: A software program was developed for extraction and visualization of tissue perfusion parameters from dynamic contrast-enhanced images. This includes regional parameter calculation from enhancement curves, parametric images (e.g., blood flow), animation, 3D visualization, two-compartment modeling a mode for comparing different datasets (e.g., therapy monitoring), and motion correction. We analyzed xenograft tumors from two pancreatic carcinoma cell lines (B x PC3 and ASPC1) implanted in 14 SCID mice after injection of Gd-DTPA into the tail vein. These data were correlated with histopathological findings. Results: Image analysis was completed in approximately 15 minutes per data set. The possibility of drawing and editing ROIs within the whole data set makes it easy to obtain quantitative data from the intensity-time curves. In one animal, motion artifacts reduced the image quality to a greater extent but data analysis was still possible after motion correction. Dynamic MRI of mice tumor models revealed a highly heterogeneous distribution of the contrast-enhancement curves and derived parameters, which correlated with differences in histopathology. ASPc1 tumors showed a more hypervascular type of curves with faster and higher signal enhancement rate (wash-in) and a faster signal decrease (wash-out). BXPC3 tumors showed a more hypovascular type with slower wash-in and wash-out. This correlated with the biological properties of the tumors. (orig.)

  12. Analysis of mice tumor models using dynamic MRI data and a dedicated software platform

    International Nuclear Information System (INIS)

    Alfke, H.; Maurer, E.; Klose, K.J.; Celik, I.; Heverhagen, J.T.; Ohio State Univ., Columbus

    2004-01-01

    Purpose: To implement a software platform (DynaVision) dedicated to analyze data from functional imaging of tumors with different mathematical approaches, and to test the software platform in pancreatic carcinoma xenografts in mice with severe combined immunodeficiency disease (SCID). Materials and Methods: A software program was developed for extraction and visualization of tissue perfusion parameters from dynamic contrast-enhanced images. This includes regional parameter calculation from enhancement curves, parametric images (e.g., blood flow), animation, 3D visualization, two-compartment modeling a mode for comparing different datasets (e.g., therapy monitoring), and motion correction. We analyzed xenograft tumors from two pancreatic carcinoma cell lines (B x PC3 and ASPC1) implanted in 14 SCID mice after injection of Gd-DTPA into the tail vein. These data were correlated with histopathological findings. Results: Image analysis was completed in approximately 15 minutes per data set. The possibility of drawing and editing ROIs within the whole data set makes it easy to obtain quantitative data from the intensity-time curves. In one animal, motion artifacts reduced the image quality to a greater extent but data analysis was still possible after motion correction. Dynamic MRI of mice tumor models revealed a highly heterogeneous distribution of the contrast-enhancement curves and derived parameters, which correlated with differences in histopathology. ASPc1 tumors showed a more hypervascular type of curves with faster and higher signal enhancement rate (wash-in) and a faster signal decrease (wash-out). BXPC3 tumors showed a more hypovascular type with slower wash-in and wash-out. This correlated with the biological properties of the tumors. (orig.)

  13. StreptoBase: An Oral Streptococcus mitis Group Genomic Resource and Analysis Platform.

    Science.gov (United States)

    Zheng, Wenning; Tan, Tze King; Paterson, Ian C; Mutha, Naresh V R; Siow, Cheuk Chuen; Tan, Shi Yang; Old, Lesley A; Jakubovics, Nicholas S; Choo, Siew Woh

    2016-01-01

    The oral streptococci are spherical Gram-positive bacteria categorized under the phylum Firmicutes which are among the most common causative agents of bacterial infective endocarditis (IE) and are also important agents in septicaemia in neutropenic patients. The Streptococcus mitis group is comprised of 13 species including some of the most common human oral colonizers such as S. mitis, S. oralis, S. sanguinis and S. gordonii as well as species such as S. tigurinus, S. oligofermentans and S. australis that have only recently been classified and are poorly understood at present. We present StreptoBase, which provides a specialized free resource focusing on the genomic analyses of oral species from the mitis group. It currently hosts 104 S. mitis group genomes including 27 novel mitis group strains that we sequenced using the high throughput Illumina HiSeq technology platform, and provides a comprehensive set of genome sequences for analyses, particularly comparative analyses and visualization of both cross-species and cross-strain characteristics of S. mitis group bacteria. StreptoBase incorporates sophisticated in-house designed bioinformatics web tools such as Pairwise Genome Comparison (PGC) tool and Pathogenomic Profiling Tool (PathoProT), which facilitate comparative pathogenomics analysis of Streptococcus strains. Examples are provided to demonstrate how StreptoBase can be employed to compare genome structure of different S. mitis group bacteria and putative virulence genes profile across multiple streptococcal strains. In conclusion, StreptoBase offers access to a range of streptococci genomic resources as well as analysis tools and will be an invaluable platform to accelerate research in streptococci. Database URL: http://streptococcus.um.edu.my.

  14. A critical cluster analysis of 44 indicators of author-level performance

    DEFF Research Database (Denmark)

    Wildgaard, Lorna Elizabeth

    2016-01-01

    -four indicators of individual researcher performance were computed using the data. The clustering solution was supported by continued reference to the researcher’s curriculum vitae, an effect analysis and a risk analysis. Disciplinary appropriate indicators were identified and used to divide the researchers......This paper explores a 7-stage cluster methodology as a process to identify appropriate indicators for evaluation of individual researchers at a disciplinary and seniority level. Publication and citation data for 741 researchers from 4 disciplines was collected in Web of Science. Forty...... of statistics in research evaluation. The strength of the 7-stage cluster methodology is that it makes clear that in the evaluation of individual researchers, statistics cannot stand alone. The methodology is reliant on contextual information to verify the bibliometric values and cluster solution...

  15. Applying clustering to statistical analysis of student reasoning about two-dimensional kinematics

    Directory of Open Access Journals (Sweden)

    R. Padraic Springuel

    2007-12-01

    Full Text Available We use clustering, an analysis method not presently common to the physics education research community, to group and characterize student responses to written questions about two-dimensional kinematics. Previously, clustering has been used to analyze multiple-choice data; we analyze free-response data that includes both sketches of vectors and written elements. The primary goal of this paper is to describe the methodology itself; we include a brief overview of relevant results.

  16. A Deep Learning Prediction Model Based on Extreme-Point Symmetric Mode Decomposition and Cluster Analysis

    OpenAIRE

    Li, Guohui; Zhang, Songling; Yang, Hong

    2017-01-01

    Aiming at the irregularity of nonlinear signal and its predicting difficulty, a deep learning prediction model based on extreme-point symmetric mode decomposition (ESMD) and clustering analysis is proposed. Firstly, the original data is decomposed by ESMD to obtain the finite number of intrinsic mode functions (IMFs) and residuals. Secondly, the fuzzy c-means is used to cluster the decomposed components, and then the deep belief network (DBN) is used to predict it. Finally, the reconstructed ...

  17. Statistical analysis of activation and reaction energies with quasi-variational coupled-cluster theory

    Science.gov (United States)

    Black, Joshua A.; Knowles, Peter J.

    2018-06-01

    The performance of quasi-variational coupled-cluster (QV) theory applied to the calculation of activation and reaction energies has been investigated. A statistical analysis of results obtained for six different sets of reactions has been carried out, and the results have been compared to those from standard single-reference methods. In general, the QV methods lead to increased activation energies and larger absolute reaction energies compared to those obtained with traditional coupled-cluster theory.

  18. Analysis and comparison of very large metagenomes with fast clustering and functional annotation

    Directory of Open Access Journals (Sweden)

    Li Weizhong

    2009-10-01

    Full Text Available Abstract Background The remarkable advance of metagenomics presents significant new challenges in data analysis. Metagenomic datasets (metagenomes are large collections of sequencing reads from anonymous species within particular environments. Computational analyses for very large metagenomes are extremely time-consuming, and there are often many novel sequences in these metagenomes that are not fully utilized. The number of available metagenomes is rapidly increasing, so fast and efficient metagenome comparison methods are in great demand. Results The new metagenomic data analysis method Rapid Analysis of Multiple Metagenomes with a Clustering and Annotation Pipeline (RAMMCAP was developed using an ultra-fast sequence clustering algorithm, fast protein family annotation tools, and a novel statistical metagenome comparison method that employs a unique graphic interface. RAMMCAP processes extremely large datasets with only moderate computational effort. It identifies raw read clusters and protein clusters that may include novel gene families, and compares metagenomes using clusters or functional annotations calculated by RAMMCAP. In this study, RAMMCAP was applied to the two largest available metagenomic collections, the "Global Ocean Sampling" and the "Metagenomic Profiling of Nine Biomes". Conclusion RAMMCAP is a very fast method that can cluster and annotate one million metagenomic reads in only hundreds of CPU hours. It is available from http://tools.camera.calit2.net/camera/rammcap/.

  19. Water quality assessment with hierarchical cluster analysis based on Mahalanobis distance.

    Science.gov (United States)

    Du, Xiangjun; Shao, Fengjing; Wu, Shunyao; Zhang, Hanlin; Xu, Si

    2017-07-01

    Water quality assessment is crucial for assessment of marine eutrophication, prediction of harmful algal blooms, and environment protection. Previous studies have developed many numeric modeling methods and data driven approaches for water quality assessment. The cluster analysis, an approach widely used for grouping data, has also been employed. However, there are complex correlations between water quality variables, which play important roles in water quality assessment but have always been overlooked. In this paper, we analyze correlations between water quality variables and propose an alternative method for water quality assessment with hierarchical cluster analysis based on Mahalanobis distance. Further, we cluster water quality data collected form coastal water of Bohai Sea and North Yellow Sea of China, and apply clustering results to evaluate its water quality. To evaluate the validity, we also cluster the water quality data with cluster analysis based on Euclidean distance, which are widely adopted by previous studies. The results show that our method is more suitable for water quality assessment with many correlated water quality variables. To our knowledge, it is the first attempt to apply Mahalanobis distance for coastal water quality assessment.

  20. Clustering analysis for muon tomography data elaboration in the Muon Portal project

    Science.gov (United States)

    Bandieramonte, M.; Antonuccio-Delogu, V.; Becciani, U.; Costa, A.; La Rocca, P.; Massimino, P.; Petta, C.; Pistagna, C.; Riggi, F.; Riggi, S.; Sciacca, E.; Vitello, F.

    2015-05-01

    Clustering analysis is one of multivariate data analysis techniques which allows to gather statistical data units into groups, in order to minimize the logical distance within each group and to maximize the one between different groups. In these proceedings, the authors present a novel approach to the muontomography data analysis based on clustering algorithms. As a case study we present the Muon Portal project that aims to build and operate a dedicated particle detector for the inspection of harbor containers to hinder the smuggling of nuclear materials. Clustering techniques, working directly on scattering points, help to detect the presence of suspicious items inside the container, acting, as it will be shown, as a filter for a preliminary analysis of the data.

  1. Profiling nurses' job satisfaction, acculturation, work environment, stress, cultural values and coping abilities: A cluster analysis.

    Science.gov (United States)

    Goh, Yong-Shian; Lee, Alice; Chan, Sally Wai-Chi; Chan, Moon Fai

    2015-08-01

    This study aimed to determine whether definable profiles existed in a cohort of nursing staff with regard to demographic characteristics, job satisfaction, acculturation, work environment, stress, cultural values and coping abilities. A survey was conducted in one hospital in Singapore from June to July 2012, and 814 full-time staff nurses completed a self-report questionnaire (89% response rate). Demographic characteristics, job satisfaction, acculturation, work environment, perceived stress, cultural values, ways of coping and intention to leave current workplace were assessed as outcomes. The two-step cluster analysis revealed three clusters. Nurses in cluster 1 (n = 222) had lower acculturation scores than nurses in cluster 3. Cluster 2 (n = 362) was a group of younger nurses who reported higher intention to leave (22.4%), stress level and job dissatisfaction than the other two clusters. Nurses in cluster 3 (n = 230) were mostly Singaporean and reported the lowest intention to leave (13.0%). Resources should be allocated to specifically address the needs of younger nurses and hopefully retain them in the profession. Management should focus their retention strategies on junior nurses and provide a work environment that helps to strengthen their intention to remain in nursing by increasing their job satisfaction. © 2014 Wiley Publishing Asia Pty Ltd.

  2. Fatigue Feature Extraction Analysis based on a K-Means Clustering Approach

    Directory of Open Access Journals (Sweden)

    M.F.M. Yunoh

    2015-06-01

    Full Text Available This paper focuses on clustering analysis using a K-means approach for fatigue feature dataset extraction. The aim of this study is to group the dataset as closely as possible (homogeneity for the scattered dataset. Kurtosis, the wavelet-based energy coefficient and fatigue damage are calculated for all segments after the extraction process using wavelet transform. Kurtosis, the wavelet-based energy coefficient and fatigue damage are used as input data for the K-means clustering approach. K-means clustering calculates the average distance of each group from the centroid and gives the objective function values. Based on the results, maximum values of the objective function can be seen in the two centroid clusters, with a value of 11.58. The minimum objective function value is found at 8.06 for five centroid clusters. It can be seen that the objective function with the lowest value for the number of clusters is equal to five; which is therefore the best cluster for the dataset.

  3. Cluster analysis for the probability of DSB site induced by electron tracks

    Energy Technology Data Exchange (ETDEWEB)

    Yoshii, Y. [Biological Research, Education and Instrumentation Center, Sapporo Medical University, Sapporo 060-8556 (Japan); Graduate School of Health Sciences, Hokkaido University, Sapporo 060-0812 (Japan); Sasaki, K. [Faculty of Health Sciences, Hokkaido University of Science, Sapporo 006-8585 (Japan); Matsuya, Y. [Graduate School of Health Sciences, Hokkaido University, Sapporo 060-0812 (Japan); Date, H., E-mail: date@hs.hokudai.ac.jp [Faculty of Health Sciences, Hokkaido University, Sapporo 060-0812 (Japan)

    2015-05-01

    To clarify the influence of bio-cells exposed to ionizing radiations, the densely populated pattern of the ionization in the cell nucleus is of importance because it governs the extent of DNA damage which may lead to cell lethality. In this study, we have conducted a cluster analysis of ionization and excitation events to estimate the number of double-strand breaks (DSBs) induced by electron tracks. A Monte Carlo simulation for electrons in liquid water was performed to determine the spatial location of the ionization and excitation events. The events were divided into clusters by using the density-based spatial clustering of applications with noise (DBSCAN) algorithm. The algorithm enables us to sort out the events into the groups (clusters) in which a minimum number of neighboring events are contained within a given radius. For evaluating the number of DSBs in the extracted clusters, we have introduced an aggregation index (AI). The computational results show that a sub-keV electron produces DSBs in a dense formation more effectively than higher energy electrons. The root-mean square radius (RMSR) of the cluster size is below 5 nm, which is smaller than the chromatin fiber thickness. It was found that this size of clustering events has a high possibility to cause lesions in DNA within the chromatin fiber site.

  4. TranslatomeDB: a comprehensive database and cloud-based analysis platform for translatome sequencing data.

    Science.gov (United States)

    Liu, Wanting; Xiang, Lunping; Zheng, Tingkai; Jin, Jingjie; Zhang, Gong

    2018-01-04

    Translation is a key regulatory step, linking transcriptome and proteome. Two major methods of translatome investigations are RNC-seq (sequencing of translating mRNA) and Ribo-seq (ribosome profiling). To facilitate the investigation of translation, we built a comprehensive database TranslatomeDB (http://www.translatomedb.net/) which provides collection and integrated analysis of published and user-generated translatome sequencing data. The current version includes 2453 Ribo-seq, 10 RNC-seq and their 1394 corresponding mRNA-seq datasets in 13 species. The database emphasizes the analysis functions in addition to the dataset collections. Differential gene expression (DGE) analysis can be performed between any two datasets of same species and type, both on transcriptome and translatome levels. The translation indices translation ratios, elongation velocity index and translational efficiency can be calculated to quantitatively evaluate translational initiation efficiency and elongation velocity, respectively. All datasets were analyzed using a unified, robust, accurate and experimentally-verifiable pipeline based on the FANSe3 mapping algorithm and edgeR for DGE analyzes. TranslatomeDB also allows users to upload their own datasets and utilize the identical unified pipeline to analyze their data. We believe that our TranslatomeDB is a comprehensive platform and knowledgebase on translatome and proteome research, releasing the biologists from complex searching, analyzing and comparing huge sequencing data without needing local computational power. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  5. Inverse Opal Photonic Crystals as an Optofluidic Platform for Fast Analysis of Hydrocarbon Mixtures.

    Science.gov (United States)

    Xu, Qiwei; Mahpeykar, Seyed Milad; Burgess, Ian B; Wang, Xihua

    2018-06-13

    Most of the reported optofluidic devices analyze liquid by measuring its refractive index. Recently, the wettability of liquid on various substrates has also been used as a key sensing parameter in optofluidic sensors. However, the above-mentioned techniques face challenges in the analysis of the relative concentration of components in an alkane hydrocarbon mixture, as both refractive indices and wettabilities of alkane hydrocarbons are very close. Here, we propose to apply volatility of liquid as the key sensing parameter, correlate it to the optical property of liquid inside inverse opal photonic crystals, and construct powerful optofluidic sensors for alkane hydrocarbon identification and analysis. We have demonstrated that via evaporation of hydrocarbons inside the per