WorldWideScience

Sample records for platforms cluster analysis

  1. A Web-Based Interactive Platform for Co-Clustering Spatio-Temporal Data

    Science.gov (United States)

    Wu, X.; Poorthuis, A.; Zurita-Milla, R.; Kraak, M.-J.

    2017-09-01

    Since current studies on clustering analysis mainly focus on exploring spatial or temporal patterns separately, a co-clustering algorithm is utilized in this study to enable the concurrent analysis of spatio-temporal patterns. To allow users to adopt and adapt the algorithm for their own analysis, it is integrated within the server side of an interactive web-based platform. The client side of the platform, running within any modern browser, is a graphical user interface (GUI) with multiple linked visualizations that facilitates the understanding, exploration and interpretation of the raw dataset and co-clustering results. Users can also upload their own datasets and adjust clustering parameters within the platform. To illustrate the use of this platform, an annual temperature dataset from 28 weather stations over 20 years in the Netherlands is used. After the dataset is loaded, it is visualized in a set of linked visualizations: a geographical map, a timeline and a heatmap. This aids the user in understanding the nature of their dataset and the appropriate selection of co-clustering parameters. Once the dataset is processed by the co-clustering algorithm, the results are visualized in small multiples, a heatmap and a timeline to provide various views for better understanding and further interpretation. Since the visualization and analysis are integrated in a seamless platform, the user can explore different sets of co-clustering parameters and instantly view the results in order to perform iterative, exploratory data analysis. As such, this interactive web-based platform allows users to analyze spatio-temporal data using the co-clustering method and helps them understand the results through multiple linked visualizations.
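
    To make the co-clustering idea concrete, the sketch below groups rows (stations) and columns (time steps) of a temperature matrix simultaneously. It is a hedged illustration only: the data are synthetic, and scikit-learn's SpectralCoclustering stands in for whatever co-clustering algorithm the platform actually integrates.

```python
# Illustrative co-clustering of a stations-by-time matrix (synthetic data).
# SpectralCoclustering is a stand-in; the platform's own algorithm may differ.
import numpy as np
from sklearn.cluster import SpectralCoclustering

rng = np.random.default_rng(0)
# Hypothetical non-negative data: 28 stations x 240 monthly mean temperatures.
temps = rng.uniform(0.0, 30.0, size=(28, 240))

model = SpectralCoclustering(n_clusters=4, random_state=0)
model.fit(temps)

# Each station and each time step receives a co-cluster label, so spatial
# and temporal patterns are grouped in one pass.
print("stations per co-cluster:", np.bincount(model.row_labels_))
print("time steps per co-cluster:", np.bincount(model.column_labels_))
```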

  2. Application of microarray analysis on computer cluster and cloud platforms.

    Science.gov (United States)

    Bernau, C; Boulesteix, A-L; Knaus, J

    2013-01-01

    Analysis of recent high-dimensional biological data tends to be computationally intensive as many common approaches such as resampling or permutation tests require the basic statistical analysis to be repeated many times. A crucial advantage of these methods is that they can be easily parallelized due to the computational independence of the resampling or permutation iterations, which has induced many statistics departments to establish their own computer clusters. An alternative is to rent computing resources in the cloud, e.g. at Amazon Web Services. In this article we analyze whether a selection of statistical projects, recently implemented at our department, can be efficiently realized on these cloud resources. Moreover, we illustrate an opportunity to combine computer cluster and cloud resources. In order to compare the efficiency of computer cluster and cloud implementations and their respective parallelizations we use microarray analysis procedures and compare their runtimes on the different platforms. Amazon Web Services provide various instance types which meet the particular needs of the different statistical projects we analyzed in this paper. Moreover, the network capacity is sufficient and the parallelization is comparable in efficiency to standard computer cluster implementations. Our results suggest that many statistical projects can be efficiently realized on cloud resources. It is important to mention, however, that workflows can change substantially as a result of a shift from computer cluster to cloud computing.
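
    The computational independence the authors exploit can be illustrated with a small, hedged sketch: a bootstrap of a group-mean difference farmed out to local CPU cores. The data and statistic are invented for illustration; on a departmental cluster or a cloud instance the same worker function would simply be dispatched through a batch scheduler rather than a local process pool.

```python
# Embarrassingly parallel resampling: each bootstrap iteration is independent,
# so iterations can be mapped onto separate processes (or cluster/cloud nodes).
import numpy as np
from concurrent.futures import ProcessPoolExecutor

def one_bootstrap(seed, a, b):
    rng = np.random.default_rng(seed)
    ra = rng.choice(a, size=a.size, replace=True)
    rb = rng.choice(b, size=b.size, replace=True)
    return ra.mean() - rb.mean()

if __name__ == "__main__":
    rng = np.random.default_rng(42)
    group_a = rng.normal(0.0, 1.0, 50)   # hypothetical expression values
    group_b = rng.normal(0.3, 1.0, 50)

    n_iter = 1000
    with ProcessPoolExecutor() as pool:
        stats = list(pool.map(one_bootstrap, range(n_iter),
                              [group_a] * n_iter, [group_b] * n_iter))
    print("bootstrap SE of the mean difference:", round(float(np.std(stats)), 3))
```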

  3. A novel model for Time-Series Data Clustering Based on piecewise SVD and BIRCH for Stock Data Analysis on Hadoop Platform

    Directory of Open Access Journals (Sweden)

    Ibgtc Bowala

    2017-06-01

    With the rapid growth of financial markets, analysts are paying more attention to prediction. Stock data are time series data of huge volume. A feasible solution for handling the increasing amount of data is to use a cluster for parallel processing, and the Hadoop parallel computing platform is a typical representative. There are various statistical models for forecasting time series data, but accurate clusters are a prerequisite. Clustering analysis of time series data is one of the main methods for mining such data and underpins many other analysis processes. However, general clustering algorithms perform poorly on time series data because the series have a special structure, high dimensionality, and highly correlated values with a high noise level. A novel model for time series clustering is presented using BIRCH based on piecewise SVD, leading to a novel dimension-reduction approach. Highly correlated features are handled using SVD with a novel approach for dimensionality reduction, in order to keep the correlated behavior optimal, and BIRCH is then used for clustering. The algorithm is a novel model that can handle massive time series data. Finally, the new model is successfully applied to real stock time series data from Yahoo Finance with satisfactory results.
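
    A hedged sketch of the general recipe described above follows: segment each series, reduce each segment with SVD, then cluster the reduced features with BIRCH. The segmentation scheme, feature choice and parameters are illustrative assumptions rather than the paper's exact model, and the data are synthetic random walks rather than Yahoo Finance prices.

```python
# Piecewise SVD feature reduction followed by BIRCH clustering (illustrative).
import numpy as np
from sklearn.cluster import Birch

rng = np.random.default_rng(1)
prices = rng.normal(size=(100, 240)).cumsum(axis=1)  # 100 hypothetical stock series

def piecewise_svd_features(series, n_segments=12, k=2):
    """Split each series into segments and keep each segment's projection
    onto its top-k right singular vectors as reduced features."""
    segments = np.array_split(series, n_segments, axis=1)
    feats = []
    for seg in segments:
        seg = seg - seg.mean(axis=0, keepdims=True)       # center per time point
        _, _, vt = np.linalg.svd(seg, full_matrices=False)
        feats.append(seg @ vt[:k].T)                      # (n_series, k) block
    return np.hstack(feats)

features = piecewise_svd_features(prices)
labels = Birch(n_clusters=5).fit_predict(features)
print("series per cluster:", np.bincount(labels))
```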

  4. WebGimm: An integrated web-based platform for cluster analysis, functional analysis, and interactive visualization of results.

    Science.gov (United States)

    Joshi, Vineet K; Freudenberg, Johannes M; Hu, Zhen; Medvedovic, Mario

    2011-01-17

    Cluster analysis methods have been extensively researched, but the adoption of new methods is often hindered by technical barriers in their implementation and use. WebGimm is a free cluster analysis web-service, and an open source general purpose clustering web-server infrastructure designed to facilitate easy deployment of integrated cluster analysis servers based on clustering and functional annotation algorithms implemented in R. Integrated functional analyses and interactive browsing of both the clustering structure and the functional annotations provide a complete analytical environment for cluster analysis and interpretation of results. The Java Web Start client-based interface is modeled after the familiar cluster/treeview packages, making its use intuitive to a wide array of biomedical researchers. For biomedical researchers, WebGimm provides an avenue to access state-of-the-art clustering procedures. For bioinformatics methods developers, WebGimm offers a convenient avenue to deploy their newly developed clustering methods. The WebGimm server, software and manuals can be freely accessed at http://ClusterAnalysis.org/.

  5. Genome cluster database. A sequence family analysis platform for Arabidopsis and rice.

    Science.gov (United States)

    Horan, Kevin; Lauricha, Josh; Bailey-Serres, Julia; Raikhel, Natasha; Girke, Thomas

    2005-05-01

    The genome-wide protein sequences from Arabidopsis (Arabidopsis thaliana) and rice (Oryza sativa) ssp. japonica were clustered into families using sequence similarity and domain-based clustering. The two fundamentally different methods resulted in separate cluster sets with complementary properties that compensate for each other's limitations in accurate family analysis. Functional names for the identified families were assigned with an efficient computational approach that uses the description of the most common molecular function gene ontology node within each cluster. Subsequently, multiple alignments and phylogenetic trees were calculated for the assembled families. All clustering results and their underlying sequences were organized in the Web-accessible Genome Cluster Database (http://bioinfo.ucr.edu/projects/GCD) with rich interactive and user-friendly sequence family mining tools to facilitate the analysis of any given family of interest for the plant science community. An automated clustering pipeline ensures current information in future updates of the two genomes' annotations and of clustering improvements. The analysis allowed the first systematic identification of family and singlet proteins present in both organisms as well as those restricted to one of them. In addition, the established Web resources for mining these data provide a road map for future studies of the composition and structure of protein families between the two species.

  6. Extracting Synthetic Multi-Cluster Platform Configurations from Grid'5000 for Driving Simulation Experiments

    OpenAIRE

    Suter, Frédéric; Casanova, Henri

    2007-01-01

    This report presents a collection of synthetic but realistic distributed computing platform configurations. These configurations are intended for simulation experiments in the study of parallel applications on multi-cluster platforms.

  7. Embedded-Based Graphics Processing Unit Cluster Platform for Multiple Sequence Alignments

    Directory of Open Access Journals (Sweden)

    Jyh-Da Wei

    2017-08-01

    High-end graphics processing units (GPUs), such as NVIDIA Tesla/Fermi/Kepler series cards with thousands of cores per chip, have been widely applied to high-performance computing over the past decade. These desktop GPU cards must be installed in personal computers or servers with desktop CPUs, so the cost and power consumption of constructing a GPU cluster platform are very high. In recent years, NVIDIA released an embedded board, called Jetson Tegra K1 (TK1), which contains four ARM Cortex-A15 CPUs and 192 Compute Unified Device Architecture cores (belonging to the Kepler GPU family). Jetson Tegra K1 has several advantages, such as low cost, low power consumption, and high applicability, and it has been applied in several specific applications. In our previous work, a bioinformatics platform with a single TK1 (STK platform) was constructed, and that work also showed that Web and mobile services can be implemented on the STK platform with a good cost-performance ratio, by comparing the STK platform with a desktop CPU and GPU. In this work, an embedded GPU cluster platform is constructed with multiple TK1s (MTK platform). Complex system installation and setup are necessary procedures at first. Then, two job assignment modes are designed for the MTK platform to provide services for users. Finally, ClustalW v2.0.11 and ClustalWtk are ported to the MTK platform. The experimental results showed that the speedup ratios achieved 5.5 and 4.8 times for ClustalW v2.0.11 and ClustalWtk, respectively, when comparing six TK1s with a single TK1. The MTK platform is thus proven to be useful for multiple sequence alignments.

  8. Embedded-Based Graphics Processing Unit Cluster Platform for Multiple Sequence Alignments.

    Science.gov (United States)

    Wei, Jyh-Da; Cheng, Hui-Jun; Lin, Chun-Yuan; Ye, Jin; Yeh, Kuan-Yu

    2017-01-01

    High-end graphics processing units (GPUs), such as NVIDIA Tesla/Fermi/Kepler series cards with thousands of cores per chip, have been widely applied to high-performance computing over the past decade. These desktop GPU cards must be installed in personal computers or servers with desktop CPUs, so the cost and power consumption of constructing a GPU cluster platform are very high. In recent years, NVIDIA released an embedded board, called Jetson Tegra K1 (TK1), which contains four ARM Cortex-A15 CPUs and 192 Compute Unified Device Architecture cores (belonging to the Kepler GPU family). Jetson Tegra K1 has several advantages, such as low cost, low power consumption, and high applicability, and it has been applied in several specific applications. In our previous work, a bioinformatics platform with a single TK1 (STK platform) was constructed, and that work also showed that Web and mobile services can be implemented on the STK platform with a good cost-performance ratio, by comparing the STK platform with a desktop CPU and GPU. In this work, an embedded GPU cluster platform is constructed with multiple TK1s (MTK platform). Complex system installation and setup are necessary procedures at first. Then, two job assignment modes are designed for the MTK platform to provide services for users. Finally, ClustalW v2.0.11 and ClustalWtk are ported to the MTK platform. The experimental results showed that the speedup ratios achieved 5.5 and 4.8 times for ClustalW v2.0.11 and ClustalWtk, respectively, when comparing six TK1s with a single TK1. The MTK platform is thus proven to be useful for multiple sequence alignments.

  9. MetaABC--an integrated metagenomics platform for data adjustment, binning and clustering.

    Science.gov (United States)

    Su, Chien-Hao; Hsu, Ming-Tsung; Wang, Tse-Yi; Chiang, Sufeng; Cheng, Jen-Hao; Weng, Francis C; Kao, Cheng-Yan; Wang, Daryi; Tsai, Huai-Kuang

    2011-08-15

    MetaABC is a metagenomic platform that integrates several binning tools coupled with methods for removing artifacts, analyzing unassigned reads and controlling sampling biases. It allows users to arrive at a better interpretation via a series of distinct combinations of analysis tools. After execution, MetaABC provides outputs in various visual formats such as tables, pie and bar charts, as well as clustering result diagrams. MetaABC source code and documentation are available at http://bits2.iis.sinica.edu.tw/MetaABC/. Contact: dywang@gate.sinica.edu.tw; hktsai@iis.sinica.edu.tw. Supplementary data are available at Bioinformatics online.

  10. Scientific data analysis on data-parallel platforms.

    Energy Technology Data Exchange (ETDEWEB)

    Ulmer, Craig D.; Bayer, Gregory W.; Choe, Yung Ryn; Roe, Diana C.

    2010-09-01

    As scientific computing users migrate to petaflop platforms that promise to generate multi-terabyte datasets, there is a growing need in the community to be able to embed sophisticated analysis algorithms in the computing platforms' storage systems. Data Warehouse Appliances (DWAs) are attractive for this work, due to their ability to store and process massive datasets efficiently. While DWAs have been utilized effectively in data-mining and informatics applications, they remain largely unproven in scientific workloads. In this paper we present our experiences in adapting two mesh analysis algorithms to function on five different DWA architectures: two Netezza database appliances, an XtremeData dbX database, a LexisNexis DAS, and multiple Hadoop MapReduce clusters. The main contribution of this work is insight into the differences between these DWAs from a user's perspective. In addition, we present performance measurements for ten DWA systems to help understand the impact of different architectural trade-offs in these systems.

  11. VR-Cluster: Dynamic Migration for Resource Fragmentation Problem in Virtual Router Platform

    Directory of Open Access Journals (Sweden)

    Xianming Gao

    2016-01-01

    Network virtualization technology is regarded as one of the gradual approaches to network architecture evolution. With the development of network functions virtualization, operators make a great effort to achieve router virtualization using general-purpose servers. In order to ensure high performance, a virtual router platform usually adopts a cluster of general-purpose servers, which can also be regarded as a special cloud computing environment. However, the frequent creation and deletion of router instances may generate substantial resource fragmentation that prevents the platform from establishing new router instances. To solve this "resource fragmentation problem," we first propose VR-Cluster, which introduces two extra function planes: a switching plane and a resource management plane. The switching plane is mainly used to support seamless migration of router instances without packet loss; the resource management plane can dynamically move router instances from one server to another using VR-mapping algorithms. Three VR-mapping algorithms, namely first-fit, best-fit, and worst-fit mapping, are proposed on top of VR-Cluster. Finally, we build a VR-Cluster prototype system using general x86 servers, evaluate its migration time, and further analyze the advantages and disadvantages of the proposed VR-mapping algorithms in solving the resource fragmentation problem.
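
    The three mapping policies named in the abstract can be illustrated with a small, hedged sketch in which each server is reduced to a single remaining-capacity number and each router instance to a single demand; VR-Cluster itself of course tracks richer state than this.

```python
# Toy first-fit / best-fit / worst-fit placement of router instances on servers.
def place(instance_demand, servers, policy="first"):
    """Return the index of the chosen server, or None if nothing fits.
    `servers` is a list of remaining capacities, modified in place."""
    candidates = [i for i, cap in enumerate(servers) if cap >= instance_demand]
    if not candidates:
        return None
    if policy == "first":      # first server with enough free capacity
        chosen = candidates[0]
    elif policy == "best":     # tightest fit: least leftover capacity
        chosen = min(candidates, key=lambda i: servers[i] - instance_demand)
    elif policy == "worst":    # loosest fit: most leftover capacity
        chosen = max(candidates, key=lambda i: servers[i] - instance_demand)
    else:
        raise ValueError(policy)
    servers[chosen] -= instance_demand
    return chosen

servers = [8.0, 8.0, 8.0]              # three hypothetical servers
for demand in [3.0, 5.0, 2.0, 4.0, 2.0]:
    print(demand, "->", place(demand, servers, policy="best"), servers)
```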

  12. MASPECTRAS: a platform for management and analysis of proteomics LC-MS/MS data

    Directory of Open Access Journals (Sweden)

    Rader Robert

    2007-06-01

    Background: The advancement of proteomics technologies has led to a rapid increase in the number, size and rate at which datasets are generated. Managing and extracting valuable information from such datasets requires the use of data management platforms and computational approaches. Results: We have developed the MAss SPECTRometry Analysis System (MASPECTRAS), a platform for management and analysis of proteomics LC-MS/MS data. MASPECTRAS is based on the Proteome Experimental Data Repository (PEDRo) relational database schema and follows the guidelines of the Proteomics Standards Initiative (PSI). Analysis modules include: (1) import and parsing of the results from the search engines SEQUEST, Mascot, Spectrum Mill, X! Tandem, and OMSSA; (2) peptide validation; (3) clustering of proteins based on Markov clustering and multiple alignments; and (4) quantification using the Automated Statistical Analysis of Protein Abundance Ratios (ASAPRatio) algorithm. The system provides customizable data retrieval and visualization tools, as well as export to the PRoteomics IDEntifications (PRIDE) public repository. MASPECTRAS is freely available at http://genome.tugraz.at/maspectras. Conclusion: Given its unique features and the flexibility due to the use of standard software technology, our platform represents a significant advance and could be of great interest to the proteomics community.

  13. ClusterCAD: a computational platform for type I modular polyketide synthase design

    DEFF Research Database (Denmark)

    Eng, Clara H.; Backman, Tyler W. H.; Bailey, Constance B.

    2018-01-01

    …barrier to the design of active variants, and identifying strategies to reliably construct functional PKS chimeras remains an active area of research. In this work, we formalize a paradigm for the design of PKS chimeras and introduce ClusterCAD as a computational platform to streamline and simplify…

  14. Cost (non)-recovery by platform technology facilities in the Bio21 Cluster.

    Science.gov (United States)

    Gibbs, Gerard; Clark, Stella; Quinn, Julieanne; Gleeson, Mary Joy

    2010-04-01

    Platform technologies (PT) are techniques or tools that enable a range of scientific investigations and are critical to today's advanced technology research environment. Once installed, they require specialized staff for their operations, who, in turn, provide expertise to researchers in designing appropriate experiments. Through this pipeline, research outputs are raised to the benefit of the researcher and the host institution. Platform facilities provide access to instrumentation and expertise for a wide range of users beyond the host institution, including other academic and industry users. To maximize the return on these substantial public investments, this wider access needs to be supported. The question of support and the mechanisms through which this occurs need to be established based on a greater understanding of how PT facilities operate. This investigation was aimed at understanding if and how platform facilities across the Bio21 Cluster meet operating costs. Our investigation found: 74% of platforms surveyed do not recover 100% of direct operating costs and are heavily subsidized by their home institution, which has a vested interest in maintaining the technology platform; platform managers play a major role in establishing the costs and pricing of the facility, normally in a collaborative process with a management committee or institutional accountant; and most facilities have a three-tier pricing structure recognizing internal academic, external academic, and commercial clients.

  15. Integrative Data Analysis of Multi-Platform Cancer Data with a Multimodal Deep Learning Approach.

    Science.gov (United States)

    Liang, Muxuan; Li, Zhizhong; Chen, Ting; Zeng, Jianyang

    2015-01-01

    Identification of cancer subtypes plays an important role in revealing useful insights into disease pathogenesis and advancing personalized therapy. The recent development of high-throughput sequencing technologies has enabled the rapid collection of multi-platform genomic data (e.g., gene expression, miRNA expression, and DNA methylation) for the same set of tumor samples. Although numerous integrative clustering approaches have been developed to analyze cancer data, few of them are particularly designed to exploit both deep intrinsic statistical properties of each input modality and complex cross-modality correlations among multi-platform input data. In this paper, we propose a new machine learning model, called multimodal deep belief network (DBN), to cluster cancer patients from multi-platform observation data. In our integrative clustering framework, relationships among inherent features of each single modality are first encoded into multiple layers of hidden variables, and then a joint latent model is employed to fuse common features derived from multiple input modalities. A practical learning algorithm, called contrastive divergence (CD), is applied to infer the parameters of our multimodal DBN model in an unsupervised manner. Tests on two available cancer datasets show that our integrative data analysis approach can effectively extract a unified representation of latent features to capture both intra- and cross-modality correlations, and identify meaningful disease subtypes from multi-platform cancer data. In addition, our approach can identify key genes and miRNAs that may play distinct roles in the pathogenesis of different cancer subtypes. Among those key miRNAs, we found that the expression level of miR-29a is highly correlated with survival time in ovarian cancer patients. These results indicate that our multimodal DBN based data analysis approach may have practical applications in cancer pathogenesis studies and provide useful guidelines for
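
    As a hedged illustration of the contrastive divergence step mentioned above, the sketch below trains a single Bernoulli restricted Boltzmann machine with CD-1 on synthetic binary features. The paper stacks such layers per data modality and adds a joint layer on top; neither that architecture nor the real multi-omics data is reproduced here.

```python
# Contrastive divergence (CD-1) training of one Bernoulli RBM in NumPy.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

n_visible, n_hidden, lr = 20, 8, 0.1
W = 0.01 * rng.normal(size=(n_visible, n_hidden))
b_v = np.zeros(n_visible)          # visible biases
b_h = np.zeros(n_hidden)           # hidden biases

# Hypothetical binarized omics features for 200 tumour samples.
data = (rng.random((200, n_visible)) > 0.5).astype(float)

for epoch in range(50):
    for v0 in data:
        # Positive phase: sample hidden units given the data vector.
        p_h0 = sigmoid(v0 @ W + b_h)
        h0 = (rng.random(n_hidden) < p_h0).astype(float)
        # Negative phase: one Gibbs step (CD-1) back to a reconstruction.
        p_v1 = sigmoid(h0 @ W.T + b_v)
        v1 = (rng.random(n_visible) < p_v1).astype(float)
        p_h1 = sigmoid(v1 @ W + b_h)
        # Parameter updates from the difference of data and model correlations.
        W += lr * (np.outer(v0, p_h0) - np.outer(v1, p_h1))
        b_v += lr * (v0 - v1)
        b_h += lr * (p_h0 - p_h1)

print("learned weight matrix shape:", W.shape)
```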

  16. Cluster analysis for applications

    CERN Document Server

    Anderberg, Michael R

    1973-01-01

    Cluster Analysis for Applications deals with methods and various applications of cluster analysis. Topics covered range from variables and scales to measures of association among variables and among data units. Conceptual problems in cluster analysis are discussed, along with hierarchical and non-hierarchical clustering methods. The necessary elements of data analysis, statistics, cluster analysis, and computer implementation are integrated vertically to cover the complete path from raw data to a finished analysis. Comprised of 10 chapters, this book begins with an introduction to the subject o

  17. PIIKA 2: an expanded, web-based platform for analysis of kinome microarray data.

    Directory of Open Access Journals (Sweden)

    Brett Trost

    Kinome microarrays are comprised of peptides that act as phosphorylation targets for protein kinases. This platform is growing in popularity due to its ability to measure phosphorylation-mediated cellular signaling in a high-throughput manner. While software for analyzing data from DNA microarrays has also been used for kinome arrays, differences between the two technologies and associated biologies previously led us to develop Platform for Intelligent, Integrated Kinome Analysis (PIIKA), a software tool customized for the analysis of data from kinome arrays. Here, we report the development of PIIKA 2, a significantly improved version with new features and improvements in the areas of clustering, statistical analysis, and data visualization. Among other additions to the original PIIKA, PIIKA 2 now allows the user to: evaluate statistically how well groups of samples cluster together; identify sets of peptides that have consistent phosphorylation patterns among groups of samples; perform hierarchical clustering analysis with bootstrapping; view false negative probabilities and positive and negative predictive values for t-tests between pairs of samples; easily assess experimental reproducibility; and visualize the data using volcano plots, scatterplots, and interactive three-dimensional principal component analyses. Also new in PIIKA 2 is a web-based interface, which allows users unfamiliar with command-line tools to easily provide input and download the results. Collectively, the additions and improvements described here enhance both the breadth and depth of analyses available, simplify the user interface, and make the software an even more valuable tool for the analysis of kinome microarray data. Both the web-based and stand-alone versions of PIIKA 2 can be accessed via http://saphire.usask.ca.

  18. PIIKA 2: an expanded, web-based platform for analysis of kinome microarray data.

    Science.gov (United States)

    Trost, Brett; Kindrachuk, Jason; Määttänen, Pekka; Napper, Scott; Kusalik, Anthony

    2013-01-01

    Kinome microarrays are comprised of peptides that act as phosphorylation targets for protein kinases. This platform is growing in popularity due to its ability to measure phosphorylation-mediated cellular signaling in a high-throughput manner. While software for analyzing data from DNA microarrays has also been used for kinome arrays, differences between the two technologies and associated biologies previously led us to develop Platform for Intelligent, Integrated Kinome Analysis (PIIKA), a software tool customized for the analysis of data from kinome arrays. Here, we report the development of PIIKA 2, a significantly improved version with new features and improvements in the areas of clustering, statistical analysis, and data visualization. Among other additions to the original PIIKA, PIIKA 2 now allows the user to: evaluate statistically how well groups of samples cluster together; identify sets of peptides that have consistent phosphorylation patterns among groups of samples; perform hierarchical clustering analysis with bootstrapping; view false negative probabilities and positive and negative predictive values for t-tests between pairs of samples; easily assess experimental reproducibility; and visualize the data using volcano plots, scatterplots, and interactive three-dimensional principal component analyses. Also new in PIIKA 2 is a web-based interface, which allows users unfamiliar with command-line tools to easily provide input and download the results. Collectively, the additions and improvements described here enhance both the breadth and depth of analyses available, simplify the user interface, and make the software an even more valuable tool for the analysis of kinome microarray data. Both the web-based and stand-alone versions of PIIKA 2 can be accessed via http://saphire.usask.ca.
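
    One of the features listed above, hierarchical clustering with bootstrapping, can be sketched as follows: peptides are repeatedly re-clustered on bootstrap-resampled sets of samples, and the frequency with which pairs of peptides land in the same cluster serves as a crude stability score. The data are synthetic, and PIIKA 2's own resampling scheme may differ.

```python
# Hierarchical clustering with a simple bootstrap stability estimate.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(3)
# Hypothetical kinome-array data: 40 peptides x 12 samples (log2 signals).
signals = rng.normal(size=(40, 12))
n_boot, k = 100, 4
co_clustered = np.zeros((40, 40))

for _ in range(n_boot):
    cols = rng.choice(12, size=12, replace=True)      # bootstrap the samples
    Z = linkage(signals[:, cols], method="average", metric="euclidean")
    labels = fcluster(Z, t=k, criterion="maxclust")
    co_clustered += (labels[:, None] == labels[None, :])

stability = co_clustered / n_boot   # fraction of bootstraps two peptides co-cluster
print("mean pairwise co-clustering frequency:", stability.mean().round(2))
```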

  19. A high-performance image processing platform based on CPU-GPU heterogeneous cluster with parallel image reconstructions for micro-CT

    International Nuclear Information System (INIS)

    Ding Yu; Qi Yujin; Zhang Xuezhu; Zhao Cuilan

    2011-01-01

    In this paper, we report the development of a high-performance image processing platform based on a CPU-GPU heterogeneous cluster. Currently, it consists of Dell Precision T7500 and HP XW8600 workstations with a parallel programming and runtime environment, using the message-passing interface (MPI) and CUDA (Compute Unified Device Architecture). We succeeded in developing parallel image processing techniques for 3D image reconstruction in X-ray micro-CT imaging. The results show that a GPU computes about 194 times faster than a single CPU, and that the CPU-GPU cluster computes about 46 times faster than the CPU cluster. This meets the requirements of rapid 3D image reconstruction and real-time image display. In conclusion, the use of a CPU-GPU heterogeneous cluster is an effective way to build a high-performance image processing platform. (authors)

  20. SOMFlow: Guided Exploratory Cluster Analysis with Self-Organizing Maps and Analytic Provenance.

    Science.gov (United States)

    Sacha, Dominik; Kraus, Matthias; Bernard, Jurgen; Behrisch, Michael; Schreck, Tobias; Asano, Yuki; Keim, Daniel A

    2018-01-01

    Clustering is a core building block for data analysis, aiming to extract otherwise hidden structures and relations from raw datasets, such as particular groups that can be effectively related, compared, and interpreted. A plethora of visual-interactive cluster analysis techniques have been proposed to date; however, arriving at useful clusterings often requires several rounds of user interactions to fine-tune the data preprocessing and algorithms. We present a multi-stage Visual Analytics (VA) approach for iterative cluster refinement together with an implementation (SOMFlow) that uses Self-Organizing Maps (SOM) to analyze time series data. It supports exploration by offering the analyst a visual platform to analyze intermediate results, adapt the underlying computations, iteratively partition the data, and reflect on previous analytical activities. The history of previous decisions is explicitly visualized within a flow graph, allowing the analyst to compare earlier cluster refinements and to explore relations. We further leverage quality and interestingness measures to guide the analyst in the discovery of useful patterns, relations, and data partitions. We conducted two pair analytics experiments together with a subject matter expert in speech intonation research to demonstrate that the approach is effective for interactive data analysis, supporting enhanced understanding of clustering results as well as of the interactive process itself.
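
    The core building block SOMFlow iterates on, a self-organizing map, can be sketched in a few lines of NumPy; the interactive refinement loop, flow graph and quality measures described above are not reproduced, and the feature vectors are synthetic stand-ins for intonation contours.

```python
# Minimal self-organizing map trained by online winner-take-most updates.
import numpy as np

rng = np.random.default_rng(7)
data = rng.normal(size=(500, 16))            # hypothetical contour features
rows, cols, dim = 6, 6, data.shape[1]
weights = rng.normal(size=(rows, cols, dim)) # SOM codebook vectors
grid = np.stack(np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1)

n_iter, sigma0, lr0 = 2000, 2.0, 0.5
for t in range(n_iter):
    x = data[rng.integers(len(data))]
    # Best-matching unit: node whose codebook vector is closest to x.
    dists = np.linalg.norm(weights - x, axis=-1)
    bmu = np.unravel_index(np.argmin(dists), dists.shape)
    # Linearly shrink neighbourhood radius and learning rate over time.
    frac = t / n_iter
    sigma = sigma0 * (1 - frac) + 0.5 * frac
    lr = lr0 * (1 - frac) + 0.01 * frac
    grid_d2 = np.sum((grid - np.array(bmu)) ** 2, axis=-1)
    h = np.exp(-grid_d2 / (2 * sigma ** 2))[..., None]   # neighbourhood weights
    weights += lr * h * (x - weights)

print("trained codebook shape:", weights.shape)
```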

  1. Clustering analysis

    International Nuclear Information System (INIS)

    Romli

    1997-01-01

    Cluster analysis is the name of a group of multivariate techniques whose principal purpose is to distinguish similar entities by the characteristics they possess. Several algorithms can be used for this analysis, and this topic therefore focuses on those algorithms: similarity measures and hierarchical clustering, which includes the single linkage, complete linkage and average linkage methods. The non-hierarchical clustering method popularly known as the K-means method is also discussed. Finally, the paper describes the advantages and disadvantages of each method.
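
    A hedged sketch contrasting the algorithms listed in this note, single, complete and average linkage versus K-means, on toy two-dimensional data follows; the Euclidean distance measure and the cluster count are illustrative choices.

```python
# Hierarchical linkage methods versus K-means on toy data.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.5, (30, 2)), rng.normal(3, 0.5, (30, 2))])
d = pdist(X)  # pairwise Euclidean distances (the dissimilarity measure)

for method in ("single", "complete", "average"):
    labels = fcluster(linkage(d, method=method), t=2, criterion="maxclust")
    print(method, "linkage cluster sizes:", np.bincount(labels)[1:])

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print("k-means cluster sizes:", np.bincount(km.labels_))
```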

  2. Cluster analysis

    CERN Document Server

    Everitt, Brian S; Leese, Morven; Stahl, Daniel

    2011-01-01

    Cluster analysis comprises a range of methods for classifying multivariate data into subgroups. By organizing multivariate data into such subgroups, clustering can help reveal the characteristics of any structure or patterns present. These techniques have proven useful in a wide range of areas such as medicine, psychology, market research and bioinformatics. This fifth edition of the highly successful Cluster Analysis includes coverage of the latest developments in the field and a new chapter dealing with finite mixture models for structured data. Real life examples are used throughout to demons

  3. GProX, a User-Friendly Platform for Bioinformatics Analysis and Visualization of Quantitative Proteomics Data

    DEFF Research Database (Denmark)

    Rigbolt, Kristoffer T G; Vanselow, Jens T; Blagoev, Blagoy

    2011-01-01

    …-friendly platform for comprehensive analysis, inspection and visualization of quantitative proteomics data we developed the Graphical Proteomics Data Explorer (GProX)(1). The program requires no special bioinformatics training, as all functions of GProX are accessible within its graphical user-friendly interface … such as database querying, clustering based on abundance ratios, feature enrichment tests for e.g. GO terms and pathway analysis tools. A number of plotting options for visualization of quantitative proteomics data are available, and most analysis functions in GProX create customizable, high-quality graphical … displays in both vector and bitmap formats. The generic import requirements allow data originating from essentially all mass spectrometry platforms, quantitation strategies and software to be analyzed in the program. GProX represents a powerful approach to proteomics data analysis, providing proteomics…

  4. Internet of Things-Based Arduino Intelligent Monitoring and Cluster Analysis of Seasonal Variation in Physicochemical Parameters of Jungnangcheon, an Urban Stream

    Directory of Open Access Journals (Sweden)

    Byungwan Jo

    2017-03-01

    In the present case study, the use of an advanced, efficient and low-cost technique for monitoring an urban stream is reported. Physicochemical parameters (PcPs) of Jungnangcheon stream (Seoul, South Korea) were assessed using an Internet of Things (IoT) platform. Temperature, dissolved oxygen (DO), and pH were monitored for the three summer months and the first fall month at a fixed location. Analysis was performed using clustering techniques (CTs), such as K-means clustering, agglomerative hierarchical clustering (AHC), and density-based spatial clustering of applications with noise (DBSCAN). An IoT-based Arduino sensor module (ASM) network with a 99.99% efficient communication platform was developed to allow collection of stream data with user-friendly software and hardware and to facilitate data analysis by interested individuals using their smartphones. Clustering was used to formulate relationships among the physicochemical parameters. K-means clustering was used to identify natural clusters using the silhouette coefficient, based on cluster compactness and looseness. AHC grouped all data into two clusters, and temperature, DO and pH into four, eight, and four clusters, respectively. DBSCAN analysis was also performed to evaluate yearly variations in the physicochemical parameters. Noise points (NOISE) of temperature in 2016 were border points (ƥ), whereas in 2014 and 2015 they remained core points (ɋ), indicating a trend toward increasing stream temperature. We found the stream parameters were within the permissible limits set by the Water Quality Standards for River Water, South Korea.
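
    The clustering techniques named above can be sketched on hypothetical sensor readings as follows; the cluster counts, DBSCAN parameters and simulated temperature/DO/pH values are illustrative and are not taken from the study.

```python
# K-means (with silhouette), AHC and DBSCAN on synthetic stream readings.
import numpy as np
from sklearn.cluster import KMeans, AgglomerativeClustering, DBSCAN
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(2)
summer = np.column_stack([rng.normal(26, 2, 200), rng.normal(7, 1, 200), rng.normal(7.4, 0.3, 200)])
fall = np.column_stack([rng.normal(18, 2, 100), rng.normal(9, 1, 100), rng.normal(7.6, 0.3, 100)])
X = StandardScaler().fit_transform(np.vstack([summer, fall]))

# Choose k for K-means via the silhouette coefficient (compactness vs looseness).
for k in (2, 3, 4):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    print("k =", k, "silhouette =", round(silhouette_score(X, labels), 3))

ahc = AgglomerativeClustering(n_clusters=2).fit_predict(X)   # AHC
db = DBSCAN(eps=0.7, min_samples=10).fit_predict(X)          # label -1 marks noise points
print("AHC sizes:", np.bincount(ahc), "DBSCAN noise points:", int((db == -1).sum()))
```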

  5. Marketing research cluster analysis

    OpenAIRE

    Marić Nebojša

    2002-01-01

    One area of application of cluster analysis in marketing is the identification of groups of cities and towns with similar demographic profiles. This paper considers the main aspects of cluster analysis through an example of clustering 12 cities using Minitab software.

  6. Marketing research cluster analysis

    Directory of Open Access Journals (Sweden)

    Marić Nebojša

    2002-01-01

    One area of application of cluster analysis in marketing is the identification of groups of cities and towns with similar demographic profiles. This paper considers the main aspects of cluster analysis through an example of clustering 12 cities using Minitab software.

  7. VAAPA: a web platform for visualization and analysis of alternative polyadenylation.

    Science.gov (United States)

    Guan, Jinting; Fu, Jingyi; Wu, Mingcheng; Chen, Longteng; Ji, Guoli; Quinn Li, Qingshun; Wu, Xiaohui

    2015-02-01

    Polyadenylation [poly(A)] is an essential process during the maturation of most mRNAs in eukaryotes. Alternative polyadenylation (APA) as an important layer of gene expression regulation has been increasingly recognized in various species. Here, a web platform for visualization and analysis of alternative polyadenylation (VAAPA) was developed. This platform can visualize the distribution of poly(A) sites and poly(A) clusters of a gene or a section of a chromosome. It can also highlight genes with switched APA sites among different conditions. VAAPA is an easy-to-use web-based tool that provides functions of poly(A) site query, data uploading, downloading, and APA sites visualization. It was designed in a multi-tier architecture and developed based on Smart GWT (Google Web Toolkit) using Java as the development language. VAAPA will be a valuable addition to the community for the comprehensive study of APA, not only by making the high quality poly(A) site data more accessible, but also by providing users with numerous valuable functions for poly(A) site analysis and visualization. Copyright © 2014 Elsevier Ltd. All rights reserved.

  8. Comprehensive cluster analysis with Transitivity Clustering.

    Science.gov (United States)

    Wittkop, Tobias; Emig, Dorothea; Truss, Anke; Albrecht, Mario; Böcker, Sebastian; Baumbach, Jan

    2011-03-01

    Transitivity Clustering is a method for the partitioning of biological data into groups of similar objects, such as genes, for instance. It provides integrated access to various functions addressing each step of a typical cluster analysis. To facilitate this, Transitivity Clustering is accessible online and offers three user-friendly interfaces: a powerful stand-alone version, a web interface, and a collection of Cytoscape plug-ins. In this paper, we describe three major workflows: (i) protein (super)family detection with Cytoscape, (ii) protein homology detection with incomplete gold standards and (iii) clustering of gene expression data. This protocol guides the user through the most important features of Transitivity Clustering and takes ∼1 h to complete.

  9. Trading strategies in the overnight money market: Correlations and clustering on the e-MID trading platform

    Science.gov (United States)

    Fricke, Daniel

    2012-12-01

    We analyze the correlations in patterns of trading for members of the Italian interbank trading platform e-MID. The trading strategy of a particular member institution is defined as the sequence of (intra-) daily net trading volumes within a certain semester. Based on this definition, we show that there are significant and persistent bilateral correlations between institutions’ trading strategies. In most semesters we find two clusters, with positively (negatively) correlated trading strategies within (between) clusters. We show that the two clusters mostly contain continuous net buyers and net sellers of money, respectively, and that cluster memberships of individual banks are highly persistent. Additionally, we highlight some problems related to our definition of trading strategies. Our findings add further evidence that preferential lending relationships on the micro-level lead to community structure on the macro-level.

  10. [Cluster analysis in biomedical researches].

    Science.gov (United States)

    Akopov, A S; Moskovtsev, A A; Dolenko, S A; Savina, G D

    2013-01-01

    Cluster analysis is one of the most popular methods for the analysis of multi-parameter data. Cluster analysis reveals the internal structure of the data, grouping separate observations by their degree of similarity. The review provides definitions of the basic concepts of cluster analysis and discusses the most popular clustering algorithms: k-means, hierarchical algorithms, and Kohonen network algorithms. Examples of the use of these algorithms in biomedical research are given.

  11. Parallel application of plasma equilibrium fitting based on inhomogeneous platforms

    International Nuclear Information System (INIS)

    Liao Min; Zhang Jinhua; Chen Liaoyuan; Li Yongge; Pan Wei; Pan Li

    2008-01-01

    An online analysis and online display platform, EFIT, which is based on equilibrium fitting, is introduced in this paper. The application realizes large data transfers between inhomogeneous platforms through a communication mechanism designed around sockets. By using a finite state machine to coordinate the management node and several node computers of the cluster system that carry out the parallel computation, the equilibrium-fitting reconstruction is completed in approximately one minute, which satisfies the requirement of online display during the discharge interval. An effective communication model between inhomogeneous platforms is provided, which transports the computing results from the Linux platform to the Windows platform for online analysis and display. (authors)
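
    The socket-based transfer between inhomogeneous platforms can be sketched, in a hedged way, as a sender process (standing in for the Linux-side management node) pushing a small JSON result to a receiver process (standing in for the Windows-side display); host, port and payload fields are invented for illustration.

```python
# Minimal TCP transfer of fitting results between two processes.
import json
import socket
import threading

HOST, PORT = "127.0.0.1", 50007   # hypothetical endpoint of the display machine

def display_server(ready):
    """Receiver standing in for the Windows display platform."""
    with socket.create_server((HOST, PORT)) as srv:
        ready.set()                       # signal that the port is bound
        conn, _ = srv.accept()
        with conn:
            data = b""
            while chunk := conn.recv(4096):
                data += chunk
        print("received equilibrium result:", json.loads(data.decode()))

ready = threading.Event()
receiver = threading.Thread(target=display_server, args=(ready,))
receiver.start()
ready.wait()

# Sender standing in for the Linux cluster's management node.
result = {"shot": 1234, "elongation": 1.8, "q95": 3.2}  # hypothetical values
with socket.create_connection((HOST, PORT)) as s:
    s.sendall(json.dumps(result).encode())
receiver.join()
```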

  12. Platform computing

    CERN Multimedia

    2002-01-01

    "Platform Computing releases first grid-enabled workload management solution for IBM eServer Intel and UNIX high performance computing clusters. This Out-of-the-box solution maximizes the performance and capability of applications on IBM HPC clusters" (1/2 page) .

  13. ANALYSIS OF FACTORS INFLUENCING TRAVEL CONSUMER SATISFACTION AS REVEALED BY ONLINE COMMUNICATION PLATFORMS

    Directory of Open Access Journals (Sweden)

    Olimpia I. BAN

    2016-08-01

    The objective of the present empirical study is to determine the factors influencing travel consumer satisfaction as reflected in the evaluations posted on virtual platforms. The communication platform chosen for the study is the Romanian website Amfostacolo.ro. In this case, travel consumer satisfaction is expressed by the score of the ratings posted on Amfostacolo.ro and the decision whether or not to recommend the unit/destination. Considering the peculiarities of the communication platform studied, the elements influencing the satisfaction score can be identified as components of the tourism supply and the characteristics of the reviewer. Data processing was carried out with ordinary least squares (OLS), structural equation modeling (confirmatory factor analysis, path analysis), cluster analysis and polytomous logistic regression. The results broadly confirm the hypotheses, namely that: the type of stay and the age of the reviewer influence consumer satisfaction more than the destination and the star rating of the accommodation; the age group of the reviewer influences the choice of destination, although the influence of the holiday-related variables (the type of stay and the star rating of the accommodation) remains uncertain; the meal service influences consumer satisfaction more than other attributes; and the reviewer's recommendation is influenced by characteristics related to the reviewer and the holiday consumed.

  14. MiSeq: A Next Generation Sequencing Platform for Genomic Analysis.

    Science.gov (United States)

    Ravi, Rupesh Kanchi; Walton, Kendra; Khosroheidari, Mahdieh

    2018-01-01

    MiSeq, Illumina's integrated next generation sequencing instrument, uses reversible-terminator sequencing-by-synthesis technology to provide end-to-end sequencing solutions. The MiSeq instrument is one of the smallest benchtop sequencers that can perform onboard cluster generation, amplification, genomic DNA sequencing, and data analysis, including base calling, alignment and variant calling, in a single run. It performs both single- and paired-end runs with adjustable read lengths from 1 × 36 base pairs to 2 × 300 base pairs. A single run can produce output data of up to 15 Gb in as little as 4 h of runtime and can output up to 25 M single reads and 50 M paired-end reads. Thus, MiSeq provides an ideal platform for rapid turnaround time. MiSeq is also a cost-effective tool for various analyses focused on targeted gene sequencing (amplicon sequencing and target enrichment), metagenomics, and gene expression studies. For these reasons, MiSeq has become one of the most widely used next generation sequencing platforms. Here, we provide a protocol to prepare libraries for sequencing using the MiSeq instrument and basic guidelines for analysis of output data from the MiSeq sequencing run.

  15. 3D Viewer Platform of Cloud Clustering Management System: Google Map 3D

    Science.gov (United States)

    Choi, Sung-Ja; Lee, Gang-Soo

    A new management framework for cloud environments is needed as computing environments converge and change. ISVs and small businesses find it hard to adopt platform management systems offered by large vendors. This article proposes a clustering management system for cloud computing environments aimed at ISVs and enterprises following a small business model. It applies a 3D viewer adapted from Google Map 3D and Google Earth, and is called 3DV_CCMS, an extension of the CCMS [1].

  16. Cysteine as a ligand platform in the biosynthesis of the FeFe hydrogenase H cluster.

    Science.gov (United States)

    Suess, Daniel L M; Bürstel, Ingmar; De La Paz, Liliana; Kuchenreuther, Jon M; Pham, Cindy C; Cramer, Stephen P; Swartz, James R; Britt, R David

    2015-09-15

    Hydrogenases catalyze the redox interconversion of protons and H2, an important reaction for a number of metabolic processes and for solar fuel production. In FeFe hydrogenases, catalysis occurs at the H cluster, a metallocofactor comprising a [4Fe-4S]H subcluster coupled to a [2Fe]H subcluster bound by CO, CN(-), and azadithiolate ligands. The [2Fe]H subcluster is assembled by the maturases HydE, HydF, and HydG. HydG is a member of the radical S-adenosyl-L-methionine family of enzymes that transforms Fe and L-tyrosine into an [Fe(CO)2(CN)] synthon that is incorporated into the H cluster. Although it is thought that the site of synthon formation in HydG is the "dangler" Fe of a [5Fe] cluster, many mechanistic aspects of this chemistry remain unresolved including the full ligand set of the synthon, how the dangler Fe initially binds to HydG, and how the synthon is released at the end of the reaction. To address these questions, we herein show that L-cysteine (Cys) binds the auxiliary [4Fe-4S] cluster of HydG and further chelates the dangler Fe. We also demonstrate that a [4Fe-4S]aux[CN] species is generated during HydG catalysis, a process that entails the loss of Cys and the [Fe(CO)2(CN)] fragment; on this basis, we suggest that Cys likely completes the coordination sphere of the synthon. Thus, through spectroscopic analysis of HydG before and after the synthon is formed, we conclude that Cys serves as the ligand platform on which the synthon is built and plays a role in both Fe(2+) binding and synthon release.

  17. Shot loading platform analysis

    International Nuclear Information System (INIS)

    Norman, B.F.

    1994-01-01

    This document provides the wind/seismic analysis and evaluation for the shot loading platform. Hand calculations were used for the analysis. AISC and UBC load factors were used in this evaluation. The results show that the actual loads are under the allowable loads and all requirements are met.

  18. Changing cluster composition in cluster randomised controlled trials: design and analysis considerations

    Science.gov (United States)

    2014-01-01

    Background: There are many methodological challenges in the conduct and analysis of cluster randomised controlled trials, but one that has received little attention is that of post-randomisation changes to cluster composition. To illustrate this, we focus on the issue of cluster merging, considering the impact on the design, analysis and interpretation of trial outcomes. Methods: We explored the effects of merging clusters on study power using standard methods of power calculation. We assessed the potential impacts on study findings of both homogeneous cluster merges (involving clusters randomised to the same arm of a trial) and heterogeneous merges (involving clusters randomised to different arms of a trial) by simulation. To determine the impact on bias and precision of treatment effect estimates, we applied standard methods of analysis to different populations under analysis. Results: Cluster merging produced a systematic reduction in study power. This effect depended on the number of merges and was most pronounced when variability in cluster size was at its greatest. Simulations demonstrate that the impact on analysis was minimal when cluster merges were homogeneous, with impact on study power being balanced by a change in observed intracluster correlation coefficient (ICC). We found a decrease in study power when cluster merges were heterogeneous, and the estimate of treatment effect was attenuated. Conclusions: Examples of cluster merges found in previously published reports of cluster randomised trials were typically homogeneous rather than heterogeneous. Simulations demonstrated that trial findings in such cases would be unbiased. However, simulations also showed that any heterogeneous cluster merges would introduce bias that would be hard to quantify, as well as having negative impacts on the precision of estimates obtained. Further methodological development is warranted to better determine how to analyse such trials appropriately. Interim recommendations
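
    The power argument can be made concrete with the textbook design-effect approximation: merging clusters leaves the total sample size unchanged but increases the mean cluster size, which inflates the design effect and shrinks the effective sample size. The sketch below assumes equal cluster sizes and an illustrative ICC; it is not the authors' simulation setup.

```python
# Effective sample size under the design effect DE = 1 + (m - 1) * ICC.
def effective_n(n_clusters, cluster_size, icc):
    total = n_clusters * cluster_size
    design_effect = 1 + (cluster_size - 1) * icc
    return total / design_effect

icc = 0.05
print("before merges:", round(effective_n(20, 30, icc), 1))  # 20 clusters of 30
print("after merges: ", round(effective_n(15, 40, icc), 1))  # 15 clusters of 40, same total N
```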

  19. Integrative cluster analysis in bioinformatics

    CERN Document Server

    Abu-Jamous, Basel; Nandi, Asoke K

    2015-01-01

    Clustering techniques are increasingly being put to use in the analysis of high-throughput biological datasets. Novel computational techniques to analyse high throughput data in the form of sequences, gene and protein expressions, pathways, and images are becoming vital for understanding diseases and future drug discovery. This book details the complete pathway of cluster analysis, from the basics of molecular biology to the generation of biological knowledge. The book also presents the latest clustering methods and clustering validation, thereby offering the reader a comprehensive review o

  20. Analysis of scalability of high-performance 3D image processing platform for virtual colonoscopy.

    Science.gov (United States)

    Yoshida, Hiroyuki; Wu, Yin; Cai, Wenli

    2014-03-19

    One of the key challenges in three-dimensional (3D) medical imaging is to enable a fast turnaround time, which is often required for interactive or real-time response. This inevitably requires not only high computational power but also high memory bandwidth due to the massive amount of data that need to be processed. For this purpose, we previously developed a software platform for high-performance 3D medical image processing, called the HPC 3D-MIP platform, which employs increasingly available and affordable commodity computing systems such as multicore, cluster, and cloud computing systems. To achieve scalable high-performance computing, the platform employed size-adaptive, distributable block volumes as a core data structure for efficient parallelization of a wide range of 3D-MIP algorithms, supported task scheduling for efficient load distribution and balancing, and consisted of layered parallel software libraries that allow image processing applications to share common functionalities. We evaluated the performance of the HPC 3D-MIP platform by applying it to computationally intensive processes in virtual colonoscopy. Experimental results showed a 12-fold performance improvement on a workstation with 12-core CPUs over the original sequential implementation of the processes, indicating the efficiency of the platform. Analysis of performance scalability based on Amdahl's law for symmetric multicore chips showed the potential for high performance scalability of the HPC 3D-MIP platform when a larger number of cores is available.
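
    The scalability reasoning rests on Amdahl's law; a hedged sketch with an assumed parallel fraction is shown below. The paper uses a variant of the law for symmetric multicore chips, and its reported 12-fold improvement on 12 cores would correspond to a parallel fraction close to one under this simple model.

```python
# Amdahl's law: ideal speedup for a parallel fraction p on n cores.
def amdahl_speedup(p, n):
    """Speedup when a fraction p of the work parallelizes perfectly over n cores."""
    return 1.0 / ((1.0 - p) + p / n)

for cores in (1, 4, 12, 48):
    print(cores, "cores ->", round(amdahl_speedup(0.97, cores), 1), "x")  # illustrative p
```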

  1. Dielectric spectroscopy platform to measure MCF10A epithelial cell aggregation as a model for spheroidal cell cluster analysis.

    Science.gov (United States)

    Heileman, K L; Tabrizian, M

    2017-05-02

    3-Dimensional cell cultures are more representative of the native environment than traditional cell cultures on flat substrates. As a result, 3-dimensional cell cultures have emerged as a very valuable model environment to study tumorigenesis, organogenesis and tissue regeneration. Many of these models encompass the formation of cell aggregates, which mimic the architecture of tumor and organ tissue. Dielectric impedance spectroscopy is a non-invasive, label-free and real-time technique, overcoming the drawbacks of established techniques to monitor cell aggregates. Here we introduce a platform to monitor cell aggregation in a 3-dimensional extracellular matrix using dielectric spectroscopy. The MCF10A breast epithelial cell line serves as a model for cell aggregation. The platform maintains sterile conditions during the multi-day assay while allowing continuous dielectric spectroscopy measurements. The platform geometry optimizes dielectric measurements by concentrating cells within the electrode sensing region. The cells show a characteristic dielectric response to aggregation, which is corroborated by finite element analysis computer simulations. By fitting the experimental dielectric spectra to the Cole-Cole equation, we demonstrated that the dispersion intensity Δε and the characteristic frequency f_c are related to cell aggregate growth. In addition, microscopy can be performed directly on the platform, providing information about cell position, density and morphology. This platform could find many applications for studying the electrophysiological activity of cell aggregates.
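
    The Cole-Cole fit mentioned above can be sketched with a standard least-squares routine; the relaxation parameters, noise level and frequency range below are invented for illustration, and the platform's actual fitting procedure and units may differ.

```python
# Fit synthetic spectra to the Cole-Cole relation and report delta_eps and f_c.
import numpy as np
from scipy.optimize import curve_fit

def cole_cole(f, eps_inf, delta_eps, tau, alpha):
    """Real and imaginary parts of eps_inf + delta_eps / (1 + (j*2*pi*f*tau)**(1-alpha))."""
    eps = eps_inf + delta_eps / (1 + (1j * 2 * np.pi * f * tau) ** (1 - alpha))
    return np.concatenate([eps.real, eps.imag])

freqs = np.logspace(4, 8, 60)                       # 10 kHz - 100 MHz (illustrative)
true = dict(eps_inf=70.0, delta_eps=500.0, tau=1e-6, alpha=0.1)
rng = np.random.default_rng(5)
measured = cole_cole(freqs, **true) + rng.normal(0, 1.0, 2 * freqs.size)

popt, _ = curve_fit(cole_cole, freqs, measured,
                    p0=[60.0, 400.0, 5e-7, 0.2],
                    bounds=([1, 1, 1e-9, 0], [1e3, 1e4, 1e-3, 0.9]))
eps_inf, delta_eps, tau, alpha = popt
print("delta_eps =", round(delta_eps, 1), " f_c =", round(1 / (2 * np.pi * tau), 1), "Hz")
```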

  2. Cluster analysis of accelerated molecular dynamics simulations: A case study of the decahedron to icosahedron transition in Pt nanoparticles

    Science.gov (United States)

    Huang, Rao; Lo, Li-Ta; Wen, Yuhua; Voter, Arthur F.; Perez, Danny

    2017-10-01

    Modern molecular-dynamics-based techniques are extremely powerful to investigate the dynamical evolution of materials. With the increase in sophistication of the simulation techniques and the ubiquity of massively parallel computing platforms, atomistic simulations now generate very large amounts of data, which have to be carefully analyzed in order to reveal key features of the underlying trajectories, including the nature and characteristics of the relevant reaction pathways. We show that clustering algorithms, such as the Perron Cluster Cluster Analysis, can provide reduced representations that greatly facilitate the interpretation of complex trajectories. To illustrate this point, clustering tools are used to identify the key kinetic steps in complex accelerated molecular dynamics trajectories exhibiting shape fluctuations in Pt nanoclusters. This analysis provides an easily interpretable coarse representation of the reaction pathways in terms of a handful of clusters, in contrast to the raw trajectory that contains thousands of unique states and tens of thousands of transitions.

  3. MycoCAP - Mycobacterium Comparative Analysis Platform.

    Science.gov (United States)

    Choo, Siew Woh; Ang, Mia Yang; Dutta, Avirup; Tan, Shi Yang; Siow, Cheuk Chuen; Heydari, Hamed; Mutha, Naresh V R; Wee, Wei Yee; Wong, Guat Jah

    2015-12-15

    Mycobacterium spp. are renowned for being the causative agents of diseases such as leprosy, Buruli ulcer and tuberculosis in human beings. With more and more mycobacterial genomes being sequenced, any knowledge generated from comparative genomic analysis would provide better insights into the biology, evolution, phylogeny and pathogenicity of this genus, thus helping in better management of diseases caused by Mycobacterium spp. With this motivation, we constructed MycoCAP, a new comparative analysis platform dedicated to the important genus Mycobacterium. This platform currently provides information on 2108 genome sequences of at least 55 Mycobacterium spp. A number of intuitive web-based tools have been integrated in MycoCAP particularly for comparative analysis, including the PGC tool for comparison between two genomes, PathoProT for comparing the virulence genes among the Mycobacterium strains, the SuperClassification tool for the phylogenic classification of the Mycobacterium strains and a specialized classification system for strains of Mycobacterium abscessus. We hope the broad range of functions and easy-to-use tools provided in MycoCAP make it an invaluable analysis platform to speed up research discovery on mycobacteria. Database URL: http://mycobacterium.um.edu.my.

  4. Cluster analysis in phenotyping a Portuguese population.

    Science.gov (United States)

    Loureiro, C C; Sa-Couto, P; Todo-Bom, A; Bousquet, J

    2015-09-03

    Unbiased cluster analysis using clinical parameters has identified asthma phenotypes. Adding inflammatory biomarkers to this analysis provided a better insight into the disease mechanisms. This approach has not yet been applied to asthmatic Portuguese patients. To identify phenotypes of asthma using cluster analysis in a Portuguese asthmatic population treated in secondary medical care. Consecutive patients with asthma were recruited from the outpatient clinic. Patients were optimally treated according to GINA guidelines and enrolled in the study. Procedures were performed according to a standard evaluation of asthma. Phenotypes were identified by cluster analysis using Ward's clustering method. Of the 72 patients enrolled, 57 had full data and were included for cluster analysis. Distribution was set in 5 clusters described as follows: cluster (C) 1, early onset mild allergic asthma; C2, moderate allergic asthma, with long evolution, female prevalence and mixed inflammation; C3, allergic brittle asthma in young females with early disease onset and no evidence of inflammation; C4, severe asthma in obese females with late disease onset, highly symptomatic despite low Th2 inflammation; C5, severe asthma with chronic airflow obstruction, late disease onset and eosinophilic inflammation. In our study population, the identified clusters were mainly coincident with other larger-scale cluster analyses. Variables such as age at disease onset, obesity, lung function, FeNO (Th2 biomarker) and disease severity were important for cluster distinction. Copyright © 2015. Published by Elsevier España, S.L.U.

  5. Cell surface profiling using high-throughput flow cytometry: a platform for biomarker discovery and analysis of cellular heterogeneity.

    Directory of Open Access Journals (Sweden)

    Craig A Gedye

    Full Text Available Cell surface proteins have a wide range of biological functions, and are often used as lineage-specific markers. Antibodies that recognize cell surface antigens are widely used as research tools, diagnostic markers, and even therapeutic agents. The ability to obtain broad cell surface protein profiles would thus be of great value in a wide range of fields. There are, however, currently few available methods for high-throughput analysis of large numbers of cell surface proteins. We describe here a high-throughput flow cytometry (HT-FC) platform for rapid analysis of 363 cell surface antigens. Here we demonstrate that HT-FC provides reproducible results, and use the platform to identify cell surface antigens that are influenced by common cell preparation methods. We show that multiple populations within complex samples such as primary tumors can be simultaneously analyzed by co-staining of cells with lineage-specific antibodies, allowing unprecedented depth of analysis of heterogeneous cell populations. Furthermore, standard informatics methods can be used to visualize, cluster and downsample HT-FC data to reveal novel signatures and biomarkers. We show that the cell surface profile provides sufficient molecular information to classify samples from different cancers and tissue types into biologically relevant clusters using unsupervised hierarchical clustering. Finally, we describe the identification of a candidate lineage marker and its subsequent validation. In summary, HT-FC combines the advantages of a high-throughput screen with a detection method that is sensitive, quantitative, highly reproducible, and allows in-depth analysis of heterogeneous samples. The use of commercially available antibodies means that high quality reagents are immediately available for follow-up studies. HT-FC has a wide range of applications, including biomarker discovery, molecular classification of cancers, or identification of novel lineage specific or stem cell

  6. Cell surface profiling using high-throughput flow cytometry: a platform for biomarker discovery and analysis of cellular heterogeneity.

    Science.gov (United States)

    Gedye, Craig A; Hussain, Ali; Paterson, Joshua; Smrke, Alannah; Saini, Harleen; Sirskyj, Danylo; Pereira, Keira; Lobo, Nazleen; Stewart, Jocelyn; Go, Christopher; Ho, Jenny; Medrano, Mauricio; Hyatt, Elzbieta; Yuan, Julie; Lauriault, Stevan; Meyer, Mona; Kondratyev, Maria; van den Beucken, Twan; Jewett, Michael; Dirks, Peter; Guidos, Cynthia J; Danska, Jayne; Wang, Jean; Wouters, Bradly; Neel, Benjamin; Rottapel, Robert; Ailles, Laurie E

    2014-01-01

    Cell surface proteins have a wide range of biological functions, and are often used as lineage-specific markers. Antibodies that recognize cell surface antigens are widely used as research tools, diagnostic markers, and even therapeutic agents. The ability to obtain broad cell surface protein profiles would thus be of great value in a wide range of fields. There are however currently few available methods for high-throughput analysis of large numbers of cell surface proteins. We describe here a high-throughput flow cytometry (HT-FC) platform for rapid analysis of 363 cell surface antigens. Here we demonstrate that HT-FC provides reproducible results, and use the platform to identify cell surface antigens that are influenced by common cell preparation methods. We show that multiple populations within complex samples such as primary tumors can be simultaneously analyzed by co-staining of cells with lineage-specific antibodies, allowing unprecedented depth of analysis of heterogeneous cell populations. Furthermore, standard informatics methods can be used to visualize, cluster and downsample HT-FC data to reveal novel signatures and biomarkers. We show that the cell surface profile provides sufficient molecular information to classify samples from different cancers and tissue types into biologically relevant clusters using unsupervised hierarchical clustering. Finally, we describe the identification of a candidate lineage marker and its subsequent validation. In summary, HT-FC combines the advantages of a high-throughput screen with a detection method that is sensitive, quantitative, highly reproducible, and allows in-depth analysis of heterogeneous samples. The use of commercially available antibodies means that high quality reagents are immediately available for follow-up studies. HT-FC has a wide range of applications, including biomarker discovery, molecular classification of cancers, or identification of novel lineage specific or stem cell markers.
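
    As a hedged illustration of the classification step mentioned above (not code from the study), samples can be grouped by their antigen profiles with unsupervised hierarchical clustering:

        # Hypothetical sketch: cluster samples by their 363-antigen surface profile.
        import numpy as np
        from scipy.cluster.hierarchy import linkage, fcluster, dendrogram
        from scipy.spatial.distance import pdist

        def cluster_samples(profiles):
            # profiles: samples x antigens matrix of fluorescence intensities (assumed layout).
            X = np.log1p(profiles)                  # stabilize variance
            D = pdist(X, metric="correlation")      # 1 - Pearson correlation between samples
            Z = linkage(D, method="average")
            return Z, fcluster(Z, t=0.5, criterion="distance")

        # Passing Z to dendrogram(Z) shows how cancer and tissue types group together.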

  7. ZBIT Bioinformatics Toolbox: A Web-Platform for Systems Biology and Expression Data Analysis.

    Science.gov (United States)

    Römer, Michael; Eichner, Johannes; Dräger, Andreas; Wrzodek, Clemens; Wrzodek, Finja; Zell, Andreas

    2016-01-01

    Bioinformatics analysis has become an integral part of research in biology. However, installation and use of scientific software can be difficult and often require expert technical knowledge. Reasons are dependencies on certain operating systems or required third-party libraries, missing graphical user interfaces and documentation, or nonstandard input and output formats. In order to make bioinformatics software easily accessible to researchers, we here present a web-based platform. The Center for Bioinformatics Tuebingen (ZBIT) Bioinformatics Toolbox provides web-based access to a collection of bioinformatics tools developed for systems biology, protein sequence annotation, and expression data analysis. Currently, the collection encompasses software for conversion and processing of the community standards SBML and BioPAX, transcription factor analysis, and analysis of microarray data from transcriptomics and proteomics studies. All tools are hosted on a customized Galaxy instance and run on a dedicated computation cluster. Users only need a web browser and an active internet connection in order to benefit from this service. The web platform is designed to facilitate the usage of the bioinformatics tools for researchers without an advanced technical background. Users can combine tools for complex analyses or use predefined, customizable workflows. All results are stored persistently and are reproducible. For each tool, we provide documentation, tutorials, and example data to maximize usability. The ZBIT Bioinformatics Toolbox is freely available at https://webservices.cs.uni-tuebingen.de/.

  8. GeNNet: an integrated platform for unifying scientific workflows and graph databases for transcriptome data analysis

    Directory of Open Access Journals (Sweden)

    Raquel L. Costa

    2017-07-01

    Full Text Available There are many steps in analyzing transcriptome data, from the acquisition of raw data to the selection of a subset of representative genes that explain a scientific hypothesis. The data produced can be represented as networks of interactions among genes, and these may additionally be integrated with other biological databases, such as Protein-Protein Interactions, transcription factors and gene annotation. However, the results of these analyses remain fragmented, making it difficult either to inspect results later or to incorporate new related data into a meta-analysis. Integrating databases and tools into scientific workflows, orchestrating their execution, and managing the resulting data and its metadata are challenging tasks. A great deal of effort is also required to run in-silico experiments and to structure and compose the information as needed for analysis. Different programs may need to be applied and different files are produced during the experiment cycle. In this context, the availability of a platform supporting experiment execution is paramount. We present GeNNet, an integrated transcriptome analysis platform that unifies scientific workflows with graph databases for selecting relevant genes according to the biological systems under evaluation. It includes GeNNet-Wf, a scientific workflow that pre-loads biological data, pre-processes raw microarray data and conducts a series of analyses including normalization, differential expression inference, clustering and gene set enrichment analysis. A user-friendly web interface, GeNNet-Web, allows for setting parameters, executing, and visualizing the results of GeNNet-Wf executions. To demonstrate the features of GeNNet, we performed case studies with data retrieved from GEO, particularly using a single-factor experiment in different analysis scenarios. As a result, we obtained differentially expressed genes for which biological functions were

  9. TreeCluster: Massively scalable transmission clustering using phylogenetic trees

    OpenAIRE

    Moshiri, Alexander

    2018-01-01

    Background: The ability to infer transmission clusters from molecular data is critical to designing and evaluating viral control strategies. Viral sequencing datasets are growing rapidly, but standard methods of transmission cluster inference do not scale well beyond thousands of sequences. Results: I present TreeCluster, a cross-platform tool that performs transmission cluster inference on a given phylogenetic tree orders of magnitude faster than existing inference methods and supports multi...

  10. ASAP: a web-based platform for the analysis and interactive visualization of single-cell RNA-seq data.

    Science.gov (United States)

    Gardeux, Vincent; David, Fabrice P A; Shajkofci, Adrian; Schwalie, Petra C; Deplancke, Bart

    2017-10-01

    Single-cell RNA-sequencing (scRNA-seq) allows whole transcriptome profiling of thousands of individual cells, enabling the molecular exploration of tissues at the cellular level. Such analytical capacity is of great interest to many research groups in the world, yet these groups often lack the expertise to handle complex scRNA-seq datasets. We developed a fully integrated, web-based platform aimed at the complete analysis of scRNA-seq data post genome alignment: from the parsing, filtering and normalization of the input count data files, to the visual representation of the data, identification of cell clusters, differentially expressed genes (including cluster-specific marker genes), and functional gene set enrichment. This Automated Single-cell Analysis Pipeline (ASAP) combines a wide range of commonly used algorithms with sophisticated visualization tools. Compared with existing scRNA-seq analysis platforms, researchers (including those lacking computational expertise) are able to interact with the data in a straightforward fashion and in real time. Furthermore, given the overlap between scRNA-seq and bulk RNA-seq analysis workflows, ASAP should conceptually be broadly applicable to any RNA-seq dataset. As a validation, we demonstrate how we can use ASAP to simply reproduce the results from a single-cell study of 91 mouse cells involving five distinct cell types. The tool is freely available at asap.epfl.ch and R/Python scripts are available at github.com/DeplanckeLab/ASAP. bart.deplancke@epfl.ch. Supplementary data are available at Bioinformatics online. © The Author(s) 2017. Published by Oxford University Press.
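
    ASAP itself is a web platform; as a rough offline counterpart (an assumption for illustration, not the authors' implementation), a comparable workflow can be expressed with the scanpy library:

        # Hypothetical sketch of a post-alignment scRNA-seq workflow: filtering,
        # normalization, clustering, visualization and marker detection.
        import scanpy as sc

        adata = sc.read("counts.h5ad")                      # cells x genes count matrix (assumed file)
        sc.pp.filter_cells(adata, min_genes=200)            # basic quality filtering
        sc.pp.filter_genes(adata, min_cells=3)
        sc.pp.normalize_total(adata, target_sum=1e4)        # library-size normalization
        sc.pp.log1p(adata)
        sc.pp.highly_variable_genes(adata, n_top_genes=2000)
        adata = adata[:, adata.var.highly_variable].copy()
        sc.tl.pca(adata, n_comps=30)
        sc.pp.neighbors(adata, n_neighbors=15)
        sc.tl.leiden(adata)                                 # cell clusters
        sc.tl.umap(adata)                                   # 2D visualization
        sc.tl.rank_genes_groups(adata, "leiden")            # cluster-specific marker genes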

  11. Parallel Multivariate Spatio-Temporal Clustering of Large Ecological Datasets on Hybrid Supercomputers

    Energy Technology Data Exchange (ETDEWEB)

    Sreepathi, Sarat [ORNL]; Kumar, Jitendra [ORNL]; Mills, Richard T. [Argonne National Laboratory]; Hoffman, Forrest M. [ORNL]; Sripathi, Vamsi [Intel Corporation]; Hargrove, William Walter [United States Department of Agriculture (USDA), United States Forest Service (USFS)]

    2017-09-01

    A proliferation of data from vast networks of remote sensing platforms (satellites, unmanned aircraft systems (UAS), airborne etc.), observational facilities (meteorological, eddy covariance etc.), state-of-the-art sensors, and simulation models offers unprecedented opportunities for scientific discovery. Unsupervised classification is a widely applied data mining approach to derive insights from such data. However, classification of very large data sets is a complex computational problem that requires efficient numerical algorithms and implementations on high performance computing (HPC) platforms. Additionally, increasing power, space, cooling and efficiency requirements have led to the deployment of hybrid supercomputing platforms with complex architectures and memory hierarchies, like the Titan system at Oak Ridge National Laboratory. The advent of such accelerated computing architectures offers new challenges and opportunities for big data analytics in general and, specifically, for large-scale cluster analysis in our case. Although there is an existing body of work on parallel cluster analysis, those approaches do not fully meet the needs imposed by the nature and size of our large data sets. Moreover, they had scaling limitations and were mostly limited to traditional distributed memory computing platforms. We present a parallel Multivariate Spatio-Temporal Clustering (MSTC) technique based on k-means cluster analysis that can target hybrid supercomputers like Titan. We developed a hybrid MPI, CUDA and OpenACC implementation that can utilize both CPU and GPU resources on computational nodes. We describe performance results on Titan that demonstrate the scalability and efficacy of our approach in processing large ecological data sets.
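
    The record gives no code; the sketch below shows only the data-parallel core of one distributed k-means iteration with mpi4py, as a hedged illustration (the GPU offload via CUDA/OpenACC described above is omitted):

        # Hypothetical sketch: each MPI rank assigns its local observations, and the
        # centroid update is obtained by a global reduction across ranks.
        import numpy as np
        from mpi4py import MPI

        def kmeans_step(local_X, centroids, comm):
            k, d = centroids.shape
            # Assignment step: nearest centroid for each local observation.
            dists = np.linalg.norm(local_X[:, None, :] - centroids[None, :, :], axis=2)
            labels = dists.argmin(axis=1)
            # Partial sums and counts on this rank.
            sums = np.zeros((k, d))
            counts = np.zeros(k)
            for c in range(k):
                members = local_X[labels == c]
                sums[c] = members.sum(axis=0)
                counts[c] = len(members)
            # Global reduction yields the new centroids on every rank.
            gsums = np.empty_like(sums)
            gcounts = np.empty_like(counts)
            comm.Allreduce(sums, gsums, op=MPI.SUM)
            comm.Allreduce(counts, gcounts, op=MPI.SUM)
            return gsums / np.maximum(gcounts, 1.0)[:, None], labels

        # Usage (under mpirun): comm = MPI.COMM_WORLD; repeat kmeans_step until the
        # centroids stop moving.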

  12. Multiscale deep drawing analysis of dual-phase steels using grain cluster-based RGC scheme

    International Nuclear Information System (INIS)

    Tjahjanto, D D; Eisenlohr, P; Roters, F

    2015-01-01

    Multiscale modelling and simulation play an important role in sheet metal forming analysis, since the overall material responses at macroscopic engineering scales, e.g. formability and anisotropy, are strongly influenced by microstructural properties, such as grain size and crystal orientations (texture). In the present report, multiscale analysis on deep drawing of dual-phase steels is performed using an efficient grain cluster-based homogenization scheme. The homogenization scheme, called relaxed grain cluster (RGC), is based on a generalization of the grain cluster concept, where a (representative) volume element consists of p × q × r (hexahedral) grains. In this scheme, variation of the strain or deformation of individual grains is taken into account through the so-called interface relaxation, which is formulated within an energy minimization framework. An interfacial penalty term is introduced into the energy minimization framework in order to account for the effects of grain boundaries. The grain cluster-based homogenization scheme has been implemented and incorporated into the advanced material simulation platform DAMASK, which serves to bridge the macroscale boundary value problems associated with deep drawing analysis to the micromechanical constitutive law, e.g. a crystal plasticity model. Standard Lankford anisotropy tests are performed to validate the model parameters prior to the deep drawing analysis. Model predictions for the deep drawing simulations are analyzed and compared to the corresponding experimental data. The results show that the predictions of the model are in very good agreement with the experimental measurements. (paper)

  13. Analysis of offshore platforms lifting with fixed pile structure type (fixed platform) based on ASD89

    Science.gov (United States)

    Sugianto, Agus; Indriani, Andi Marini

    2017-11-01

    The GTS (Gathering Testing Satellite) platform is an offshore platform of the fixed pile structure type (fixed platform) that supports petroleum exploitation. After fabrication, the platform is moved onto barges and then shipped to the installation site. The moving process is generally done by pulling or pushing, based on the construction design determined during planning. When lifting equipment/cranes are available in the work area, however, the moving process can be done by lifting, so that the move can be completed more quickly. This analysis considers moving the GTS platform in a different way than is generally done for this platform type, namely by lifting; the question is whether structural reinforcement is required so that the construction can be moved by lifting. The working stresses arising during the lifting process are analyzed and checked against the AISC code standard using the SAP2000 structural analysis program. The analysis results showed that in the existing condition the platform cannot be moved by lifting, because the stress ratio exceeds the maximum allowable value of 0.950 (AISC-ASD89). Overstress occurs in members 295 and 324, with stress ratio values of 0.97 and 0.95, so structural reinforcement is required. Applying box plates at both members produces stress ratios of 0.78 at member 295 and 0.77 at member 324. These results indicate that, with this reinforcement, the construction qualifies to be moved by lifting.

  14. Targeted capture and heterologous expression of the Pseudoalteromonas alterochromide gene cluster in Escherichia coli represents a promising natural product exploratory platform.

    Science.gov (United States)

    Ross, Avena C; Gulland, Lauren E S; Dorrestein, Pieter C; Moore, Bradley S

    2015-04-17

    Marine pseudoalteromonads represent a very promising source of biologically important natural product molecules. To access and exploit the full chemical capacity of these cosmopolitan Gram-(-) bacteria, we sought to apply universal synthetic biology tools to capture, refactor, and express biosynthetic gene clusters for the production of complex organic compounds in reliable host organisms. Here, we report a platform for the capture of proteobacterial gene clusters using a transformation-associated recombination (TAR) strategy coupled with direct pathway manipulation and expression in Escherichia coli. The ~34 kb pathway for production of alterochromide lipopeptides by Pseudoalteromonas piscicida JCM 20779 was captured and heterologously expressed in E. coli utilizing native and E. coli-based T7 promoter sequences. Our approach enabled both facile production of the alterochromides and in vivo interrogation of gene function associated with alterochromide's unusual brominated lipid side chain. This platform represents a simple but effective strategy for the discovery and biosynthetic characterization of natural products from marine proteobacteria.

  15. Parallel single-cell analysis microfluidic platform

    NARCIS (Netherlands)

    van den Brink, Floris Teunis Gerardus; Gool, Elmar; Frimat, Jean-Philippe; Bomer, Johan G.; van den Berg, Albert; le Gac, Severine

    2011-01-01

    We report a PDMS microfluidic platform for parallel single-cell analysis (PaSCAl) as a powerful tool to decipher the heterogeneity found in cell populations. Cells are trapped individually in dedicated pockets, and thereafter, a number of invasive or non-invasive analysis schemes are performed.

  16. 2009 Analysis Platform Review Report

    Energy Technology Data Exchange (ETDEWEB)

    Ferrell, John [Office of Energy Efficiency and Renewable Energy (EERE), Washington, DC (United States)]

    2009-12-01

    This document summarizes the recommendations and evaluations provided by an independent external panel of experts at the U.S. Department of Energy Biomass Program’s Analysis platform review meeting, held on February 18, 2009, at the Marriott Residence Inn, National Harbor, Maryland.

  17. High repeatability from 3D experimental platform for quantitative analysis of cellular branch pattern formations.

    Science.gov (United States)

    Hagiwara, Masaya; Nobata, Rina; Kawahara, Tomohiro

    2018-04-24

    Three-dimensional (3D) cell and tissue cultures more closely mimic biological environments than two-dimensional (2D) cultures and are therefore highly desirable in culture experiments. However, 3D cultures often fail to yield repeatable experimental results because of variation in the initial culture conditions, such as cell density and distribution in the extracellular matrix, and therefore reducing such variation is a paramount concern. Here, we present a 3D culture platform that demonstrates highly repeatable experimental results, obtained by controlling the initial cell cluster shape in the gel cube culture device. A micro-mould with the desired shape was fabricated by photolithography or machining, creating a 3D pocket in the extracellular matrix contained in the device. Highly concentrated human bronchial epithelial cells were then injected in the pocket so that the cell cluster shape matched the fabricated mould shape. Subsequently, the cubic device supplied multi-directional scanning, enabling high-resolution capture of the whole tissue structure with only a low-magnification lens. The proposed device significantly improved the repeatability of the developed branch pattern, and multi-directional scanning enabled quantitative analysis of the developed branch pattern formations. A mathematical simulation was also conducted to reveal the mechanisms of branch pattern formation. The proposed platform offers the potential to accelerate any research field that conducts 3D culture experiments, including tissue regeneration and drug development.

  18. Facies analysis of an Upper Jurassic carbonate platform for geothermal reservoir characterization

    Science.gov (United States)

    von Hartmann, Hartwig; Buness, Hermann; Dussel, Michael

    2017-04-01

    The Upper Jurassic carbonate platform in Southern Germany is an important aquifer for the production of geothermal energy, and several successful projects have been realized in recent years. 3D seismic surveying has been established as a standard method for reservoir analysis and the definition of well paths. A project funded by the Federal Ministry for Economic Affairs and Energy (BMWi), started in 2015, is a milestone toward an exclusively renewable heat energy supply for Munich. A 3D seismic survey of 170 square kilometers was acquired and a scientific program was established to analyze the facies distribution within the area (http://www.liag-hannover.de/en/fsp/ge/geoparamol.html). The targets are primarily fault zones, where one expects higher flow rates than within the undisturbed carbonate sediments. However, since a dense network of geothermal plants and wells will not always find appropriate fault areas, the reservoir properties should be analyzed in more detail, e.g. by shifting the viewpoint to karst features and facies distribution. Current facies interpretation concepts are based on the alternation of massive and layered carbonates. Because of successive erosion of the ancient land surfaces, the interpretation of reefs, an important target, is often difficult. We found that seismic sequence stratigraphy can explain the distribution of seismic patterns and improves the analysis of different facies. We supported this method by applying wavelet transformation to the seismic data. By splitting the seismic signal into successive parts of different bandwidths, the frequency content of the signal, as changed by tuning or dispersion, is extracted. The combination of different frequencies reveals a partition of the platform both laterally and vertically. A cluster analysis of the wavelet coefficients further improves this picture. The interpretation shows a division into ramp, inner platform and trough, which were shifted locally and overprinted in time by other
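
    A minimal sketch of the "cluster analysis of wavelet coefficients" idea, with assumed trace layout, wavelet and scale choices (not the project's actual workflow):

        # Hypothetical sketch: decompose each seismic trace into frequency bands with a
        # continuous wavelet transform and cluster per-trace band energies.
        import numpy as np
        import pywt
        from sklearn.cluster import KMeans

        def facies_clusters(traces, scales=np.arange(1, 64), n_clusters=3):
            # traces: (n_traces, n_samples) amplitudes within the target interval.
            features = []
            for tr in traces:
                coefs, _ = pywt.cwt(tr, scales, "morl")      # scales x samples
                features.append(np.abs(coefs).mean(axis=1))  # mean energy per band
            return KMeans(n_clusters=n_clusters, n_init=10).fit_predict(np.asarray(features))

        # Three clusters could, for instance, separate ramp, inner platform and trough patterns.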

  19. Cluster analysis of track structure

    International Nuclear Information System (INIS)

    Michalik, V.

    1991-01-01

    One possibility for classifying track structures is to apply conventional partitioning techniques for the analysis of multidimensional data to the track structure. Using such cluster algorithms, this paper attempts to find characteristics of radiation that reflect the spatial distribution of ionizations in the primary particle track. An absolute frequency distribution of clusters of ionizations, giving the mean number of clusters produced by the radiation per unit of deposited energy, can serve as such a characteristic. The general computation techniques used, as well as the methods for calculating cluster distributions for different radiations, are discussed. 8 refs.; 5 figs
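
    As a hedged illustration of the characteristic described above (a sketch with assumed units and thresholds, not the paper's algorithm), ionization clusters can be counted with a density-based partitioning of the simulated track:

        # Hypothetical sketch: group ionization positions with DBSCAN and normalize the
        # cluster count by the deposited energy.
        import numpy as np
        from sklearn.cluster import DBSCAN

        def clusters_per_energy(ionization_xyz_nm, deposited_energy_keV, eps_nm=3.0):
            # ionization_xyz_nm: (n_ionizations, 3) coordinates along one primary track.
            labels = DBSCAN(eps=eps_nm, min_samples=2).fit_predict(ionization_xyz_nm)
            n_clusters = len(set(labels)) - (1 if -1 in labels else 0)  # drop the noise label
            return n_clusters / deposited_energy_keV

        # Averaging this ratio over many simulated tracks gives the mean number of
        # clusters produced per unit of deposited energy.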

  20. Are clusters of dietary patterns and cluster membership stable over time? Results of a longitudinal cluster analysis study.

    Science.gov (United States)

    Walthouwer, Michel Jean Louis; Oenema, Anke; Soetens, Katja; Lechner, Lilian; de Vries, Hein

    2014-11-01

    Developing nutrition education interventions based on clusters of dietary patterns can only be done adequately when it is clear if distinctive clusters of dietary patterns can be derived and reproduced over time, if cluster membership is stable, and if it is predictable which type of people belong to a certain cluster. Hence, this study aimed to: (1) identify clusters of dietary patterns among Dutch adults, (2) test the reproducibility of these clusters and stability of cluster membership over time, and (3) identify sociodemographic predictors of cluster membership and cluster transition. This study had a longitudinal design with online measurements at baseline (N=483) and 6 months follow-up (N=379). Dietary intake was assessed with a validated food frequency questionnaire. A hierarchical cluster analysis was performed, followed by a K-means cluster analysis. Multinomial logistic regression analyses were conducted to identify the sociodemographic predictors of cluster membership and cluster transition. At baseline and follow-up, a comparable three-cluster solution was derived, distinguishing a healthy, moderately healthy, and unhealthy dietary pattern. Male and lower educated participants were significantly more likely to have a less healthy dietary pattern. Further, 251 (66.2%) participants remained in the same cluster, 45 (11.9%) participants changed to an unhealthier cluster, and 83 (21.9%) participants shifted to a healthier cluster. Men and people living alone were significantly more likely to shift toward a less healthy dietary pattern. Distinctive clusters of dietary patterns can be derived. Yet, cluster membership is unstable and only few sociodemographic factors were associated with cluster membership and cluster transition. These findings imply that clusters based on dietary intake may not be suitable as a basis for nutrition education interventions. Copyright © 2014 Elsevier Ltd. All rights reserved.
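
    A minimal sketch of the two-stage clustering and the predictor analysis described above, with illustrative table layouts (not the study's data or code):

        # Hypothetical sketch: hierarchical clustering to seed K-means for the final
        # three-cluster solution, then multinomial logistic regression on
        # sociodemographic predictors of cluster membership.
        import numpy as np
        import pandas as pd
        from scipy.cluster.hierarchy import linkage, fcluster
        from sklearn.cluster import KMeans
        from sklearn.linear_model import LogisticRegression
        from sklearn.preprocessing import StandardScaler

        def dietary_clusters(intake: pd.DataFrame, k: int = 3):
            X = StandardScaler().fit_transform(intake.values)
            init = fcluster(linkage(X, method="ward"), t=k, criterion="maxclust")
            seeds = np.vstack([X[init == c].mean(axis=0) for c in range(1, k + 1)])
            return KMeans(n_clusters=k, init=seeds, n_init=1).fit_predict(X)

        def cluster_predictors(demographics: pd.DataFrame, labels):
            model = LogisticRegression(multi_class="multinomial", max_iter=1000)
            return model.fit(demographics.values, labels)

        # Cross-tabulating baseline and follow-up labels shows how stable membership is.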

  1. CytoCluster: A Cytoscape Plugin for Cluster Analysis and Visualization of Biological Networks.

    Science.gov (United States)

    Li, Min; Li, Dongyan; Tang, Yu; Wu, Fangxiang; Wang, Jianxin

    2017-08-31

    Nowadays, cluster analysis of biological networks has become one of the most important approaches to identifying functional modules as well as predicting protein complexes and network biomarkers. Furthermore, the visualization of clustering results is crucial to display the structure of biological networks. Here we present CytoCluster, a Cytoscape plugin integrating six clustering algorithms, HC-PIN (Hierarchical Clustering algorithm in Protein Interaction Networks), OH-PIN (identifying Overlapping and Hierarchical modules in Protein Interaction Networks), IPCA (Identifying Protein Complex Algorithm), ClusterONE (Clustering with Overlapping Neighborhood Expansion), DCU (Detecting Complexes based on Uncertain graph model) and IPC-MCE (Identifying Protein Complexes based on Maximal Complex Extension), together with the BinGO (Biological Networks Gene Ontology) function. Users can select different clustering algorithms according to their requirements. The main function of these six clustering algorithms is to detect protein complexes or functional modules. In addition, BinGO is used to determine which Gene Ontology (GO) categories are statistically overrepresented in a set of genes or a subgraph of a biological network. CytoCluster can be easily expanded, so that more clustering algorithms and functions can be added to this plugin. Since it was created in July 2013, CytoCluster has been downloaded more than 9700 times in the Cytoscape App store and has already been applied to the analysis of different biological networks. CytoCluster is available from http://apps.cytoscape.org/apps/cytocluster.

  2. MANNER OF STOCKS SORTING USING CLUSTER ANALYSIS METHODS

    Directory of Open Access Journals (Sweden)

    Jana Halčinová

    2014-06-01

    Full Text Available The aim of the present article is to show the possibility of using cluster analysis methods in the classification of stocks of finished products. Cluster analysis creates groups (clusters) of finished products according to similarity in demand, i.e. customer requirements for each product. The manner of sorting stocks of finished products into clusters is described with a practical example. The resulting clusters are incorporated into the draft layout of the distribution warehouse.

  3. A survey on platforms for big data analytics.

    Science.gov (United States)

    Singh, Dilpreet; Reddy, Chandan K

    The primary purpose of this paper is to provide an in-depth analysis of different platforms available for performing big data analytics. This paper surveys different hardware platforms available for big data analytics and assesses the advantages and drawbacks of each of these platforms based on various metrics such as scalability, data I/O rate, fault tolerance, real-time processing, data size supported and iterative task support. In addition to the hardware, a detailed description of the software frameworks used within each of these platforms is also discussed along with their strengths and drawbacks. Some of the critical characteristics described here can potentially aid the readers in making an informed decision about the right choice of platforms depending on their computational needs. Using a star ratings table, a rigorous qualitative comparison between different platforms is also discussed for each of the six characteristics that are critical for the algorithms of big data analytics. In order to provide more insights into the effectiveness of each of the platform in the context of big data analytics, specific implementation level details of the widely used k-means clustering algorithm on various platforms are also described in the form of pseudocode.
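
    For reference alongside the platform-specific pseudocode mentioned above, a plain single-node k-means in NumPy looks as follows (our illustration, not taken from the survey):

        # Hypothetical sketch: Lloyd's algorithm; on distributed platforms the
        # assignment step is embarrassingly parallel and the update step is a reduction.
        import numpy as np

        def kmeans(X, k, n_iter=100, seed=0):
            rng = np.random.default_rng(seed)
            centroids = X[rng.choice(len(X), size=k, replace=False)]
            for _ in range(n_iter):
                # Assignment step: nearest centroid per point.
                labels = np.linalg.norm(X[:, None] - centroids[None], axis=2).argmin(axis=1)
                # Update step: mean of assigned points (keep the old centroid if a cluster empties).
                new = np.array([X[labels == c].mean(axis=0) if np.any(labels == c)
                                else centroids[c] for c in range(k)])
                if np.allclose(new, centroids):
                    break
                centroids = new
            return centroids, labels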

  4. Exact WKB analysis and cluster algebras

    International Nuclear Information System (INIS)

    Iwaki, Kohei; Nakanishi, Tomoki

    2014-01-01

    We develop the mutation theory in the exact WKB analysis using the framework of cluster algebras. Under a continuous deformation of the potential of the Schrödinger equation on a compact Riemann surface, the Stokes graph may change the topology. We call this phenomenon the mutation of Stokes graphs. Along the mutation of Stokes graphs, the Voros symbols, which are monodromy data of the equation, also mutate due to the Stokes phenomenon. We show that the Voros symbols mutate as variables of a cluster algebra with surface realization. As an application, we obtain the identities of Stokes automorphisms associated with periods of cluster algebras. The paper also includes an extensive introduction of the exact WKB analysis and the surface realization of cluster algebras for nonexperts. This article is part of a special issue of Journal of Physics A: Mathematical and Theoretical devoted to ‘Cluster algebras in mathematical physics’. (paper)
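
    As background for nonexpert readers (a standard formula added here for orientation, not quoted from the record), mutation of a seed in direction k follows the Fomin-Zelevinsky exchange relation

        x_k x_k' = \prod_{i:\, b_{ik} > 0} x_i^{b_{ik}} + \prod_{i:\, b_{ik} < 0} x_i^{-b_{ik}},

    with the exchange matrix transforming as b'_{ij} = -b_{ij} if i = k or j = k, and b'_{ij} = b_{ij} + \tfrac{1}{2}\left(|b_{ik}| b_{kj} + b_{ik} |b_{kj}|\right) otherwise; the Stokes automorphisms discussed above act on the Voros symbols through relations of this type.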

  5. Study on integrated design and analysis platform of NPP

    International Nuclear Information System (INIS)

    Lu Dongsen; Gao Zuying; Zhou Zhiwei

    2001-01-01

    Many calculation codes have been developed for nuclear system design and safety analysis, such as structural design codes, fuel design and management codes, thermal-hydraulic analysis codes, severe accident simulation codes, etc. This study integrates these codes into a single platform and develops visual modeling tools for Retran and NGFM90. Within this platform, a distributed calculation method is also provided for coupled calculations between the different codes. The study will improve the design and analysis of NPPs.

  6. Comparing the performance of biomedical clustering methods

    DEFF Research Database (Denmark)

    Wiwie, Christian; Baumbach, Jan; Röttger, Richard

    2015-01-01

    Identifying groups of similar objects is a popular first step in biomedical data analysis, but it is error-prone and impossible to perform manually. Many computational methods have been developed to tackle this problem. Here we assessed 13 well-known methods using 24 data sets ranging from gene expression to protein domains. Performance was judged on the basis of 13 common cluster validity indices. We developed a clustering analysis platform, ClustEval (http://clusteval.mpi-inf.mpg.de), to promote streamlined evaluation, comparison and reproducibility of clustering results in the future. This allowed us to objectively evaluate the performance of all tools on all data sets with up to 1,000 different parameter sets each, resulting in a total of more than 4 million calculated cluster validity indices. We observed that there was no universal best performer, but on the basis of this wide
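
    A minimal sketch of the kind of comparison ClustEval automates, using one validity index and a few scikit-learn methods (illustrative parameters; ClustEval itself evaluates 13 indices over large parameter grids):

        # Hypothetical sketch: score several clustering methods on one data set with the
        # silhouette index.
        from sklearn.cluster import KMeans, AgglomerativeClustering, DBSCAN
        from sklearn.metrics import silhouette_score

        def compare_methods(X):
            candidates = {
                "kmeans_k3": KMeans(n_clusters=3, n_init=10),
                "kmeans_k5": KMeans(n_clusters=5, n_init=10),
                "ward_k3": AgglomerativeClustering(n_clusters=3, linkage="ward"),
                "dbscan": DBSCAN(eps=0.5, min_samples=5),
            }
            scores = {}
            for name, model in candidates.items():
                labels = model.fit_predict(X)
                if len(set(labels)) > 1:               # silhouette needs at least 2 clusters
                    scores[name] = silhouette_score(X, labels)
            return scores

        # Repeating this over many data sets reproduces the headline finding above:
        # no single method wins everywhere.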

  7. Spiking neural networks on high performance computer clusters

    Science.gov (United States)

    Chen, Chong; Taha, Tarek M.

    2011-09-01

    In this paper we examine the acceleration of two spiking neural network models on three clusters of multicore processors representing three categories of processors: x86, STI Cell, and NVIDIA GPGPUs. The x86 cluster utilized consists of 352 dual-core AMD Opterons, the Cell cluster consists of 320 Sony Playstation 3s, while the GPGPU cluster contains 32 NVIDIA Tesla S1070 systems. The results indicate that the GPGPU platform can dominate in performance compared to the Cell and x86 platforms examined. From a cost perspective, the GPGPU is more expensive in terms of neuron/s throughput. If the cost of GPGPUs goes down in the future, this platform will become very cost-effective for these models.

  8. From virtual clustering analysis to self-consistent clustering analysis: a mathematical study

    Science.gov (United States)

    Tang, Shaoqiang; Zhang, Lei; Liu, Wing Kam

    2018-03-01

    In this paper, we propose a new homogenization algorithm, virtual clustering analysis (VCA), as well as provide a mathematical framework for the recently proposed self-consistent clustering analysis (SCA) (Liu et al. in Comput Methods Appl Mech Eng 306:319-341, 2016). In the mathematical theory, we clarify the key assumptions and ideas of VCA and SCA, and derive the continuous and discrete Lippmann-Schwinger equations. Based on a key postulation of "once response similarly, always response similarly", clustering is performed in an offline stage by machine learning techniques (k-means and SOM), and facilitates substantial reduction of computational complexity in an online predictive stage. The clear mathematical setup allows for the first time a convergence study of clustering refinement in one space dimension. Convergence is proved rigorously, and found to be of second order from numerical investigations. Furthermore, we propose to suitably enlarge the domain in VCA, such that the boundary terms may be neglected in the Lippmann-Schwinger equation, by virtue of Saint-Venant's principle. These boundary terms were not obtained in the original SCA paper, and we find that they may well be responsible for the numerical dependence on the choice of reference material property. Since VCA enhances the accuracy by overcoming the modeling error and reduces the numerical cost by avoiding an outer-loop iteration for attaining material property consistency as in SCA, its efficiency is expected to be even higher than that of the recently proposed SCA algorithm.
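
    For orientation, the continuous Lippmann-Schwinger equation referred to above can be written, for a homogeneous reference stiffness C^0 and applied macroscopic strain \varepsilon^0, roughly as (a standard textbook form, not a formula quoted from this paper)

        \varepsilon(x) + \int_\Omega \Gamma^0(x, x') : \left[\sigma(x') - \mathbf{C}^0 : \varepsilon(x')\right] \, dx' = \varepsilon^0,

    where \Gamma^0 is the Green operator of the reference medium; the clustering step then approximates the strain and stress fields as piecewise constant over the clusters, which is what reduces the online computational cost.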

  9. High-Level Topology-Oblivious Optimization of MPI Broadcast Algorithms on Extreme-Scale Platforms

    KAUST Repository

    Hasanov, Khalid

    2014-01-01

    There has been significant research on collective communication operations, in particular MPI broadcast, on distributed-memory platforms. Most of this work optimizes the collective operations for particular architectures by taking into account either their topology or platform parameters. In this work we propose a very simple and at the same time general approach to optimize legacy MPI broadcast algorithms, which are widely used in MPICH and OpenMPI. Theoretical analysis and experimental results on an IBM BlueGene/P and a cluster of the Grid’5000 platform are presented.
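
    The record gives no implementation; below is a minimal sketch of the two-level idea with mpi4py, assuming for simplicity that the root rank is a group leader (our assumption, not the paper's code):

        # Hypothetical sketch: split the processes into groups, broadcast among group
        # leaders first, then inside each group, reusing the legacy broadcast twice.
        from mpi4py import MPI

        def two_level_bcast(buf, root, comm, group_size):
            rank = comm.Get_rank()
            group = comm.Split(color=rank // group_size, key=rank)   # intra-group communicator
            is_leader = group.Get_rank() == 0
            leaders = comm.Split(color=0 if is_leader else MPI.UNDEFINED, key=rank)
            # Step 1: broadcast among group leaders (assumes root % group_size == 0).
            if leaders != MPI.COMM_NULL:
                buf = leaders.bcast(buf, root=root // group_size)
            # Step 2: broadcast inside each group from its leader.
            return group.bcast(buf, root=0)

        # The group size plays the role of the tunable parameter in the approach above.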

  10. Benchmarking computer platforms for lattice QCD applications

    International Nuclear Information System (INIS)

    Hasenbusch, M.; Jansen, K.; Pleiter, D.; Stueben, H.; Wegner, P.; Wettig, T.; Wittig, H.

    2004-01-01

    We define a benchmark suite for lattice QCD and report on benchmark results from several computer platforms. The platforms considered are apeNEXT, CRAY T3E; Hitachi SR8000, IBM p690, PC-Clusters, and QCDOC

  11. Automics: an integrated platform for NMR-based metabonomics spectral processing and data analysis

    Directory of Open Access Journals (Sweden)

    Qu Lijia

    2009-03-01

    Full Text Available Abstract Background Spectral processing and post-experimental data analysis are the major tasks in NMR-based metabonomics studies. While there are commercial and free licensed software tools available to assist these tasks, researchers usually have to use multiple software packages for their studies because software packages generally focus on specific tasks. It would be beneficial to have a highly integrated platform, in which these tasks can be completed within one package. Moreover, with open source architecture, newly proposed algorithms or methods for spectral processing and data analysis can be implemented much more easily and accessed freely by the public. Results In this paper, we report an open source software tool, Automics, which is specifically designed for NMR-based metabonomics studies. Automics is a highly integrated platform that provides functions covering almost all the stages of NMR-based metabonomics studies. Automics provides high throughput automatic modules with most recently proposed algorithms and powerful manual modules for 1D NMR spectral processing. In addition to spectral processing functions, powerful features for data organization, data pre-processing, and data analysis have been implemented. Nine statistical methods can be applied to analyses including: feature selection (Fisher's criterion), data reduction (PCA, LDA, ULDA), unsupervised clustering (K-Mean) and supervised regression and classification (PLS/PLS-DA, KNN, SIMCA, SVM). Moreover, Automics has a user-friendly graphical interface for visualizing NMR spectra and data analysis results. The functional ability of Automics is demonstrated with an analysis of a type 2 diabetes metabolic profile. Conclusion Automics facilitates high throughput 1D NMR spectral processing and high dimensional data analysis for NMR-based metabonomics applications. Using Automics, users can complete spectral processing and data analysis within one software package in most cases

  12. Automics: an integrated platform for NMR-based metabonomics spectral processing and data analysis.

    Science.gov (United States)

    Wang, Tao; Shao, Kang; Chu, Qinying; Ren, Yanfei; Mu, Yiming; Qu, Lijia; He, Jie; Jin, Changwen; Xia, Bin

    2009-03-16

    Spectral processing and post-experimental data analysis are the major tasks in NMR-based metabonomics studies. While there are commercial and free licensed software tools available to assist these tasks, researchers usually have to use multiple software packages for their studies because software packages generally focus on specific tasks. It would be beneficial to have a highly integrated platform, in which these tasks can be completed within one package. Moreover, with open source architecture, newly proposed algorithms or methods for spectral processing and data analysis can be implemented much more easily and accessed freely by the public. In this paper, we report an open source software tool, Automics, which is specifically designed for NMR-based metabonomics studies. Automics is a highly integrated platform that provides functions covering almost all the stages of NMR-based metabonomics studies. Automics provides high throughput automatic modules with most recently proposed algorithms and powerful manual modules for 1D NMR spectral processing. In addition to spectral processing functions, powerful features for data organization, data pre-processing, and data analysis have been implemented. Nine statistical methods can be applied to analyses including: feature selection (Fisher's criterion), data reduction (PCA, LDA, ULDA), unsupervised clustering (K-Mean) and supervised regression and classification (PLS/PLS-DA, KNN, SIMCA, SVM). Moreover, Automics has a user-friendly graphical interface for visualizing NMR spectra and data analysis results. The functional ability of Automics is demonstrated with an analysis of a type 2 diabetes metabolic profile. Automics facilitates high throughput 1D NMR spectral processing and high dimensional data analysis for NMR-based metabonomics applications. Using Automics, users can complete spectral processing and data analysis within one software package in most cases. Moreover, with its open source architecture, interested
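
    As a hedged illustration of two of the statistical methods listed above applied to binned 1D NMR spectra (a sketch with assumed input layout, not Automics code):

        # Hypothetical sketch: unsupervised PCA scores and a simple PLS-DA built from
        # PLS regression on a one-hot class matrix.
        import numpy as np
        from sklearn.decomposition import PCA
        from sklearn.cross_decomposition import PLSRegression
        from sklearn.preprocessing import StandardScaler

        def pca_scores(bins, n_components=2):
            # bins: samples x spectral-bin intensity matrix after normalization.
            X = StandardScaler().fit_transform(bins)
            return PCA(n_components=n_components).fit_transform(X)

        def plsda_scores(bins, classes, n_components=2):
            # classes: integer-coded group labels (e.g. 0 = control, 1 = type 2 diabetes).
            X = StandardScaler().fit_transform(bins)
            Y = np.eye(len(set(classes)))[np.asarray(classes)]   # one-hot dummy matrix
            return PLSRegression(n_components=n_components).fit(X, Y).x_scores_

        # Scatter the first two columns of either result, colored by group, to obtain
        # the usual score plots.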

  13. Benchmarking computer platforms for lattice QCD applications

    International Nuclear Information System (INIS)

    Hasenbusch, M.; Jansen, K.; Pleiter, D.; Wegner, P.; Wettig, T.

    2003-09-01

    We define a benchmark suite for lattice QCD and report on benchmark results from several computer platforms. The platforms considered are apeNEXT, CRAY T3E, Hitachi SR8000, IBM p690, PC-Clusters, and QCDOC. (orig.)

  14. A multimarker qPCR platform for the characterisation of endometrial cancer.

    Science.gov (United States)

    Supernat, Anna; Łapińska-Szumczyk, Sylwia; Majewska, Hanna; Gulczyński, Jacek; Biernat, Wojciech; Wydra, Dariusz; Zaczek, Anna J

    2014-02-01

    The molecular background of endometrial cancer (EC) has not been fully elucidated. In the present study, we developed a quantitative PCR (qPCR) platform to examine the gene dosages of the potential molecular markers MGB1, TOP2A, ERBB1-4, MYC, CCND1, ESR1 and PI3K. The platform was applied in samples collected from 157 EC patients (stage I-IV) to verify its clinical utility and to examine the diagnostic and prognostic significance of the analysed biomarkers. The gene dosage pattern of the ERBB family and its downstream effectors PI3K and MYC showed particularly strong correlations with clinicopathological data. The ERBB PI3K/Akt pathway was upregulated in 31 (20%) of 156 cases. Activation of the ERBB PI3K/Akt pathway was positively correlated with a higher stage (p=0.001), higher grade (p=0.001), histological type II disease (p=0.0003) and metastases (p=0.02). The implemented hierarchical clustering revealed that cluster 2 was characterised by high copy numbers of the studied genes. Cluster 2 was associated with shorter overall survival (p=0.05). The platform was found to be a fast and simple method for direct analysis of the genes involved in uterine carcinogenesis, making it feasible for EC biology characterisation.

  15. GProX, a user-friendly platform for bioinformatics analysis and visualization of quantitative proteomics data.

    Science.gov (United States)

    Rigbolt, Kristoffer T G; Vanselow, Jens T; Blagoev, Blagoy

    2011-08-01

    Recent technological advances have made it possible to identify and quantify thousands of proteins in a single proteomics experiment. As a result of these developments, the analysis of data has become the bottleneck of proteomics experiments. To provide the proteomics community with a user-friendly platform for comprehensive analysis, inspection and visualization of quantitative proteomics data, we developed the Graphical Proteomics Data Explorer (GProX). The program requires no special bioinformatics training, as all functions of GProX are accessible within its user-friendly graphical interface, which will be intuitive to most users. Basic features facilitate the uncomplicated management and organization of large data sets and complex experimental setups as well as the inspection and graphical plotting of quantitative data. These are complemented by readily available high-level analysis options such as database querying, clustering based on abundance ratios, feature enrichment tests for e.g. GO terms and pathway analysis tools. A number of plotting options for visualization of quantitative proteomics data are available and most analysis functions in GProX create customizable high-quality graphical displays in both vector and bitmap formats. The generic import requirements allow data originating from essentially all mass spectrometry platforms, quantitation strategies and software to be analyzed in the program. GProX represents a powerful approach to proteomics data analysis, providing proteomics experimenters with a toolbox for bioinformatics analysis of quantitative proteomics data. The program is released as open-source and can be freely downloaded from the project webpage at http://gprox.sourceforge.net.

  16. IoT Platforms: Analysis for Building Projects

    Directory of Open Access Journals (Sweden)

    Rusu Liviu DUMITRU

    2017-01-01

    Full Text Available This paper presents a general survey of IoT platforms in terms of features for IoT project developers. I will briefly summarize the state of knowledge regarding “Internet of Things” technology: the first steps in developing this technology, history, trends, and the sensors and microcontrollers used. I have evaluated five IoT platforms in terms of the features needed to develop an IoT project. I have listed those components that are most appreciated by IoT project developers, and the results have been highlighted in a comparative analysis of these platforms from the point of view of IoT project developers, focusing on what is strictly necessary as a development environment for an IoT project. I’ve also considered users' views of such platforms in terms of functionality, advantages, disadvantages and dangers presented by this technology.

  17. A WEB-BASED SOLUTION TO VISUALIZE OPERATIONAL MONITORING LINUX CLUSTER FOR THE PROTODUNE DATA QUALITY MONITORING CLUSTER

    CERN Document Server

    Mosesane, Badisa

    2017-01-01

    The Neutrino computing cluster, made up of 300 Dell PowerEdge 1950 U1 nodes, plays an integral role in the CERN Neutrino Platform (CENF). It represents an effort to foster fundamental research in the field of neutrino physics by providing a data processing facility. The need for data quality monitoring, coupled with automated system configuration and remote monitoring of the cluster, cannot be overemphasized. To achieve this, a software stack has been chosen to implement automatic propagation of configurations across all the nodes in the cluster. The bulk of this report discusses the automated configuration management system on this cluster, which enables fast online data processing and the Data Quality Monitoring (DQM) process for the Neutrino Platform cluster (npcmp.cern.ch).

  18. Robust cluster analysis and variable selection

    CERN Document Server

    Ritter, Gunter

    2014-01-01

    Clustering remains a vibrant area of research in statistics. Although there are many books on this topic, there are relatively few that are well founded in the theoretical aspects. In Robust Cluster Analysis and Variable Selection, Gunter Ritter presents an overview of the theory and applications of probabilistic clustering and variable selection, synthesizing the key research results of the last 50 years. The author focuses on the robust clustering methods he found to be the most useful on simulated data and real-time applications. The book provides clear guidance for the varying needs of bot

  19. Design of a Golf Swing Injury Detection and Evaluation open service platform with Ontology-oriented clustering case-based reasoning mechanism.

    Science.gov (United States)

    Ku, Hao-Hsiang

    2015-01-01

    Nowadays, people can easily use a smartphone to get the information and services they want. Hence, this study designs and proposes a Golf Swing Injury Detection and Evaluation open service platform with an ontology-oriented clustering case-based reasoning mechanism, called GoSIDE, based on Arduino and the Open Service Gateway initiative (OSGi). GoSIDE is a three-tier architecture composed of Mobile Users, Application Servers and a Cloud-based Digital Convergence Server. A mobile user has a smartphone and Kinect sensors to detect the user's golf swing actions and to interact with iDTV. An application server runs the Intelligent Golf Swing Posture Analysis Model (iGoSPAM) to check a user's golf swing actions and to alert the user when his actions are erroneous. The Cloud-based Digital Convergence Server provides Ontology-oriented Clustering Case-based Reasoning (CBR) for Quality of Experience (OCC4QoE), which is designed to provide QoE services through QoE-based ontology strategies, rules and events for this user. Furthermore, GoSIDE will automatically trigger OCC4QoE and deliver popular rules for a new user. Experimental results illustrate that GoSIDE can provide appropriate detection for golfers. Finally, GoSIDE can serve as a reference model for researchers and engineers.

  20. TimesVector: a vectorized clustering approach to the analysis of time series transcriptome data from multiple phenotypes.

    Science.gov (United States)

    Jung, Inuk; Jo, Kyuri; Kang, Hyejin; Ahn, Hongryul; Yu, Youngjae; Kim, Sun

    2017-12-01

    Identifying biologically meaningful gene expression patterns from time series gene expression data is important to understand the underlying biological mechanisms. To identify significantly perturbed gene sets between different phenotypes, analysis of time series transcriptome data requires consideration of time and sample dimensions. Thus, the analysis of such time series data seeks to search gene sets that exhibit similar or different expression patterns between two or more sample conditions, constituting the three-dimensional data, i.e. gene-time-condition. Computational complexity for analyzing such data is very high, compared to the already difficult NP-hard two dimensional biclustering algorithms. Because of this challenge, traditional time series clustering algorithms are designed to capture co-expressed genes with similar expression pattern in two sample conditions. We present a triclustering algorithm, TimesVector, specifically designed for clustering three-dimensional time series data to capture distinctively similar or different gene expression patterns between two or more sample conditions. TimesVector identifies clusters with distinctive expression patterns in three steps: (i) dimension reduction and clustering of time-condition concatenated vectors, (ii) post-processing clusters for detecting similar and distinct expression patterns and (iii) rescuing genes from unclassified clusters. Using four sets of time series gene expression data, generated by both microarray and high throughput sequencing platforms, we demonstrated that TimesVector successfully detected biologically meaningful clusters of high quality. TimesVector improved the clustering quality compared to existing triclustering tools and only TimesVector detected clusters with differential expression patterns across conditions successfully. The TimesVector software is available at http://biohealth.snu.ac.kr/software/TimesVector/. sunkim.bioinfo@snu.ac.kr. Supplementary data are available at

  1. Cross-platform validation and analysis environment for particle physics

    Science.gov (United States)

    Chekanov, S. V.; Pogrebnyak, I.; Wilbern, D.

    2017-11-01

    A multi-platform validation and analysis framework for public Monte Carlo simulation for high-energy particle collisions is discussed. The front-end of this framework uses the Python programming language, while the back-end is written in Java, which provides a multi-platform environment that can be run from a web browser and can easily be deployed at the grid sites. The analysis package includes all major software tools used in high-energy physics, such as Lorentz vectors, jet algorithms, histogram packages, graphic canvases, and tools for providing data access. This multi-platform software suite, designed to minimize OS-specific maintenance and deployment time, is used for online validation of Monte Carlo event samples through a web interface.

  2. AZTLAN platform: Mexican platform for analysis and design of nuclear reactors

    International Nuclear Information System (INIS)

    Gomez T, A. M.; Puente E, F.; Del Valle G, E.; Francois L, J. L.; Martin del Campo M, C.; Espinosa P, G.

    2014-10-01

    The AZTLAN platform project is a national initiative led by the Instituto Nacional de Investigaciones Nucleares (ININ) which brings together the main public institutions of higher education in Mexico, such as the Instituto Politecnico Nacional, the Universidad Nacional Autonoma de Mexico and the Universidad Autonoma Metropolitana, in an effort to take a significant step toward autonomy in calculation and analysis, seeking to place Mexico in the medium term at a competitive international level in software for nuclear reactor analysis. The project aims to modernize, improve and integrate the neutronic, thermal-hydraulic and thermo-mechanical codes developed in Mexican institutions within an integrated platform, developed and maintained by Mexican experts for the benefit of those same institutions. The project is financed by the SENER-CONACYT mixed fund for Energy Sustainability and aims to substantially strengthen research and educational institutions, contributing to the formation of highly qualified human resources in the area of nuclear reactor analysis and design. As an innovative part, the project includes the creation of a user group made up of members of the project institutions as well as the Comision Nacional de Seguridad Nuclear y Salvaguardias, the Central Nucleoelectrica de Laguna Verde (CNLV), the Secretaria de Energia (Mexico) and the Karlsruhe Institute of Technology (Germany), among others. This user group will be responsible for using the software and providing feedback to the development teams so that progress meets the needs of the regulator and industry, in this case the CNLV. Finally, in order to bridge the gap with similar developments globally, the project will make use of the latest supercomputing technology to speed up calculation times. This work intends to present the project to the national nuclear community, so a description of the proposed methodology is given, as well as the goals and objectives to be pursued for the development of the

  3. Cluster analysis

    OpenAIRE

    Mucha, Hans-Joachim; Sofyan, Hizir

    2000-01-01

    As an explorative technique, cluster analysis provides a description or a reduction in the dimension of the data. It classifies a set of observations into two or more mutually exclusive unknown groups based on combinations of many variables. Its aim is to construct groups in such a way that the profiles of objects in the same group are relatively homogeneous whereas the profiles of objects in different groups are relatively heterogeneous. Clustering is distinct from classification techniques, ...

  4. Envri Cluster - a Community-Driven Platform of European Environmental Researcher Infrastructures for Providing Common E-Solutions for Earth Science

    Science.gov (United States)

    Asmi, A.; Sorvari, S.; Kutsch, W. L.; Laj, P.

    2017-12-01

    European long-term environmental research infrastructures (often referred to as ESFRI RIs) are the core facilities providing services for scientists in their quest to understand and predict the complex Earth system and its functioning, which requires long-term efforts to identify environmental changes (trends, thresholds and resilience, interactions and feedbacks). Many of the research infrastructures were originally developed to respond to the needs of their specific research communities; however, it is clear that strong collaboration among research infrastructures is needed to serve trans-boundary research, which requires exploring scientific questions at the intersection of different scientific fields, conducting joint research projects and developing concepts, devices and methods that can be used to integrate knowledge. European environmental research infrastructures have already worked together successfully for many years and have established a cluster - the ENVRI cluster - for their collaborative work. The ENVRI cluster acts as a collaborative platform where the RIs can jointly agree on common solutions for their operations, draft strategies and policies, and share best practices and knowledge. The supporting project for the ENVRI cluster, the ENVRIplus project, brings together 21 European research infrastructures and infrastructure networks to work on joint technical solutions, data interoperability, access management, training, strategies and dissemination efforts. The ENVRI cluster acts as a one-stop shop for multidisciplinary RI users, other collaborative initiatives, projects and programmes, and coordinates and implements jointly agreed RI strategies.

  5. Analysis of motion of the three wheeled mobile platform

    Directory of Open Access Journals (Sweden)

    Jaskot Anna

    2018-01-01

    Full Text Available The work is dedicated to designing the motion of a three-wheeled mobile platform under unsteady conditions. This paper includes the results of an analysis based on the dynamics model of a three-wheeled mobile robot with two rear wheels and one front wheel. The prototype has been developed from the author's design assumptions, which allow the platform to realize motion in various configurations of wheel drives, including control of the active forces and the direction of their settings while driving. Friction forces in the longitudinal and transverse directions are considered in the proposed model, as is the relation between friction and active forces. The motion parameters of the mobile platform have been determined by adopting the classical approach of mechanics. The formulated initial problem of platform motion has been solved numerically using the fourth-order Runge-Kutta method. The motion parameter values are determined and sample results of the motion analysis are presented.

  6. 2011 Biomass Program Platform Peer Review: Analysis

    Energy Technology Data Exchange (ETDEWEB)

    Haq, Zia [Office of Energy Efficiency and Renewable Energy (EERE), Washington, DC (United States)

    2012-02-01

    This document summarizes the recommendations and evaluations provided by an independent external panel of experts at the 2011 U.S. Department of Energy Biomass Program’s Analysis Platform Review meeting.

  7. Arc4nix: A cross-platform geospatial analytical library for cluster and cloud computing

    Science.gov (United States)

    Tang, Jingyin; Matyas, Corene J.

    2018-02-01

    Big Data in geospatial technology is a grand challenge for processing capacity. The ability to use a GIS for geospatial analysis on Cloud Computing and High Performance Computing (HPC) clusters has emerged as a new approach to provide feasible solutions. However, users lack the ability to migrate existing research tools to a Cloud Computing or HPC-based environment because of the incompatibility of the market-dominating ArcGIS software stack and the Linux operating system. This manuscript details a cross-platform geospatial library "arc4nix" to bridge this gap. Arc4nix provides an application programming interface compatible with ArcGIS and its Python library "arcpy". Arc4nix uses a decoupled client-server architecture that permits geospatial analytical functions to run on the remote server and other functions to run in the native Python environment. It uses functional programming and metaprogramming to dynamically construct Python code containing the actual geospatial calculations, send it to a server and retrieve the results. Arc4nix allows users to employ their arcpy-based scripts in a Cloud Computing and HPC environment with minimal or no modification. It also supports parallelizing tasks using multiple CPU cores and nodes for large-scale analyses. A case study of geospatial processing of a numerical weather model's output shows that arcpy scales linearly in a distributed environment. Arc4nix is open-source software.
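    The decoupled client-server idea described above can be illustrated with a minimal, hypothetical sketch: the client dynamically builds a small Python script containing the requested geoprocessing call, submits it to a remote interpreter, and collects the output. This illustrates the pattern only and is not the actual arc4nix or arcpy API; the tool name, parameters and file paths are placeholders.

```python
# Minimal sketch of a decoupled client-server geoprocessing call
# (hypothetical illustration only; this is NOT the arc4nix or arcpy API).
import subprocess
import sys
import textwrap

def build_remote_script(tool_name, **params):
    """Dynamically construct the Python code that would run on the server.

    In a real deployment the generated script would invoke the licensed
    geoprocessing library on the remote host; here it only prints what it
    would do, so the sketch stays runnable anywhere.
    """
    args = ", ".join(f"{k}={v!r}" for k, v in params.items())
    return textwrap.dedent(f"""
        # --- generated on the client, executed on the server ---
        print("running {tool_name}({args})")
        print("result: /tmp/output.tif")  # placeholder result path
    """)

def submit(script):
    """Send the generated script to the 'server' and collect its output.

    A real client would use SSH or a job scheduler; a local subprocess
    stands in for the remote interpreter here.
    """
    proc = subprocess.run([sys.executable, "-c", script],
                          capture_output=True, text=True, check=True)
    return proc.stdout.strip()

if __name__ == "__main__":
    code = build_remote_script("FocalStatistics",
                               in_raster="precip.tif", neighborhood=3)
    print(submit(code))
```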

  8. End-to-end integrated security and performance analysis on the DEGAS Choreographer platform

    DEFF Research Database (Denmark)

    Buchholtz, Mikael; Gilmore, Stephen; Haenel, Valentin

    2005-01-01

    We present a software tool platform which facilitates security and performance analysis of systems which starts and ends with UML model descriptions. A UML project is presented to the platform for analysis, formal content is extracted in the form of process calculi descriptions, and analysed with the...

  9. New Structural Representation and Digital-Analysis Platform for Symmetrical Parallel Mechanisms

    Directory of Open Access Journals (Sweden)

    Wenao Cao

    2013-05-01

    Full Text Available Abstract An automatic design platform capable of automatic structural analysis, structural synthesis and the application of parallel mechanisms will be a great aid in the conceptual design of mechanisms, though up to now such a platform has only existed as an idea. The work in this paper constitutes part of such a platform. Based on screw theory and a new structural representation method proposed here which builds a one-to-one correspondence between the strings of representative characters and the kinematic structures of symmetrical parallel mechanisms (SPMs), this paper develops a fully-automatic approach for mobility (degree-of-freedom) analysis, and further establishes an automatic digital-analysis platform for SPMs. With this platform, users simply have to enter the strings of representative characters, and the kinematic structures of the SPMs will be generated and displayed automatically, and the mobility and its properties will also be analysed and displayed automatically. Typical examples are provided to show the effectiveness of the approach.

  10. Novel Biochip Platform for Nucleic Acid Analysis

    Directory of Open Access Journals (Sweden)

    Juan J. Diaz-Mochon

    2012-06-01

    Full Text Available This manuscript describes the use of a novel biochip platform for the rapid analysis/identification of nucleic acids, including DNA and microRNAs, with very high specificity. This approach combines a unique dynamic chemistry approach for nucleic acid testing and analysis developed by DestiNA Genomics with the STMicroelectronics In-Check platform, which comprises two microfluidic optimized and independent PCR reaction chambers, and a sequential microarray area for nucleic acid capture and identification by fluorescence. With its compact bench-top “footprint” requiring only a single technician to operate, the biochip system promises to transform and expand routine clinical diagnostic testing and screening for genetic diseases, cancers, drug toxicology and heart disease, as well as employment in the emerging companion diagnostics market.

  11. Photoproduced fluorescent Au(I)@(Ag2/Ag3)-thiolate giant cluster: an intriguing sensing platform for DMSO and Pb(II).

    Science.gov (United States)

    Ganguly, Mainak; Mondal, Chanchal; Jana, Jayasmita; Pal, Anjali; Pal, Tarasankar

    2014-01-14

    Synergistic evolution of fluorescent Au(I)@(Ag2/Ag3)-thiolate core-shell particles has been made possible under the Sun in the presence of the respective precursor coinage metal compounds and glutathione (GSH). The green-chemically synthesized fluorescent clusters are giant (∼600 nm) in size and robust. Among all the common water-miscible solvents, exclusively DMSO exhibits selective fluorescence quenching (Turn Off) because of the removal of GSH from the giant cluster. Again, only the Pb(II) ion brings back the lost fluorescence (Turn On), leaving aside all other metal ions. This happens owing to the strong affinity of the sulfur donor of DMSO for Pb(II). Thus, employing the aqueous solution containing the giant cluster, we can detect DMSO contamination in water bodies at trace level. Besides, a selective sensing platform has emerged for the Pb(II) ion, with a detection limit of 14 × 10⁻⁸ M. The Pb(II)-induced fluorescence recovery is quenched again by I⁻, implying a promising route to sense the I⁻ ion.

  12. Micro and nano-platforms for biological cell analysis

    DEFF Research Database (Denmark)

    Svendsen, Winnie Edith; Castillo, Jaime; Moresco, Jacob Lange

    2011-01-01

    In this paper some technological platforms developed for biological cell analysis will be presented and compared to existing systems. In brief, we present a novel micro cell culture chamber based on diffusion feeding of cells, into which cells can be introduced and extracted after culturing using...... from the cells, while passive modifications involve the presence of a peptide nanotube based scaffold for the cell culturing that mimics the in vivo environment. Two applications involving fluorescent in situ hybridization (FISH) analysis and cancer cell sorting are presented, as examples of further...... analysis that can be done after cell culturing. A platform able to automate the entire process from cell culturing to cell analysis by means of simple plug and play of various self-contained, individually fabricated modules is finally described....

  13. Analysis of Plant Breeding on Hadoop and Spark

    Directory of Open Access Journals (Sweden)

    Shuangxi Chen

    2016-01-01

    Full Text Available Analysis of crop breeding data is an important element of computer-assisted breeding, which involves huge data volumes, high dimensionality, and a large amount of unstructured data. We propose a crop breeding data analysis platform on Spark. The platform consists of the Hadoop distributed file system (HDFS) and a cluster based on Spark's in-memory iterative components. With this cluster, crop breeding big data analysis tasks are carried out in parallel through the API provided by Spark. Experiments and tests on Indica and Japonica rice traits show that the plant breeding analysis platform can significantly improve the speed of breeding big data analysis while reducing the workload of concurrent programming.
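    A minimal sketch of the kind of parallel trait analysis such a platform performs, assuming a local PySpark installation; the CSV path and column names (variety, plant_height, grain_weight) are hypothetical and not taken from the paper.

```python
# Hedged PySpark sketch of parallel trait aggregation; paths and columns
# are placeholders, not the platform described in the abstract.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (SparkSession.builder
         .appName("breeding-trait-analysis")
         .master("local[*]")          # replace with the cluster master URL
         .getOrCreate())

traits = (spark.read
          .option("header", True)
          .option("inferSchema", True)
          .csv("hdfs:///breeding/rice_traits.csv"))   # hypothetical input path

# Aggregate trait statistics per variety in parallel across the cluster.
summary = (traits.groupBy("variety")
           .agg(F.avg("plant_height").alias("mean_height"),
                F.stddev("grain_weight").alias("sd_grain_weight")))

summary.show()
spark.stop()
```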

  14. Topology-oblivious optimization of MPI broadcast algorithms on extreme-scale platforms

    KAUST Repository

    Hasanov, Khalid

    2015-11-01

    Significant research has been conducted in collective communication operations, in particular in MPI broadcast, on distributed memory platforms. Most of the research efforts aim to optimize the collective operations for particular architectures by taking into account either their topology or platform parameters. In this work we propose a simple but general approach to optimization of the legacy MPI broadcast algorithms, which are widely used in MPICH and Open MPI. The proposed optimization technique is designed to address the challenge of extreme scale of future HPC platforms. It is based on hierarchical transformation of the traditionally flat logical arrangement of communicating processors. Theoretical analysis and experimental results on IBM BlueGene/P and a cluster of the Grid'5000 platform are presented.

  15. OrthoVenn: a web server for genome wide comparison and annotation of orthologous clusters across multiple species

    Science.gov (United States)

    Genome wide analysis of orthologous clusters is an important component of comparative genomics studies. Identifying the overlap among orthologous clusters can enable us to elucidate the function and evolution of proteins across multiple species. Here, we report a web platform named OrthoVenn that i...

  16. DATA ANALYSIS BY SQL-MAPREDUCE PLATFORM

    Directory of Open Access Journals (Sweden)

    A. A. A. Dergachev

    2014-01-01

    Full Text Available The paper deals with problems related to the usage of relational database management systems (RDBMS), mainly in the analysis of large data content, including data analysis based on web services in the Internet. A solution to these problems can be represented as a web-oriented distributed data analysis system with a service request processor as its executive kernel. The functions of such a system are similar to those of a relational DBMS, only implemented with web services. The service request processor is responsible for planning data analysis web service calls and executing them. The efficiency of such a web-oriented system depends on the efficiency of the web service call plan and its program implementation, where the basic element is the storage facility for the analyzed data - the relational DBMS. The main attention is given to extending the functionality of the relational DBMS for the analysis of large data content, in particular to assessing the prospects of implementing data analysis web services on the basis of the SQL/MapReduce platform. To obtain this result, an analytical task typical of data analysis in various social networks and web portals - the analysis of users' attendance data - was chosen as the application-oriented part. In the practical part of this research, the algorithm for planning web service calls was implemented to solve the application-oriented task. The efficiency of the SQL/MapReduce platform is confirmed by experimental results that show it can be applied effectively to data analysis web services.
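    To make the map-and-reduce processing style concrete, the following plain-Python sketch counts visits per page from user-attendance records. It illustrates the idea only and is not the SQL/MapReduce platform evaluated in the paper; the record fields are hypothetical.

```python
# Conceptual map/reduce over user-attendance records in plain Python;
# an illustration of the processing style only, not the SQL/MapReduce
# platform discussed in the abstract. Field names are hypothetical.
from collections import Counter
from itertools import chain

attendance_log = [
    {"user": "u1", "page": "/news"},
    {"user": "u2", "page": "/news"},
    {"user": "u1", "page": "/profile"},
]

def map_phase(record):
    # Emit (key, 1) pairs, one per visit, keyed by page.
    yield (record["page"], 1)

def reduce_phase(pairs):
    # Sum the emitted counts per key.
    totals = Counter()
    for key, value in pairs:
        totals[key] += value
    return dict(totals)

visits_per_page = reduce_phase(
    chain.from_iterable(map_phase(r) for r in attendance_log))
print(visits_per_page)   # {'/news': 2, '/profile': 1}
```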

  17. antiSMASH 2.0-a versatile platform for genome mining of secondary metabolite producers

    NARCIS (Netherlands)

    Blin, Kai; Medema, Marnix H.; Kazempour, Daniyal; Fischbach, Michael A.; Breitling, Rainer; Takano, Eriko; Weber, Tilmann

    Microbial secondary metabolites are a potent source of antibiotics and other pharmaceuticals. Genome mining of their biosynthetic gene clusters has become a key method to accelerate their identification and characterization. In 2011, we developed antiSMASH, a web-based analysis platform that

  18. Fluidics platform and method for sample preparation and analysis

    Science.gov (United States)

    Benner, W. Henry; Dzenitis, John M.; Bennet, William J.; Baker, Brian R.

    2014-08-19

    Herein provided are a fluidics platform and method for sample preparation and analysis. The fluidics platform is capable of analyzing DNA from blood samples using amplification assays such as polymerase-chain-reaction assays and loop-mediated-isothermal-amplification assays. The fluidics platform can also be used for other types of assays and analyses. In some embodiments, a sample in a sealed tube can be inserted directly. The subsequent isolation, detection, and analyses can be performed without user intervention. The disclosed platform may also comprise a sample preparation system with a magnetic actuator, a heater, and an air-drying mechanism, and fluid manipulation processes for extraction, washing, elution, assay assembly, assay detection, and cleaning after reactions and between samples.

  19. The Earth Observation Technology Cluster

    Science.gov (United States)

    Aplin, P.; Boyd, D. S.; Danson, F. M.; Donoghue, D. N. M.; Ferrier, G.; Galiatsatos, N.; Marsh, A.; Pope, A.; Ramirez, F. A.; Tate, N. J.

    2012-07-01

    The Earth Observation Technology Cluster is a knowledge exchange initiative, promoting development, understanding and communication about innovative technology used in remote sensing of the terrestrial or land surface. This initiative provides an opportunity for presentation of novel developments from, and cross-fertilisation of ideas between, the many and diverse members of the terrestrial remote sensing community. The Earth Observation Technology Cluster involves a range of knowledge exchange activities, including organisation of technical events, delivery of educational materials, publication of scientific findings and development of a coherent terrestrial EO community. The initiative as a whole covers the full range of remote sensing operation, from new platform and sensor development, through image retrieval and analysis, to data applications and environmental modelling. However, certain topical and strategic themes have been selected for detailed investigation: (1) Unpiloted Aerial Vehicles, (2) Terrestrial Laser Scanning, (3) Field-Based Fourier Transform Infra-Red Spectroscopy, (4) Hypertemporal Image Analysis, and (5) Circumpolar and Cryospheric Application. This paper presents general activities and achievements of the Earth Observation Technology Cluster, and reviews state-of-the-art developments in the five specific thematic areas.

  20. Comparison of Resource Platform Selection Approaches for Scientific Workflows

    Energy Technology Data Exchange (ETDEWEB)

    Simmhan, Yogesh; Ramakrishnan, Lavanya

    2010-03-05

    Cloud computing is increasingly considered as an additional computational resource platform for scientific workflows. The cloud offers the opportunity to scale out applications from desktops and local cluster resources. At the same time, it can eliminate the challenges of restricted software environments and queue delays in shared high performance computing environments. Choosing from these diverse resource platforms for a workflow execution poses a challenge for many scientists. Scientists are often faced with deciding resource platform selection trade-offs with limited information on the actual workflows. While many workflow planning methods have explored task scheduling onto different resources, these methods often require fine-scale characterization of the workflow that is onerous for a scientist. In this position paper, we describe our early exploratory work into using blackbox characteristics to do a cost-benefit analysis of using cloud platforms. We use only very limited high-level information on the workflow length, width, and data sizes. The length and width are indicative of the workflow duration and parallelism. The data size characterizes the IO requirements. We compare the effectiveness of this approach to other resource selection models using two exemplar scientific workflows scheduled on desktops, local clusters, HPC centers, and clouds. Early results suggest that the blackbox model often makes the same resource selections as a more fine-grained whitebox model. We believe the simplicity of the blackbox model can help inform a scientist on the applicability of cloud computing resources even before porting an existing workflow.
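    A hedged sketch of how such a blackbox estimate could look, using only workflow length, width and data size; the per-core-hour and per-GB prices and the slot-based makespan model are illustrative assumptions, not figures from the paper.

```python
# Hedged sketch of a blackbox cost-benefit estimate from only three
# workflow descriptors: length (stages), width (parallel tasks) and data
# size. The price constants are illustrative assumptions.
def estimate_cost(length_hours, width_tasks, data_gb,
                  core_hour_price=0.10, egress_price_per_gb=0.09,
                  cores_per_task=1):
    """Rough cloud cost: compute time scaled by parallel width plus data egress."""
    compute_cost = length_hours * width_tasks * cores_per_task * core_hour_price
    data_cost = data_gb * egress_price_per_gb
    return compute_cost + data_cost

def estimate_makespan(length_hours, width_tasks, available_slots):
    """Rough makespan: stages run in waves when tasks exceed available slots."""
    waves = -(-width_tasks // available_slots)   # ceiling division
    return length_hours * waves

if __name__ == "__main__":
    print("cloud cost ($):", estimate_cost(length_hours=4, width_tasks=64, data_gb=200))
    print("makespan on 16 slots (h):", estimate_makespan(4, 64, 16))
```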

  1. Three-Dimensional Scaffold Chip with Thermosensitive Coating for Capture and Reversible Release of Individual and Cluster of Circulating Tumor Cells.

    Science.gov (United States)

    Cheng, Shi-Bo; Xie, Min; Chen, Yan; Xiong, Jun; Liu, Ya; Chen, Zhen; Guo, Shan; Shu, Ying; Wang, Ming; Yuan, Bi-Feng; Dong, Wei-Guo; Huang, Wei-Hua

    2017-08-01

    Tumor metastasis is attributed to circulating tumor cells (CTC) or CTC clusters. Many strategies have hitherto been designed to isolate CTCs, but there are few methods that can capture and gently release CTC clusters as efficiently as single CTCs. Herein, we developed a three-dimensional (3D) scaffold chip with a thermosensitive coating for high-efficiency capture and release of both individual CTCs and CTC clusters. The 3D scaffold chip successfully combines specific recognition with the physical obstruction effect of the 3D scaffold structure to significantly improve cell cluster capture efficiency. The thermosensitive gelatin hydrogel uniformly coated on the scaffold dissolves quickly at 37 °C, and the captured cells are gently released from the chip with high viability. Notably, this platform was applied to isolate CTCs from cancer patients' blood samples. This allows global DNA and RNA methylation analysis of collected single CTCs and CTC clusters, indicating the great potential of this platform in cancer diagnosis and downstream analysis at the molecular level.

  2. A Cross-Platform Infrastructure for Scalable Runtime Application Performance Analysis

    Energy Technology Data Exchange (ETDEWEB)

    Jack Dongarra; Shirley Moore; Bart Miller, Jeffrey Hollingsworth; Tracy Rafferty

    2005-03-15

    The purpose of this project was to build an extensible cross-platform infrastructure to facilitate the development of accurate and portable performance analysis tools for current and future high performance computing (HPC) architectures. Major accomplishments include tools and techniques for multidimensional performance analysis, as well as improved support for dynamic performance monitoring of multithreaded and multiprocess applications. Previous performance tool development has been limited by the burden of having to re-write a platform-dependent low-level substrate for each architecture/operating system pair in order to obtain the necessary performance data from the system. Manual interpretation of performance data is not scalable for large-scale long-running applications. The infrastructure developed by this project provides a foundation for building portable and scalable performance analysis tools, with the end goal being to provide application developers with the information they need to analyze, understand, and tune the performance of terascale applications on HPC architectures. The backend portion of the infrastructure provides runtime instrumentation capability and access to hardware performance counters, with thread-safety for shared memory environments and a communication substrate to support instrumentation of multiprocess and distributed programs. Front-end interfaces provide tool developers with a well-defined, platform-independent set of calls for requesting performance data. End-user tools have been developed that demonstrate runtime data collection, on-line and off-line analysis of performance data, and multidimensional performance analysis. The infrastructure is based on two underlying performance instrumentation technologies. These technologies are the PAPI cross-platform library interface to hardware performance counters and the cross-platform Dyninst library interface for runtime modification of executable images. The Paradyn and KOJAK

  3. arrayCGHbase: an analysis platform for comparative genomic hybridization microarrays

    Directory of Open Access Journals (Sweden)

    Moreau Yves

    2005-05-01

    Full Text Available Abstract Background The availability of the human genome sequence as well as the large number of physically accessible oligonucleotides, cDNA, and BAC clones across the entire genome has triggered and accelerated the use of several platforms for analysis of DNA copy number changes, amongst others microarray comparative genomic hybridization (arrayCGH). One of the challenges inherent to this new technology is the management and analysis of large numbers of data points generated in each individual experiment. Results We have developed arrayCGHbase, a comprehensive analysis platform for arrayCGH experiments consisting of a MIAME (Minimal Information About a Microarray Experiment)-supportive database using MySQL underlying a data mining web tool, to store, analyze, interpret, compare, and visualize arrayCGH results in a uniform and user-friendly format. Thanks to its flexible design, arrayCGHbase is compatible with all existing and forthcoming arrayCGH platforms. Data can be exported in a multitude of formats, including BED files to map copy number information on the genome using the Ensembl or UCSC genome browser. Conclusion ArrayCGHbase is a web-based and platform-independent arrayCGH data analysis tool that allows users to access the analysis suite through the internet or a local intranet after installation on a private server. ArrayCGHbase is available at http://medgen.ugent.be/arrayCGHbase/.

  4. Clustering Trajectories by Relevant Parts for Air Traffic Analysis.

    Science.gov (United States)

    Andrienko, Gennady; Andrienko, Natalia; Fuchs, Georg; Garcia, Jose Manuel Cordero

    2018-01-01

    Clustering of trajectories of moving objects by similarity is an important technique in movement analysis. Existing distance functions assess the similarity between trajectories based on properties of the trajectory points or segments. The properties may include the spatial positions, times, and thematic attributes. There may be a need to focus the analysis on certain parts of trajectories, i.e., points and segments that have particular properties. According to the analysis focus, the analyst may need to cluster trajectories by similarity of their relevant parts only. Throughout the analysis process, the focus may change, and different parts of trajectories may become relevant. We propose an analytical workflow in which interactive filtering tools are used to attach relevance flags to elements of trajectories, clustering is done using a distance function that ignores irrelevant elements, and the resulting clusters are summarized for further analysis. We demonstrate how this workflow can be useful for different analysis tasks in three case studies with real data from the domain of air traffic. We propose a suite of generic techniques and visualization guidelines to support movement data analysis by means of relevance-aware trajectory clustering.
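    A simplified, hypothetical sketch of a relevance-aware distance in the spirit of the workflow above: only points flagged as relevant contribute to a Hausdorff-style trajectory distance. This is not the authors' distance function, and the flag-based representation is assumed for illustration.

```python
# Hedged sketch of a relevance-aware trajectory distance: only points
# flagged as relevant contribute. Simplified stand-in, not the distance
# function used in the paper.
import math

def relevant_points(trajectory):
    # Each point is (x, y, relevant); keep only the flagged part.
    return [(x, y) for x, y, relevant in trajectory if relevant]

def directed_hausdorff(a, b):
    # Largest distance from a point of a to its nearest point of b.
    return max(min(math.dist(p, q) for q in b) for p in a)

def relevance_aware_distance(traj_a, traj_b):
    a, b = relevant_points(traj_a), relevant_points(traj_b)
    if not a or not b:
        return float("inf")   # no relevant part to compare
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))

traj1 = [(0, 0, True), (1, 0, True), (9, 9, False)]
traj2 = [(0, 1, True), (1, 1, True), (0, 9, False)]
print(relevance_aware_distance(traj1, traj2))   # 1.0, ignoring the irrelevant tails
```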

  5. The smart cluster method. Adaptive earthquake cluster identification and analysis in strong seismic regions

    Science.gov (United States)

    Schaefer, Andreas M.; Daniell, James E.; Wenzel, Friedemann

    2017-07-01

    Earthquake clustering is an essential part of almost any statistical analysis of spatial and temporal properties of seismic activity. The nature of earthquake clusters and subsequent declustering of earthquake catalogues plays a crucial role in determining the magnitude-dependent earthquake return period and its respective spatial variation for probabilistic seismic hazard assessment. This study introduces the Smart Cluster Method (SCM), a new methodology to identify earthquake clusters, which uses an adaptive point process for spatio-temporal cluster identification. It utilises the magnitude-dependent spatio-temporal earthquake density to adjust the search properties, subsequently analyses the identified clusters to determine directional variation and adjusts its search space with respect to directional properties. In the case of rapid subsequent ruptures like the 1992 Landers sequence or the 2010-2011 Darfield-Christchurch sequence, a reclassification procedure is applied to disassemble subsequent ruptures using near-field searches, nearest neighbour classification and temporal splitting. The method is capable of identifying and classifying earthquake clusters in space and time. It has been tested and validated using earthquake data from California and New Zealand. A total of more than 1500 clusters have been found in both regions since 1980 with M m i n = 2.0. Utilising the knowledge of cluster classification, the method has been adjusted to provide an earthquake declustering algorithm, which has been compared to existing methods. Its performance is comparable to established methodologies. The analysis of earthquake clustering statistics lead to various new and updated correlation functions, e.g. for ratios between mainshock and strongest aftershock and general aftershock activity metrics.

  6. Functional Principal Component Analysis and Randomized Sparse Clustering Algorithm for Medical Image Analysis

    Science.gov (United States)

    Lin, Nan; Jiang, Junhai; Guo, Shicheng; Xiong, Momiao

    2015-01-01

    Due to the advancement in sensor technology, the growing volume of large medical image data has made it possible to visualize anatomical changes in biological tissues. As a consequence, medical images have the potential to enhance the diagnosis of disease, the prediction of clinical outcomes and the characterization of disease progression. At the same time, the growing data dimensions pose great methodological and computational challenges for the representation and selection of features in image cluster analysis. To address these challenges, we first extend functional principal component analysis (FPCA) from one dimension to two dimensions to fully capture the spatial variation of the image signals. The image signals contain a large number of redundant features which provide no additional information for clustering analysis. The widely used methods for removing the irrelevant features are sparse clustering algorithms using a lasso-type penalty to select the features. However, the accuracy of clustering using a lasso-type penalty depends on the selection of the penalty parameters and the threshold value. In practice, these are difficult to determine. Recently, randomized algorithms have received a great deal of attention in big data analysis. This paper presents a randomized algorithm for accurate feature selection in image clustering analysis. The proposed method is applied to both the liver and kidney cancer histology image data from the TCGA database. The results demonstrate that the randomized feature selection method coupled with functional principal component analysis substantially outperforms the current sparse clustering algorithms in image cluster analysis. PMID:26196383
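    As a generic illustration of randomized feature selection for clustering (not the authors' algorithm), one can repeatedly cluster on random feature subsets and credit features that appear in subsets yielding high silhouette scores; the synthetic data and subset sizes below are arbitrary.

```python
# Generic randomized feature-selection sketch for clustering; an
# illustration of the idea only, not the method from the paper.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))
X[:100, :5] += 3.0          # only the first 5 features carry cluster signal

scores = np.zeros(X.shape[1])
counts = np.zeros(X.shape[1])
for _ in range(200):
    subset = rng.choice(X.shape[1], size=8, replace=False)
    labels = KMeans(n_clusters=2, n_init=5, random_state=0).fit_predict(X[:, subset])
    s = silhouette_score(X[:, subset], labels)
    scores[subset] += s     # credit every feature in this subset with the score
    counts[subset] += 1

mean_score = scores / np.maximum(counts, 1)
selected = np.argsort(mean_score)[-5:]
print("selected features:", sorted(selected.tolist()))   # likely the informative ones
```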

  7. Simultaneous Two-Way Clustering of Multiple Correspondence Analysis

    Science.gov (United States)

    Hwang, Heungsun; Dillon, William R.

    2010-01-01

    A 2-way clustering approach to multiple correspondence analysis is proposed to account for cluster-level heterogeneity of both respondents and variable categories in multivariate categorical data. Specifically, in the proposed method, multiple correspondence analysis is combined with k-means in a unified framework in which "k"-means is…

  8. Modal shapes optimization and feasibility analysis of NFAL platform

    Directory of Open Access Journals (Sweden)

    Bin WEI

    2017-08-01

    Full Text Available In order to avoid friction and scratching between the conveyor and the precision components when conveying objects, a compact non-contact acoustic levitation prototype is designed, and its feasibility is theoretically and experimentally verified. The symmetry model is established through kinetic analysis with ANSYS. The modal and the coupled-field computation at the central point of the transfer platform are simulated. The simulation results show that pure flexural or mixed flexural wave shapes appear with different wave numbers on the platform. A sweep frequency test is conducted on the compact platform prototype. The levitation experimental results confirm the feasibility of the ultrasound transfer process, the levitation frequency range and the mode of vibration. The theoretical and experimental results show that optimal design of the mode shapes and the carrying capacity of the driving platform is necessary according to different conditions. The research results provide a reference for the design of the mode and bandwidth of the ultrasonic levitation platform.

  9. Analysis of human plasma metabolites across different liquid chromatography/mass spectrometry platforms: Cross-platform transferable chemical signatures.

    Science.gov (United States)

    Telu, Kelly H; Yan, Xinjian; Wallace, William E; Stein, Stephen E; Simón-Manso, Yamil

    2016-03-15

    The metabolite profiling of a NIST plasma Standard Reference Material (SRM 1950) on different liquid chromatography/mass spectrometry (LC/MS) platforms showed significant differences. Although these findings suggest caution when interpreting metabolomics results, the degree of overlap of both profiles allowed us to use tandem mass spectral libraries of recurrent spectra to evaluate to what extent these results are transferable across platforms and to develop cross-platform chemical signatures. Non-targeted global metabolite profiles of SRM 1950 were obtained on different LC/MS platforms using reversed-phase chromatography and different chromatographic scales (conventional HPLC, UHPLC and nanoLC). The data processing and the metabolite differential analysis were carried out using publically available (XCMS), proprietary (Mass Profiler Professional) and in-house software (NIST pipeline). Repeatability and intermediate precision, assessed by relative standard deviation (RSD), showed that the non-targeted SRM 1950 profiling was highly reproducible when working on the same platform (HPLC, UHPLC or nanoLC). A substantial degree of overlap (common molecular features) was also found. A procedure to generate consistent chemical signatures using tandem mass spectral libraries of recurrent spectra is proposed. Different platforms rendered significantly different metabolite profiles, but the results were highly reproducible when working within one platform. Tandem mass spectral libraries of recurrent spectra are proposed to evaluate the degree of transferability of chemical signatures generated on different platforms. Chemical signatures based on our procedure are most likely cross-platform transferable. Published in 2016. This article is a U.S. Government work and is in the public domain in the USA.

  10. An integrated biomedical knowledge extraction and analysis platform: using federated search and document clustering technology.

    Science.gov (United States)

    Taylor, Donald P

    2007-01-01

    High content screening (HCS) requires time-consuming and often complex iterative information retrieval and assessment approaches to optimally conduct drug discovery programs and biomedical research. Pre- and post-HCS experimentation both require the retrieval of information from public as well as proprietary literature in addition to structured information assets such as compound libraries and projects databases. Unfortunately, this information is typically scattered across a plethora of proprietary bioinformatics tools and databases and public domain sources. Consequently, single search requests must be presented to each information repository, forcing the results to be manually integrated for a meaningful result set. Furthermore, these bioinformatics tools and data repositories are becoming increasingly complex to use; typically they fail to allow for more natural query interfaces. Vivisimo has developed an enterprise software platform to bridge disparate silos of information. The platform automatically categorizes search results into descriptive folders without the use of taxonomies to drive the categorization. A new approach to information retrieval for HCS experimentation is proposed.

  11. Semi-supervised consensus clustering for gene expression data analysis

    OpenAIRE

    Wang, Yunli; Pan, Youlian

    2014-01-01

    Background Simple clustering methods such as hierarchical clustering and k-means are widely used for gene expression data analysis; but they are unable to deal with noise and high dimensionality associated with the microarray gene expression data. Consensus clustering appears to improve the robustness and quality of clustering results. Incorporating prior knowledge in clustering process (semi-supervised clustering) has been shown to improve the consistency between the data partitioning and do...
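    A generic co-association (consensus) clustering sketch, given here only to illustrate the unsupervised core of the idea; the semi-supervised constraints of the paper are not modeled, and the synthetic expression-like data are placeholders.

```python
# Generic consensus (co-association) clustering sketch: run several base
# k-means partitions, accumulate how often sample pairs co-cluster, then
# cut the consensus matrix hierarchically. Illustration only, not the
# semi-supervised method of the paper.
import numpy as np
from sklearn.cluster import KMeans
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

rng = np.random.default_rng(5)
X = np.vstack([rng.normal(0, 1, (60, 20)), rng.normal(3, 1, (60, 20))])
n = X.shape[0]

# Build a co-association matrix from many base k-means runs with varying k.
coassoc = np.zeros((n, n))
ks = [2, 3, 4] * 10
for run, k in enumerate(ks):
    labels = KMeans(n_clusters=k, n_init=3, random_state=run).fit_predict(X)
    for c in np.unique(labels):
        members = np.where(labels == c)[0]
        coassoc[np.ix_(members, members)] += 1
coassoc /= len(ks)

# Treat (1 - co-association) as a distance and cut the consensus dendrogram.
dist = 1.0 - coassoc
np.fill_diagonal(dist, 0.0)
consensus = fcluster(linkage(squareform(dist, checks=False), method="average"),
                     t=2, criterion="maxclust")
print(np.bincount(consensus))   # expected: two groups of about 60 samples each
```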

  12. Allergen Sensitization Pattern by Sex: A Cluster Analysis in Korea.

    Science.gov (United States)

    Ohn, Jungyoon; Paik, Seung Hwan; Doh, Eun Jin; Park, Hyun-Sun; Yoon, Hyun-Sun; Cho, Soyun

    2017-12-01

    Allergens tend to sensitize simultaneously. The etiology of this phenomenon has been suggested to be allergen cross-reactivity or concurrent exposure. However, little is known about specific allergen sensitization patterns. To investigate the allergen sensitization characteristics according to gender. The multiple allergen simultaneous test (MAST) is widely used as a screening tool for detecting allergen sensitization in dermatologic clinics. We retrospectively reviewed the medical records of patients with MAST results between 2008 and 2014 in our Department of Dermatology. A cluster analysis was performed to elucidate the allergen-specific immunoglobulin (Ig)E cluster pattern. The results of MAST (39 allergen-specific IgEs) from 4,360 cases were analyzed. By cluster analysis, the 39 items were grouped into 8 clusters. Each cluster had characteristic features. When compared with the female group, the male group tended to be sensitized more frequently to all tested allergens, except for the fungus allergen cluster. The cluster and comparative analysis results demonstrate that allergen sensitization is clustered, reflecting allergen similarity or co-exposure. Only the fungus cluster allergens tend to sensitize the female group more frequently than the male group.

  13. Hierarchical Aligned Cluster Analysis for Temporal Clustering of Human Motion.

    Science.gov (United States)

    Zhou, Feng; De la Torre, Fernando; Hodgins, Jessica K

    2013-03-01

    Temporal segmentation of human motion into plausible motion primitives is central to understanding and building computational models of human motion. Several issues contribute to the challenge of discovering motion primitives: the exponential nature of all possible movement combinations, the variability in the temporal scale of human actions, and the complexity of representing articulated motion. We pose the problem of learning motion primitives as one of temporal clustering, and derive an unsupervised hierarchical bottom-up framework called hierarchical aligned cluster analysis (HACA). HACA finds a partition of a given multidimensional time series into m disjoint segments such that each segment belongs to one of k clusters. HACA combines kernel k-means with the generalized dynamic time alignment kernel to cluster time series data. Moreover, it provides a natural framework to find a low-dimensional embedding for time series. HACA is efficiently optimized with a coordinate descent strategy and dynamic programming. Experimental results on motion capture and video data demonstrate the effectiveness of HACA for segmenting complex motions and as a visualization tool. We also compare the performance of HACA to state-of-the-art algorithms for temporal clustering on data of a honey bee dance. The HACA code is available online.

  14. Analysis of Network Clustering Algorithms and Cluster Quality Metrics at Scale.

    Science.gov (United States)

    Emmons, Scott; Kobourov, Stephen; Gallant, Mike; Börner, Katy

    2016-01-01

    Notions of community quality underlie the clustering of networks. While studies surrounding network clustering are increasingly common, a precise understanding of the relationship between different cluster quality metrics is lacking. In this paper, we examine the relationship between stand-alone cluster quality metrics and information recovery metrics through a rigorous analysis of four widely-used network clustering algorithms - Louvain, Infomap, label propagation, and smart local moving. We consider the stand-alone quality metrics of modularity, conductance, and coverage, and we consider the information recovery metrics of adjusted Rand score, normalized mutual information, and a variant of normalized mutual information used in previous work. Our study includes both synthetic graphs and empirical data sets of sizes varying from 1,000 to 1,000,000 nodes. We find significant differences among the results of the different cluster quality metrics. For example, clustering algorithms can return a value of 0.4 out of 1 on modularity but score 0 out of 1 on information recovery. We find conductance, though imperfect, to be the stand-alone quality metric that best indicates performance on the information recovery metrics. Additionally, our study shows that the variant of normalized mutual information used in previous work cannot be assumed to differ only slightly from traditional normalized mutual information. Smart local moving is the overall best performing algorithm in our study, but discrepancies between cluster evaluation metrics prevent us from declaring it an absolutely superior algorithm. Interestingly, Louvain performed better than Infomap in nearly all the tests in our study, contradicting the results of previous work in which Infomap was superior to Louvain. We find that although label propagation performs poorly when clusters are less clearly defined, it scales efficiently and accurately to large graphs with well-defined clusters.
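    The metrics discussed above can be reproduced on a small synthetic graph with networkx and scikit-learn; the sketch below uses greedy modularity maximization as a stand-in detection algorithm and a planted two-block graph rather than the paper's benchmarks.

```python
# Hedged sketch comparing stand-alone quality metrics with information
# recovery metrics on a small synthetic graph (not the paper's benchmarks).
import networkx as nx
from networkx.algorithms import community
from networkx.algorithms.cuts import conductance
from sklearn.metrics import adjusted_rand_score, normalized_mutual_info_score

# Planted two-block graph with known ground-truth communities.
G = nx.planted_partition_graph(l=2, k=30, p_in=0.3, p_out=0.02, seed=1)
truth = [0] * G.number_of_nodes()
for block_id, nodes in enumerate(G.graph["partition"]):
    for n in nodes:
        truth[n] = block_id

# Detected communities (greedy modularity maximization as an example algorithm).
detected = community.greedy_modularity_communities(G)
labels = [0] * G.number_of_nodes()
for cid, nodes in enumerate(detected):
    for n in nodes:
        labels[n] = cid

# Stand-alone quality metrics.
print("modularity:", community.modularity(G, detected))
print("conductance (first community):", conductance(G, set(detected[0])))

# Information recovery metrics against the planted partition.
print("adjusted Rand:", adjusted_rand_score(truth, labels))
print("NMI:", normalized_mutual_info_score(truth, labels))
```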

  15. Cluster analysis of activity-time series in motor learning

    DEFF Research Database (Denmark)

    Balslev, Daniela; Nielsen, Finn Å; Futiger, Sally A

    2002-01-01

    Neuroimaging studies of learning focus on brain areas where the activity changes as a function of time. To circumvent the difficult problem of model selection, we used a data-driven analytic tool, cluster analysis, which extracts representative temporal and spatial patterns from the voxel-time series. The optimal number of clusters was chosen using a cross-validated likelihood method, which highlights the clustering pattern that generalizes best over the subjects. Data were acquired with PET at different time points during practice of a visuomotor task. The results from cluster analysis show...

  16. Advanced analysis of forest fire clustering

    Science.gov (United States)

    Kanevski, Mikhail; Pereira, Mario; Golay, Jean

    2017-04-01

    Analysis of point pattern clustering is an important topic in spatial statistics and for many applications: biodiversity, epidemiology, natural hazards, geomarketing, etc. There are several fundamental approaches used to quantify spatial data clustering using topological, statistical and fractal measures. In the present research, the recently introduced multi-point Morisita index (mMI) is applied to study the spatial clustering of forest fires in Portugal. The data set consists of more than 30000 fire events covering the time period from 1975 to 2013. The distribution of forest fires is very complex and highly variable in space. mMI is a multi-point extension of the classical two-point Morisita index. In essence, mMI is estimated by covering the region under study by a grid and by computing how many times more likely it is that m points selected at random will be from the same grid cell than it would be in the case of a complete random Poisson process. By changing the number of grid cells (size of the grid cells), mMI characterizes the scaling properties of spatial clustering. From mMI, the data intrinsic dimension (fractal dimension) of the point distribution can be estimated as well. In this study, the mMI of forest fires is compared with the mMI of random patterns (RPs) generated within the validity domain defined as the forest area of Portugal. It turns out that the forest fires are highly clustered inside the validity domain in comparison with the RPs. Moreover, they demonstrate different scaling properties at different spatial scales. The results obtained from the mMI analysis are also compared with those of fractal measures of clustering - box counting and sand box counting approaches. REFERENCES Golay J., Kanevski M., Vega Orozco C., Leuenberger M., 2014: The multipoint Morisita index for the analysis of spatial patterns. Physica A, 406, 191-202. Golay J., Kanevski M. 2015: A new estimator of intrinsic dimension based on the multipoint Morisita index
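    A minimal implementation of the grid-based estimation described above (not the authors' code): for each grid resolution, the index compares the chance that m randomly selected points fall in the same cell against a completely random pattern. The synthetic point sets stand in for the fire catalogue.

```python
# Hedged sketch of the multi-point Morisita index on a regular grid,
# following the description in the abstract; synthetic clustered and
# uniform point sets replace the forest-fire catalogue.
import numpy as np

def multipoint_morisita(points, n_cells_per_axis, m=2):
    """I_m = Q^(m-1) * sum_i n_i(n_i-1)...(n_i-m+1) / [N(N-1)...(N-m+1)]."""
    x, y = points[:, 0], points[:, 1]
    counts, _, _ = np.histogram2d(x, y, bins=n_cells_per_axis)
    n = counts.ravel()
    N, Q = len(points), n.size

    def falling_factorial(v, k):
        out = np.ones_like(v, dtype=float)
        for j in range(k):
            out *= (v - j)
        return out

    numerator = falling_factorial(n, m).sum()
    denominator = falling_factorial(np.array([N], dtype=float), m)[0]
    return Q ** (m - 1) * numerator / denominator

rng = np.random.default_rng(0)
clustered = rng.normal(loc=0.3, scale=0.05, size=(500, 2))
uniform = rng.uniform(size=(500, 2))

for cells in (4, 8, 16):
    print(cells, "cells/axis:",
          round(multipoint_morisita(clustered, cells, m=3), 2), "(clustered)",
          round(multipoint_morisita(uniform, cells, m=3), 2), "(uniform)")
```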

  17. Physicochemical properties of different corn varieties by principal components analysis and cluster analysis

    International Nuclear Information System (INIS)

    Zeng, J.; Li, G.; Sun, J.

    2013-01-01

    Principal components analysis and cluster analysis were used to investigate the properties of different corn varieties. The chemical compositions and some properties of corn flour processed by dry milling were determined. The results showed that the chemical compositions and physicochemical properties were significantly different among the twenty-six corn varieties. The quality of corn flour was related to five principal components from the principal component analysis, and the contribution of the starch pasting properties was important, accounting for 48.90%. The twenty-six corn varieties could be classified into four groups by cluster analysis. The consistency between principal components analysis and cluster analysis indicated that multivariate analyses were feasible in the study of corn variety properties. (author)
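    The PCA-then-cluster workflow reads roughly as below; the 26 x 10 matrix of random numbers is a placeholder for the measured physicochemical variables, and the retained-variance threshold and number of groups are assumptions.

```python
# Hedged sketch of the PCA-then-cluster workflow on hypothetical corn
# flour measurements (random numbers stand in for the paper's data).
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(42)
# 26 varieties x 10 physicochemical variables (starch, protein, pasting, ...).
X = rng.normal(size=(26, 10))

# Standardize, then keep the principal components explaining ~85% variance.
Z = StandardScaler().fit_transform(X)
pca = PCA(n_components=0.85)
scores = pca.fit_transform(Z)
print("components retained:", pca.n_components_)
print("explained variance ratio:", pca.explained_variance_ratio_.round(2))

# Hierarchical (Ward) clustering of the component scores into 4 groups.
groups = fcluster(linkage(scores, method="ward"), t=4, criterion="maxclust")
print("cluster assignment per variety:", groups)
```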

  18. A software platform for the analysis of dermatology images

    Science.gov (United States)

    Vlassi, Maria; Mavraganis, Vlasios; Asvestas, Panteleimon

    2017-11-01

    The purpose of this paper is to present a software platform developed in the Python programming environment that can be used for the processing and analysis of dermatology images. The platform provides the capability for reading a file that contains a dermatology image. The platform supports image formats such as Windows bitmaps, JPEG, JPEG2000, portable network graphics, TIFF. Furthermore, it provides suitable tools for selecting, either manually or automatically, a region of interest (ROI) on the image. The automated selection of a ROI includes filtering for smoothing the image and thresholding. The proposed software platform has a friendly and clear graphical user interface and could be a useful second-opinion tool to a dermatologist. Furthermore, it could be used to classify images from other anatomical parts, such as the breast or lung, after proper re-training of the classification algorithms.
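    The automated ROI step (smoothing followed by thresholding) could look like the following sketch, assuming scikit-image is available; the input file name, the Gaussian sigma and the darker-than-threshold assumption are placeholders, not the platform's actual implementation.

```python
# Minimal sketch of automated ROI selection by smoothing and thresholding;
# assumes scikit-image, with a hypothetical input file and parameters.
from skimage import io, color, filters

def automatic_roi(path, sigma=2.0):
    image = io.imread(path)
    gray = color.rgb2gray(image[..., :3]) if image.ndim == 3 else image
    smoothed = filters.gaussian(gray, sigma=sigma)      # reduce noise first
    threshold = filters.threshold_otsu(smoothed)        # global Otsu threshold
    mask = smoothed < threshold                         # assume lesions are darker
    return mask

if __name__ == "__main__":
    roi = automatic_roi("lesion.jpg")                   # hypothetical input file
    print("ROI covers", round(100 * roi.mean(), 1), "% of the image")
```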

  19. Taxonomical analysis of the Cancer cluster of galaxies

    International Nuclear Information System (INIS)

    Perea, J.; Olmo, A. del; Moles, M.

    1986-01-01

    A description is presented of the Cancer cluster of galaxies, based on a taxonomical analysis in (α, δ, V_r) space. Earlier results by previous authors on the lack of dynamical entity of the cluster are confirmed. The present analysis points out the existence of a binary structure in the most populated region of the complex. (author)

  20. Numeric computation and statistical data analysis on the Java platform

    CERN Document Server

    Chekanov, Sergei V

    2016-01-01

    Numerical computation, knowledge discovery and statistical data analysis integrated with powerful 2D and 3D graphics for visualization are the key topics of this book. The Python code examples powered by the Java platform can easily be transformed to other programming languages, such as Java, Groovy, Ruby and BeanShell. This book equips the reader with a computational platform which, unlike other statistical programs, is not limited by a single programming language. The author focuses on practical programming aspects and covers a broad range of topics, from basic introduction to the Python language on the Java platform (Jython), to descriptive statistics, symbolic calculations, neural networks, non-linear regression analysis and many other data-mining topics. He discusses how to find regularities in real-world data, how to classify data, and how to process data for knowledge discoveries. The code snippets are so short that they easily fit into single pages. Numeric Computation and Statistical Data Analysis ...

  1. Fatigue Analysis of a Mono-Tower Platform

    DEFF Research Database (Denmark)

    Kirkegaard, Poul Henning; Sørensen, John Dalsgaard; Brincker, Rune

    In this paper, a fatigue reliability analysis of a Mono-tower platform is presented. The failure mode, fatigue failure in the butt welds, is investigated with two different models: one with the fatigue strength expressed through SN relations, the other with the fatigue strength expressed through linear-elastic fracture mechanics (LEFM). In determining the cumulative fatigue damage, Palmgren-Miner's rule is applied. Element reliability as well as systems reliability is estimated using first-order reliability methods (FORM). The sensitivity of the systems reliability to various parameters - the natural period, damping ratio, current, stress spectrum and parameters describing the fatigue strength - is examined. Further, soil damping is shown to be significant for the Mono-tower.

  2. Assessment of surface water quality using hierarchical cluster analysis

    Directory of Open Access Journals (Sweden)

    Dheeraj Kumar Dabgerwal

    2016-02-01

    Full Text Available This study was carried out to assess the physicochemical quality of the river Varuna in Varanasi, India. Water samples were collected from 10 sites during January-June 2015. Pearson correlation analysis was used to assess the direction and strength of relationships between physicochemical parameters. Hierarchical cluster analysis was also performed to determine the sources of pollution in the river Varuna. The results showed quite high values of DO, nitrate, BOD, COD and total alkalinity, above the BIS permissible limits. The results of the correlation analysis identified key water parameters - pH, electrical conductivity, total alkalinity and nitrate - which influence the concentration of other water parameters. Cluster analysis identified three major clusters of sampling sites out of the total of 10 sites, according to similarity in water quality. This study illustrated the usefulness of correlation and cluster analysis for getting better information about river water quality. International Journal of Environment Vol. 5 (1) 2016, pp. 32-44
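    Both analyses can be sketched as follows on hypothetical data (10 sites by a handful of parameters); the random values, parameter list and number of clusters are placeholders, not the Varuna measurements.

```python
# Hedged sketch of Pearson correlation plus hierarchical clustering on
# hypothetical water-quality data; values are random placeholders.
import numpy as np
import pandas as pd
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.stats import pearsonr

rng = np.random.default_rng(1)
params = ["pH", "EC", "DO", "BOD", "COD", "nitrate", "alkalinity"]
data = pd.DataFrame(rng.normal(size=(10, len(params))), columns=params,
                    index=[f"site_{i+1}" for i in range(10)])

# Pearson correlation between two parameters (direction and strength).
r, p = pearsonr(data["EC"], data["alkalinity"])
print(f"EC vs alkalinity: r={r:.2f}, p={p:.3f}")

# Hierarchical (Ward) clustering of sites on standardized parameters.
z = (data - data.mean()) / data.std()
site_clusters = fcluster(linkage(z.values, method="ward"), t=3, criterion="maxclust")
print(dict(zip(data.index, site_clusters)))
```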

  3. Identification of the chelocardin biosynthetic gene cluster from Amycolatopsis sulphurea: a platform for producing novel tetracycline antibiotics.

    Science.gov (United States)

    Lukežič, Tadeja; Lešnik, Urška; Podgoršek, Ajda; Horvat, Jaka; Polak, Tomaž; Šala, Martin; Jenko, Branko; Raspor, Peter; Herron, Paul R; Hunter, Iain S; Petković, Hrvoje

    2013-12-01

    Tetracyclines (TCs) are medically important antibiotics from the polyketide family of natural products. Chelocardin (CHD), produced by Amycolatopsis sulphurea, is a broad-spectrum tetracyclic antibiotic with potent bacteriolytic activity against a number of Gram-positive and Gram-negative multi-resistant pathogens. CHD has an unknown mode of action that is different from TCs. It has some structural features that define it as 'atypical' and, notably, is active against tetracycline-resistant pathogens. Identification and characterization of the chelocardin biosynthetic gene cluster from A. sulphurea revealed 18 putative open reading frames including a type II polyketide synthase. Compared to typical TCs, the chd cluster contains a number of features that relate to its classification as 'atypical': an additional gene for a putative two-component cyclase/aromatase that may be responsible for the different aromatization pattern, a gene for a putative aminotransferase for C-4 with the opposite stereochemistry to TCs and a gene for a putative C-9 methylase that is a unique feature of this biosynthetic cluster within the TCs. Collectively, these enzymes deliver a molecule with different aromatization of ring C that results in an unusual planar structure of the TC backbone. This is a likely contributor to its different mode of action. In addition CHD biosynthesis is primed with acetate, unlike the TCs, which are primed with malonamate, and offers a biosynthetic engineering platform that represents a unique opportunity for efficient generation of novel tetracyclic backbones using combinatorial biosynthesis.

  4. Analysis of Aspects of Innovation in a Brazilian Cluster

    Directory of Open Access Journals (Sweden)

    Adriana Valélia Saraceni

    2012-09-01

    Full Text Available Innovation through clustering has become very important given the increased significance that interaction has for the concepts of innovation and the learning process. This study aims to identify what a case analysis of the innovation process in a cluster reveals about the learning process. The study is therefore developed in two stages. First, we used a preliminary case study examining a cluster innovation analysis and its Innovation Index, and then explored a combined body of theory and practice. The second stage is developed by exploring the learning process concept. Both stages allowed us to build a theoretical model for the development of the learning process in clusters. The main result of the model development is a mechanism for implementing improvements in clusters when case studies are applied.

  5. Effects of Group Size and Lack of Sphericity on the Recovery of Clusters in K-Means Cluster Analysis

    Science.gov (United States)

    de Craen, Saskia; Commandeur, Jacques J. F.; Frank, Laurence E.; Heiser, Willem J.

    2006-01-01

    K-means cluster analysis is known for its tendency to produce spherical and equally sized clusters. To assess the magnitude of these effects, a simulation study was conducted, in which populations were created with varying departures from sphericity and group sizes. An analysis of the recovery of clusters in the samples taken from these…
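    The kind of simulation described above can be sketched by generating elongated, unequally sized groups and checking how well k-means recovers them; the covariance, group sizes and recovery metric below are illustrative choices, not the authors' design.

```python
# Hedged sketch: simulate non-spherical, unequal-sized groups and assess
# k-means cluster recovery with the adjusted Rand index.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

rng = np.random.default_rng(7)
cov_elongated = np.array([[8.0, 0.0], [0.0, 0.2]])   # strong departure from sphericity
big = rng.multivariate_normal([0, 0], cov_elongated, size=300)
small = rng.multivariate_normal([0, 4], cov_elongated, size=50)

X = np.vstack([big, small])
truth = np.array([0] * 300 + [1] * 50)

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print("adjusted Rand:", round(adjusted_rand_score(truth, labels), 2))
# Recovery typically degrades as elongation and size imbalance grow.
```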

  6. Two-Way Regularized Fuzzy Clustering of Multiple Correspondence Analysis.

    Science.gov (United States)

    Kim, Sunmee; Choi, Ji Yeh; Hwang, Heungsun

    2017-01-01

    Multiple correspondence analysis (MCA) is a useful tool for investigating the interrelationships among dummy-coded categorical variables. MCA has been combined with clustering methods to examine whether there exist heterogeneous subclusters of a population, which exhibit cluster-level heterogeneity. These combined approaches aim to classify either observations only (one-way clustering of MCA) or both observations and variable categories (two-way clustering of MCA). The latter approach is favored because its solutions are easier to interpret by providing explicitly which subgroup of observations is associated with which subset of variable categories. Nonetheless, the two-way approach has been built on hard classification that assumes observations and/or variable categories to belong to only one cluster. To relax this assumption, we propose two-way fuzzy clustering of MCA. Specifically, we combine MCA with fuzzy k-means simultaneously to classify a subgroup of observations and a subset of variable categories into a common cluster, while allowing both observations and variable categories to belong partially to multiple clusters. Importantly, we adopt regularized fuzzy k-means, thereby enabling us to decide the degree of fuzziness in cluster memberships automatically. We evaluate the performance of the proposed approach through the analysis of simulated and real data, in comparison with existing two-way clustering approaches.

  7. Phenotypes Determined by Cluster Analysis in Moderate to Severe Bronchial Asthma.

    Science.gov (United States)

    Youroukova, Vania M; Dimitrova, Denitsa G; Valerieva, Anna D; Lesichkova, Spaska S; Velikova, Tsvetelina V; Ivanova-Todorova, Ekaterina I; Tumangelova-Yuzeir, Kalina D

    2017-06-01

    Bronchial asthma is a heterogeneous disease that includes various subtypes. They may share similar clinical characteristics, but probably have different pathological mechanisms. To identify phenotypes using cluster analysis in moderate to severe bronchial asthma and to compare differences in clinical, physiological, immunological and inflammatory data between the clusters. Forty adult patients with moderate to severe bronchial asthma out of exacerbation were included. All underwent clinical assessment, anthropometric measurements, skin prick testing, standard spirometry and measurement of the fraction of exhaled nitric oxide. Blood eosinophil count, serum total IgE and periostin levels were determined. A two-step cluster approach, hierarchical clustering and k-means analysis were used for identification of the clusters. We identified four clusters. Cluster 1 (n=14) - late-onset, non-atopic asthma with impaired lung function; Cluster 2 (n=13) - late-onset, atopic asthma; Cluster 3 (n=6) - late-onset, aspirin-sensitive, eosinophilic asthma; and Cluster 4 (n=7) - early-onset, atopic asthma. Our study is the first in Bulgaria in which cluster analysis is applied to asthmatic patients. We identified four clusters. The variables with the greatest power for differentiation in our study were: age of asthma onset, duration of disease, atopy, smoking, blood eosinophils, nonsteroidal anti-inflammatory drug hypersensitivity, baseline FEV1/FVC and symptom severity. Our results support the concept of heterogeneity of bronchial asthma and demonstrate that cluster analysis can be a useful tool for phenotyping the disease and a personalized approach to the treatment of patients.

  8. Visualization on supercomputing platform level II ASC milestone (3537-1B) results from Sandia.

    Energy Technology Data Exchange (ETDEWEB)

    Geveci, Berk (Kitware, Inc., Clifton Park, NY); Fabian, Nathan; Marion, Patrick (Kitware, Inc., Clifton Park, NY); Moreland, Kenneth D.

    2010-09-01

    This report provides documentation for the completion of the Sandia portion of the ASC Level II Visualization on the platform milestone. This ASC Level II milestone is a joint milestone between Sandia National Laboratories and Los Alamos National Laboratories. This milestone contains functionality required for performing visualization directly on a supercomputing platform, which is necessary for peta-scale visualization. Sandia's contribution concerns in-situ visualization, running a visualization in tandem with a solver. Visualization and analysis of petascale data is limited by several factors which must be addressed as ACES delivers the Cielo platform. Two primary difficulties are: (1) Performance of interactive rendering, which is the most computationally intensive portion of the visualization process. For terascale platforms, commodity clusters with graphics processors (GPUs) have been used for interactive rendering. For petascale platforms, visualization and rendering may be able to run efficiently on the supercomputer platform itself. (2) I/O bandwidth, which limits how much information can be written to disk. If we simply analyze the sparse information that is saved to disk we miss the opportunity to analyze the rich information produced every timestep by the simulation. For the first issue, we are pursuing in-situ analysis, in which simulations are coupled directly with analysis libraries at runtime. This milestone will evaluate the visualization and rendering performance of current and next generation supercomputers in contrast to GPU-based visualization clusters, and evaluate the performance of common analysis libraries coupled with the simulation that analyze and write data to disk during a running simulation. This milestone will explore, evaluate and advance the maturity level of these technologies and their applicability to problems of interest to the ASC program. Scientific simulation on parallel supercomputers is traditionally performed in four

  9. A Comparative Analysis of MOOC (Massive Open Online Course) Platforms

    Directory of Open Access Journals (Sweden)

    Maria CONACHE

    2016-01-01

    Full Text Available The MOOC platforms have seen considerable development in recent years due to the enlargement of the online space and the shift from traditional to virtual activities. These platforms have made it possible for people almost everywhere to take online academic courses offered by top universities, via open access to the web and with unlimited participation. Thus, it came naturally to us to address the question of what makes them so successful. The purpose of this paper is to compare MOOC platforms in terms of features, based on users' involvement and demands. First, we chose four relevant lifelong learning platforms and then we structured three main categories for the platforms' qualification, on which we built our theory regarding the comparison between them. Our analysis consists of three sets of criteria: business model, course design and popularity among online users. Starting from this perspective, we built a range of representative factors for which we highlighted the major aspects of each platform in our comparative research.

  10. Sociomaterial Practices: Challenges in Developing a Virtual Business Community Platform in Agriculture

    Directory of Open Access Journals (Sweden)

    Norberto Hoppen

    2017-07-01

    Full Text Available Virtual business communities (VBC) are virtual networks of people who share common interests, supported by online software platforms that enable the fast exchange of information, collaboration and business interactions. From 2011 to 2015, we conducted a design science research project to create a VBC platform for an agricultural cluster of flower growers in the South of Brazil. The goal of this platform was to help structure and strengthen this cluster by bringing together buyers and sellers while fostering cooperation to boost cluster competitiveness and economic development in the region. However, a number of challenges surfaced during the process, which led to a failure in the VBC platform's diffusion. We adopted a sociomaterial perspective based on the mangle of practice concept (Pickering, 1993) to investigate this failure, by analyzing the key challenges involved in developing a VBC platform in an agricultural context. As its main result, this paper reveals the mangling process during the design and application of the VBC platform and details the different instances of tuning between the participants and the technology. We observed resistance and factors that weakened cooperation and resulted in a lack of governance rules, which are key to the success of a VBC platform.

  11. Cluster analysis of rural, urban, and curbside atmospheric particle size data.

    Science.gov (United States)

    Beddows, David C S; Dall'Osto, Manuel; Harrison, Roy M

    2009-07-01

    Particle size is a key determinant of the hazard posed by airborne particles. Continuous multivariate particle size data have been collected using aerosol particle size spectrometers sited at four locations within the UK: Harwell (Oxfordshire); Regents Park (London); British Telecom Tower (London); and Marylebone Road (London). These data have been analyzed using k-means cluster analysis, deduced to be the preferred cluster analysis technique, selected from four partitional clustering packages, namely: fuzzy, k-means, k-median, and model-based clustering. Using cluster validation indices, k-means clustering was shown to produce clusters with the smallest size, furthest separation, and, importantly, the highest degree of similarity between the elements within each partition. Using k-means clustering, the complexity of the data set is reduced, allowing characterization of the data according to the temporal and spatial trends of the clusters. At Harwell, the rural background measurement site, the cluster analysis showed that the spectra may be differentiated by their modal diameters and average temporal trends, showing either high counts during the day-time or night-time hours. Likewise for the urban sites, the cluster analysis differentiated the spectra into a small number of size distributions according to their modal diameter, the location of the measurement site, and time of day. The responsible aerosol emission, formation, and dynamic processes can be inferred from the cluster characteristics and correlation with concurrently measured meteorological, gas phase, and particle phase measurements.
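
    The choice of k-means over alternative partitional methods rests on cluster validity indices; below is a small sketch of that kind of comparison across candidate cluster counts, assuming scikit-learn and synthetic size spectra in place of the measured distributions.

```python
# Compare candidate partitions of size spectra using cluster validity indices.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score, davies_bouldin_score

rng = np.random.default_rng(1)
spectra = rng.lognormal(mean=0.0, sigma=1.0, size=(500, 32))  # hourly spectra x size bins
spectra /= spectra.sum(axis=1, keepdims=True)                 # normalise each spectrum

for k in (3, 4, 5, 6):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(spectra)
    print(f"k={k}  silhouette={silhouette_score(spectra, labels):.3f}  "
          f"Davies-Bouldin={davies_bouldin_score(spectra, labels):.3f}")
```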

  12. Cluster analysis for determining distribution center location

    Science.gov (United States)

    Lestari Widaningrum, Dyah; Andika, Aditya; Murphiyanto, Richard Dimas Julian

    2017-12-01

    Determination of distribution facility locations is highly important for surviving the high level of competition in today's business world. Companies can operate multiple distribution centers to mitigate supply chain risk, but this raises new problems, namely how many facilities should be provided and where. This study examines a fast-food restaurant brand located in Greater Jakarta. The brand is among the top five fast-food restaurant chains based on retail sales. There were three stages in this study: compiling spatial data, cluster analysis, and network analysis. The cluster analysis results are used to consider the location of an additional distribution center. The network analysis results show a more efficient process, referring to a shorter distance in the distribution process.
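
    As a rough illustration of the clustering stage, the sketch below uses k-means centroids of outlet coordinates as candidate distribution-center locations; the coordinates are random placeholders, not the actual outlets studied.

```python
# Use k-means centroids of outlet coordinates as candidate distribution-center sites.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(2)
outlets = rng.uniform(low=[-6.4, 106.6], high=[-6.0, 107.0], size=(80, 2))  # lat, lon

km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(outlets)
for i, centroid in enumerate(km.cluster_centers_):
    n = int((km.labels_ == i).sum())
    print(f"candidate DC {i}: lat={centroid[0]:.3f}, lon={centroid[1]:.3f}, serves {n} outlets")
```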

  13. Clustering Analysis for Credit Default Probabilities in a Retail Bank Portfolio

    Directory of Open Access Journals (Sweden)

    Elena ANDREI (DRAGOMIR)

    2012-08-01

    Full Text Available Methods underlying cluster analysis are very useful in data analysis, especially when the processed volume of data is very large, so that it becomes impossible to extract essential information unless specific instruments are used to summarize and structure the gross information. In this context, cluster analysis techniques are used particularly for systematic information analysis. The aim of this article is to build a useful model for the banking field, based on data mining techniques, by dividing the groups of borrowers into clusters in order to obtain a profile of the customers (debtors and good payers). We assume that a class is appropriate if it contains members that have a high degree of similarity, and the standard method for measuring the similarity within a group shows the lowest variance. After clustering, data mining techniques are implemented on the cluster with bad debtors, reaching a very high accuracy after implementation. The paper is structured as follows: Section 2 describes the model for data analysis based on a specific scoring model that we proposed. In Section 3, we present a cluster analysis using the K-means algorithm, and the DM models are applied on a specific cluster. Section 4 presents the conclusions.

  14. Cluster Analysis as an Analytical Tool of Population Policy

    Directory of Open Access Journals (Sweden)

    Oksana Mikhaylovna Shubat

    2017-12-01

    Full Text Available The predicted negative trends in Russian demography (falling birth rates, population decline) actualize the need to strengthen measures of family and population policy. Our research purpose is to identify groups of Russian regions with similar characteristics in the family sphere using cluster analysis. The findings should make an important contribution to the field of family policy. We used hierarchical cluster analysis based on the Ward method and the Euclidean distance for the segmentation of Russian regions. Clustering is based on four variables, which allowed assessing the family institution in each region. The authors used data of the Federal State Statistics Service from 2010 to 2015. Clustering and profiling of each segment allowed us to form a typology of Russian regions depending on the features of the family institution in these regions. The authors revealed four clusters grouping regions with similar problems in the family sphere. This segmentation makes it possible to develop the most relevant family policy measures for each group of regions. Thus, the analysis has shown a high degree of differentiation of the family institution across the regions. This suggests that a unified approach to solving population problems is far from effective. To achieve greater results in the implementation of family policy, a differentiated approach is needed. Methods of multidimensional data classification can be successfully applied as a relevant analytical toolkit. Further research could develop the adaptation of multidimensional classification methods to the analysis of population problems in Russian regions. In particular, algorithms of nonparametric cluster analysis may be of relevance in future studies.
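
    A compact sketch of hierarchical clustering with Ward linkage and Euclidean distance, as used above, assuming SciPy; the region names and indicator values are invented for illustration.

```python
# Ward hierarchical clustering of regional indicators, cut into four clusters.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(3)
regions = [f"region_{i}" for i in range(20)]
indicators = rng.normal(size=(20, 4))            # four family-sphere indicators per region

Z = linkage(indicators, method="ward")           # Euclidean distance is implied by Ward
labels = fcluster(Z, t=4, criterion="maxclust")  # cut the dendrogram into four clusters

for cluster_id in sorted(set(labels)):
    members = [r for r, lab in zip(regions, labels) if lab == cluster_id]
    print(cluster_id, members)
```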

  15. Identifying Two Groups of Entitled Individuals: Cluster Analysis Reveals Emotional Stability and Self-Esteem Distinction.

    Science.gov (United States)

    Crowe, Michael L; LoPilato, Alexander C; Campbell, W Keith; Miller, Joshua D

    2016-12-01

    The present study hypothesized that there exist two distinct groups of entitled individuals: grandiose-entitled and vulnerable-entitled. Self-report scores of entitlement were collected for 916 individuals using an online platform. Model-based cluster analyses were conducted on the individuals with scores one standard deviation above the mean (n = 159), using the five-factor model dimensions as clustering variables. The results support the existence of two groups of entitled individuals, categorized as emotionally stable and emotionally vulnerable. The emotionally stable cluster reported emotional stability, high self-esteem, more positive affect, and antisocial behavior. The emotionally vulnerable cluster reported low self-esteem and high levels of neuroticism, disinhibition, conventionality, psychopathy, negative affect, childhood abuse, intrusive parenting, and attachment difficulties. Compared to the control group, both clusters reported being more antagonistic, extraverted, Machiavellian, and narcissistic. These results suggest that important differences are missed when simply examining the linear relationships between entitlement and various aspects of its nomological network.
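
    Model-based clustering of trait scores can be sketched with Gaussian mixtures selected by BIC; the example below assumes scikit-learn and simulated five-factor scores, and the original analysis may have used a different model-based implementation.

```python
# Model-based clustering: fit Gaussian mixtures and keep the lowest-BIC solution.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(4)
ffm_scores = rng.normal(size=(159, 5))   # simulated five-factor-model scores

best_k, best_bic, best_model = None, np.inf, None
for k in range(1, 6):
    gmm = GaussianMixture(n_components=k, covariance_type="full", random_state=0)
    gmm.fit(ffm_scores)
    bic = gmm.bic(ffm_scores)
    if bic < best_bic:
        best_k, best_bic, best_model = k, bic, gmm

labels = best_model.predict(ffm_scores)
print("selected components:", best_k, "cluster sizes:", np.bincount(labels))
```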

  16. Clinical Characteristics of Exacerbation-Prone Adult Asthmatics Identified by Cluster Analysis.

    Science.gov (United States)

    Kim, Mi Ae; Shin, Seung Woo; Park, Jong Sook; Uh, Soo Taek; Chang, Hun Soo; Bae, Da Jeong; Cho, You Sook; Park, Hae Sim; Yoon, Ho Joo; Choi, Byoung Whui; Kim, Yong Hoon; Park, Choon Sik

    2017-11-01

    Asthma is a heterogeneous disease characterized by various types of airway inflammation and obstruction. Therefore, it is classified into several subphenotypes, such as early-onset atopic, obese non-eosinophilic, benign, and eosinophilic asthma, using cluster analysis. A number of asthmatics frequently experience exacerbation over a long-term follow-up period, but the exacerbation-prone subphenotype has rarely been evaluated by cluster analysis. This prompted us to identify clusters reflecting asthma exacerbation. A uniform cluster analysis method was applied to 259 adult asthmatics who were regularly followed-up for over 1 year using 12 variables, selected on the basis of their contribution to asthma phenotypes. After clustering, clinical profiles and exacerbation rates during follow-up were compared among the clusters. Four subphenotypes were identified: cluster 1 was comprised of patients with early-onset atopic asthma with preserved lung function, cluster 2 late-onset non-atopic asthma with impaired lung function, cluster 3 early-onset atopic asthma with severely impaired lung function, and cluster 4 late-onset non-atopic asthma with well-preserved lung function. The patients in clusters 2 and 3 were identified as exacerbation-prone asthmatics, showing a higher risk of asthma exacerbation. Two different phenotypes of exacerbation-prone asthma were identified among Korean asthmatics using cluster analysis; both were characterized by impaired lung function, but the age at asthma onset and atopic status were different between the two. Copyright © 2017 The Korean Academy of Asthma, Allergy and Clinical Immunology · The Korean Academy of Pediatric Allergy and Respiratory Disease

  17. Advances in the development of the Mexican platform for analysis and design of nuclear reactors: AZTLAN Platform

    International Nuclear Information System (INIS)

    Gomez T, A. M.; Puente E, F.; Del Valle G, E.; Francois L, J. L.; Espinosa P, G.

    2017-09-01

    The AZTLAN platform project: development of a Mexican platform for the analysis and design of nuclear reactors, financed by the SENER-CONACYT Energy Sustainability Fund, was approved in early 2014 and formally began at the end of that year. It is a national project led by the Instituto Nacional de Investigaciones Nucleares (ININ), with the collaboration of the Instituto Politecnico Nacional (IPN), the Universidad Autonoma Metropolitana (UAM) and the Universidad Nacional Autonoma de Mexico (UNAM) as part of the development team, and with the participation of the Laguna Verde Nuclear Power Plant, the National Commission of Nuclear Safety and Safeguards, the Ministry of Energy and the Karlsruhe Institute of Technology (KIT, Germany) as part of the user group. The general objective of the project is to modernize, improve and integrate the neutronic, thermo-hydraulic and thermo-mechanical codes developed in Mexican institutions into an integrated platform, developed and maintained by Mexican experts for the benefit of Mexican institutions. Two years into the project, important steps have been taken that have consolidated the platform. The main results of these first two years have been presented in different national and international forums. In this congress, some of the most recent results that have been implemented in the platform codes are shown in more detail. The current status of the platform from a more executive viewpoint is summarized in this paper. (Author)

  18. Automated analysis of organic particles using cluster SIMS

    Energy Technology Data Exchange (ETDEWEB)

    Gillen, Greg; Zeissler, Cindy; Mahoney, Christine; Lindstrom, Abigail; Fletcher, Robert; Chi, Peter; Verkouteren, Jennifer; Bright, David; Lareau, Richard T.; Boldman, Mike

    2004-06-15

    Cluster primary ion bombardment combined with secondary ion imaging is used on an ion microscope secondary ion mass spectrometer for the spatially resolved analysis of organic particles on various surfaces. Compared to the use of monoatomic primary ion beam bombardment, the use of a cluster primary ion beam (SF₅⁺ or C₈⁻) provides significant improvement in molecular ion yields and a reduction in beam-induced degradation of the analyte molecules. These characteristics of cluster bombardment, along with automated sample stage control and custom image analysis software, are utilized to rapidly characterize the spatial distribution of trace explosive particles, narcotics and inkjet-printed microarrays on a variety of surfaces.

  19. Mining the archives: a cross-platform analysis of gene ...

    Science.gov (United States)

    Formalin-fixed paraffin-embedded (FFPE) tissue samples represent a potentially invaluable resource for genomic research into the molecular basis of disease. However, use of FFPE samples in gene expression studies has been limited by technical challenges resulting from degradation of nucleic acids. Here we evaluated gene expression profiles derived from fresh-frozen (FRO) and FFPE mouse liver tissues using two DNA microarray protocols and two whole transcriptome sequencing (RNA-seq) library preparation methodologies. The ribo-depletion protocol outperformed the other three methods by having the highest correlations of differentially expressed genes (DEGs) and best overlap of pathways between FRO and FFPE groups. We next tested the effect of sample time in formalin (18 hours or 3 weeks) on gene expression profiles. Hierarchical clustering of the datasets indicated that test article treatment, and not preservation method, was the main driver of gene expression profiles. Meta- and pathway analyses indicated that biological responses were generally consistent for 18-hour and 3-week FFPE samples compared to FRO samples. However, clear erosion of signal intensity with time in formalin was evident, and DEG numbers differed by platform and preservation method. Lastly, we investigated the effect of age in FFPE block on genomic profiles. RNA-seq analysis of 8-, 19-, and 26-year-old control blocks using the ribo-depletion protocol resulted in comparable quality metrics, inc

  20. Ontology-Based Platform for Conceptual Guided Dataset Analysis

    KAUST Repository

    Rodriguez-Garcia, Miguel Angel

    2016-05-31

    Nowadays organizations must handle a huge amount of both internal and external data from structured, semi-structured, and unstructured sources. This constitutes a major challenge (and also an opportunity) for current Business Intelligence solutions. The complexity and effort required to analyse such a plethora of data implies considerable execution times. Besides, the large number of data analysis methods and techniques impedes domain experts (laymen from an IT-assisted analytics perspective) from fully exploiting their potential, while technology experts lack the business background to ask the proper questions. In this work, we present a semantically-boosted platform for assisting layman users in (i) extracting a relevant subdataset from all the data, and (ii) selecting the data analysis technique(s) best suited for scrutinising that subdataset. The outcome is getting better answers in significantly less time. The platform has been evaluated in the music domain with promising results.

  1. Network Analysis Tools: from biological networks to clusters and pathways.

    Science.gov (United States)

    Brohée, Sylvain; Faust, Karoline; Lima-Mendez, Gipsi; Vanderstocken, Gilles; van Helden, Jacques

    2008-01-01

    Network Analysis Tools (NeAT) is a suite of computer tools that integrate various algorithms for the analysis of biological networks: comparison between graphs, between clusters, or between graphs and clusters; network randomization; analysis of degree distribution; network-based clustering and path finding. The tools are interconnected to enable a stepwise analysis of the network through a complete analytical workflow. In this protocol, we present a typical case of utilization, where the tasks above are combined to decipher a protein-protein interaction network retrieved from the STRING database. The results returned by NeAT are typically subnetworks, networks enriched with additional information (i.e., clusters or paths) or tables displaying statistics. Typical networks comprising several thousands of nodes and arcs can be analyzed within a few minutes. The complete protocol can be read and executed in approximately 1 h.

  2. Performance analysis of clustering techniques over microarray data: A case study

    Science.gov (United States)

    Dash, Rasmita; Misra, Bijan Bihari

    2018-03-01

    Handling big data is one of the major issues in the field of statistical data analysis. In such investigations, cluster analysis plays a vital role in dealing with large-scale data. There are many clustering techniques with different cluster analysis approaches, but which approach suits a particular dataset is difficult to predict. To deal with this problem, a grading approach is introduced over many clustering techniques to identify a stable technique. Because the grading approach depends on the characteristics of the dataset as well as on the validity indices, a two-stage grading approach is implemented. In this study the grading approach is applied to five clustering techniques: hybrid swarm-based clustering (HSC), k-means, partitioning around medoids (PAM), vector quantization (VQ) and agglomerative nesting (AGNES). The experimentation is conducted over five microarray datasets with seven validity indices. The finding of the grading approach that a particular clustering technique is significantly better is also confirmed by the Nemenyi post-hoc test.
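
    A reduced sketch of grading clustering techniques with validity indices, assuming scikit-learn; HSC is omitted, and a synthetic dataset stands in for the microarray data.

```python
# Grade several clustering techniques on one dataset with multiple validity indices.
from sklearn.cluster import KMeans, AgglomerativeClustering
from sklearn.datasets import make_blobs
from sklearn.metrics import (silhouette_score, calinski_harabasz_score,
                             davies_bouldin_score)

X, _ = make_blobs(n_samples=300, n_features=20, centers=4, random_state=0)

methods = {
    "k-means": KMeans(n_clusters=4, n_init=10, random_state=0),
    "agglomerative (AGNES-like)": AgglomerativeClustering(n_clusters=4),
}
for name, model in methods.items():
    labels = model.fit_predict(X)
    print(f"{name:28s} silhouette={silhouette_score(X, labels):.3f} "
          f"CH={calinski_harabasz_score(X, labels):.1f} "
          f"DB={davies_bouldin_score(X, labels):.3f}")
# Rankings over several datasets could then be compared with a Friedman test
# followed by a Nemenyi post-hoc test, as done in the study.
```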

  3. State of the art of parallel scientific visualization applications on PC clusters

    International Nuclear Information System (INIS)

    Juliachs, M.

    2004-01-01

    In this state of the art of parallel scientific visualization applications on PC clusters, we deal with both surface and volume rendering approaches. We first analyze available PC cluster configurations and existing parallel rendering software components for parallel graphics rendering. CEA/DIF has been studying cluster visualization since 2001, and this report is part of a study to set up a new visualization research platform. This platform, consisting of an eight-node PC cluster under Linux and a tiled display, was installed in collaboration with Versailles-Saint-Quentin University in August 2003. (author)

  4. Cluster analysis of typhoid cases in Kota Bharu, Kelantan, Malaysia

    Directory of Open Access Journals (Sweden)

    Nazarudin Safian

    2008-09-01

    Full Text Available Typhoid fever is still a major public health problem globally as well as in Malaysia. This study was done to identify the spatial epidemiology of typhoid fever in the Kota Bharu District of Malaysia as a first step towards developing more advanced analysis of the whole country. The main characteristic of the epidemiological pattern that interested us was whether typhoid cases occurred in clusters or were evenly distributed throughout the area, and at what spatial distances they clustered. All confirmed typhoid cases reported to the Kota Bharu District Health Department from the year 2001 to June 2005 were taken as the sample. From the home address of each case, the location of the house was traced and its coordinates were recorded using handheld GPS devices. Spatial statistical analysis was done using CrimeStat III software to determine whether the distribution of typhoid cases was clustered, random or dispersed, and subsequently at what distances clustering occurred. Among the 736 cases involved in the study there was significant clustering for cases occurring in the years 2001, 2002, 2003 and 2005; there was no significant clustering in 2004. Typhoid clustering also occurred strongly for distances up to 6 km. This study shows that typhoid cases occur in clusters, and this method could be applicable for describing the spatial epidemiology of a specific area. (Med J Indones 2008; 17: 175-82). Keywords: typhoid, clustering, spatial epidemiology, GIS

  5. Redefining the Breast Cancer Exosome Proteome by Tandem Mass Tag Quantitative Proteomics and Multivariate Cluster Analysis.

    Science.gov (United States)

    Clark, David J; Fondrie, William E; Liao, Zhongping; Hanson, Phyllis I; Fulton, Amy; Mao, Li; Yang, Austin J

    2015-10-20

    Exosomes are microvesicles of endocytic origin constitutively released by multiple cell types into the extracellular environment. With evidence that exosomes can be detected in the blood of patients with various malignancies, the development of a platform that uses exosomes as a diagnostic tool has been proposed. However, it has been difficult to truly define the exosome proteome due to the challenge of discerning contaminant proteins that may be identified via mass spectrometry using various exosome enrichment strategies. To better define the exosome proteome in breast cancer, we combined a tandem mass tag (TMT) quantitative proteomics approach with support vector machine (SVM) cluster analysis of three conditioned-media-derived fractions corresponding to a 10,000g cellular debris pellet, a 100,000g crude exosome pellet, and an OptiPrep-enriched exosome pellet. The quantitative analysis identified 2,179 proteins in all three fractions, with known exosomal cargo proteins displaying at least a 2-fold enrichment in the exosome fraction based on the TMT protein ratios. Employing SVM cluster analysis allowed for the classification of 251 proteins as "true" exosomal cargo proteins. This study provides a robust and rigorous framework for the future development of exosomes as a potential multiprotein marker phenotyping tool that could be useful in breast cancer diagnosis and monitoring of disease progression.

  6. Cluster analysis of Southeastern U.S. climate stations

    Science.gov (United States)

    Stooksbury, D. E.; Michaels, P. J.

    1991-09-01

    A two-step cluster analysis of 449 Southeastern climate stations is used to objectively determine general climate clusters (groups of climate stations) for eight southeastern states. The purpose is objectively to define regions of climatic homogeneity that should perform more robustly in subsequent climatic impact models. This type of analysis has been successfully used in many related climate research problems including the determination of corn/climate districts in Iowa (Ortiz-Valdez, 1985) and the classification of synoptic climate types (Davis, 1988). These general climate clusters may be more appropriate for climate research than the standard climate divisions (CD) groupings of climate stations, which are modifications of the agro-economic United States Department of Agriculture crop reporting districts. Unlike the CD's, these objectively determined climate clusters are not restricted by state borders and thus have reduced multicollinearity which makes them more appropriate for the study of the impact of climate and climatic change.

  7. Cluster analysis by optimal decomposition of induced fuzzy sets

    Energy Technology Data Exchange (ETDEWEB)

    Backer, E

    1978-01-01

    Nonsupervised pattern recognition is addressed and the concept of fuzzy sets is explored in order to provide the investigator (data analyst) additional information supplied by the pattern class membership values apart from the classical pattern class assignments. The basic ideas behind the pattern recognition problem, the clustering problem, and the concept of fuzzy sets in cluster analysis are discussed, and a brief review of the literature of the fuzzy cluster analysis is given. Some mathematical aspects of fuzzy set theory are briefly discussed; in particular, a measure of fuzziness is suggested. The optimization-clustering problem is characterized. Then the fundamental idea behind affinity decomposition is considered. Next, further analysis takes place with respect to the partitioning-characterization functions. The iterative optimization procedure is then addressed. The reclassification function is investigated and convergence properties are examined. Finally, several experiments in support of the method suggested are described. Four object data sets serve as appropriate test cases. 120 references, 70 figures, 11 tables. (RWR)

  8. Graph analysis of cell clusters forming vascular networks

    Science.gov (United States)

    Alves, A. P.; Mesquita, O. N.; Gómez-Gardeñes, J.; Agero, U.

    2018-03-01

    This manuscript describes the experimental observation of vasculogenesis in chick embryos by means of network analysis. The formation of the vascular network was observed in the area opaca of embryos from 40 to 55 h of development. In the area opaca, endothelial cell clusters self-organize into a primitive and approximately regular network of capillaries. The process was observed by bright-field microscopy in control embryos and in embryos treated with Bevacizumab (Avastin), an antibody that inhibits the signalling of the vascular endothelial growth factor (VEGF). The sequences of images of vascular growth were thresholded and used to quantify the forming network in control and Avastin-treated embryos. This characterization is made by measuring vessel density, the number of cell clusters and the largest cluster density. From the original images, the topology of the vascular network was extracted and characterized by means of the usual network metrics, such as the degree distribution, average clustering coefficient, average shortest path length and assortativity, among others. This analysis allows one to monitor how the largest connected cluster of the vascular network evolves in time and provides quantitative evidence of the disruptive effects that Avastin has on the tree structure of vascular networks.
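
    The network metrics listed above can be computed with networkx; the sketch below uses a random graph as a stand-in for the extracted capillary network.

```python
# Characterise a (placeholder) vessel graph with standard network metrics.
import networkx as nx

G = nx.erdos_renyi_graph(n=200, p=0.03, seed=0)   # stand-in for the extracted network

degrees = [d for _, d in G.degree()]
print("mean degree:", sum(degrees) / len(degrees))
print("average clustering coefficient:", nx.average_clustering(G))
print("degree assortativity:", nx.degree_assortativity_coefficient(G))

# track the largest connected cluster, as done for the growing vascular network
largest_cc = max(nx.connected_components(G), key=len)
print("largest component density:", len(largest_cc) / G.number_of_nodes())
sub = G.subgraph(largest_cc)
print("average shortest path length (largest component):",
      nx.average_shortest_path_length(sub))
```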

  9. application of single-linkage clustering method in the analysis of ...

    African Journals Online (AJOL)

    Admin

    ANALYSIS OF GROWTH RATE OF GROSS DOMESTIC PRODUCT (GDP) AT ... The end result of the algorithm is a tree of clusters called a dendrogram, which shows how the clusters are ... Number of cluster sum from observations of ...

  10. A Flocking Based algorithm for Document Clustering Analysis

    Energy Technology Data Exchange (ETDEWEB)

    Cui, Xiaohui [ORNL; Gao, Jinzhu [ORNL; Potok, Thomas E [ORNL

    2006-01-01

    Social animals or insects in nature often exhibit a form of emergent collective behavior known as flocking. In this paper, we present a novel flocking-based approach for document clustering analysis. Our flocking clustering algorithm uses stochastic and heuristic principles discovered from observing bird flocks and fish schools. Unlike other partitional clustering algorithms such as K-means, the flocking-based algorithm does not require initial partition seeds. The algorithm generates a clustering of a given set of data through the embedding of the high-dimensional data items on a two-dimensional grid, for easy retrieval and visualization of the clustering results. Inspired by the self-organized behavior of bird flocks, we represent each document object with a flock boid. The simple local rules followed by each flock boid result in the entire document flock generating complex global behaviors, which eventually result in a clustering of the documents. We evaluate the efficiency of our algorithm with both a synthetic dataset and a real document collection that includes 100 news articles collected from the Internet. Our results show that the flocking clustering algorithm achieves better performance than the K-means and Ant clustering algorithms for real document clustering.

  11. MEGA X: Molecular Evolutionary Genetics Analysis across Computing Platforms.

    Science.gov (United States)

    Kumar, Sudhir; Stecher, Glen; Li, Michael; Knyaz, Christina; Tamura, Koichiro

    2018-06-01

    The Molecular Evolutionary Genetics Analysis (Mega) software implements many analytical methods and tools for phylogenomics and phylomedicine. Here, we report a transformation of Mega to enable cross-platform use on Microsoft Windows and Linux operating systems. Mega X does not require virtualization or emulation software and provides a uniform user experience across platforms. Mega X has additionally been upgraded to use multiple computing cores for many molecular evolutionary analyses. Mega X is available in two interfaces (graphical and command line) and can be downloaded from www.megasoftware.net free of charge.

  12. Topic modeling for cluster analysis of large biological and medical datasets.

    Science.gov (United States)

    Zhao, Weizhong; Zou, Wen; Chen, James J

    2014-01-01

    The big data moniker is nowhere better deserved than to describe the ever-increasing prodigiousness and complexity of biological and medical datasets. New methods are needed to generate and test hypotheses, foster biological interpretation, and build validated predictors. Although multivariate techniques such as cluster analysis may allow researchers to identify groups, or clusters, of related variables, the accuracy and effectiveness of traditional clustering methods diminish for large and high-dimensional datasets. Topic modeling is an active research field in machine learning and has been mainly used as an analytical tool to structure large textual corpora for data mining. Its ability to reduce high dimensionality to a small number of latent variables makes it suitable as a means for clustering or for overcoming clustering difficulties in large biological and medical datasets. In this study, three topic-model-derived clustering methods, highest probable topic assignment, feature selection and feature extraction, are proposed and tested on the cluster analysis of three large datasets: a Salmonella pulsed-field gel electrophoresis (PFGE) dataset, a lung cancer dataset, and a breast cancer dataset, which represent various types of large biological or medical datasets. All three methods are shown to improve the effectiveness of clustering results on the three datasets in comparison to traditional methods. A preferable cluster analysis method emerged for each of the three datasets on the basis of replicating known biological truths. Topic modeling could be advantageously applied to the large datasets of biological or medical research. The three proposed topic-model-derived clustering methods yield clustering improvements for the three different data types. Clusters more efficaciously represent truthful groupings and subgroupings in the data than traditional methods, suggesting
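
    The "highest probable topic assignment" strategy can be sketched with scikit-learn's LDA implementation: fit a topic model and use each sample's most probable topic as its cluster label. The count matrix below is random, standing in for the biological feature matrices used in the study.

```python
# Highest-probable-topic clustering: LDA topic proportions, argmax as cluster label.
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation

rng = np.random.default_rng(5)
counts = rng.poisson(lam=1.0, size=(200, 50))     # samples x discrete features

lda = LatentDirichletAllocation(n_components=5, random_state=0)
doc_topic = lda.fit_transform(counts)             # per-sample topic proportions
clusters = doc_topic.argmax(axis=1)               # most probable topic = cluster label
print("cluster sizes:", np.bincount(clusters, minlength=5))
```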

  13. Fatigue Reliability Analysis of a Mono-Tower Platform

    DEFF Research Database (Denmark)

    Kirkegaard, Poul Henning; Sørensen, John Dalsgaard; Brincker, Rune

    1991-01-01

    In this paper, a fatigue reliability analysis of a Mono-tower platform is presented. The failure mode, fatigue failure in the butt welds, is investigated with two different models: one with the fatigue strength expressed through SN relations, the other with the fatigue strength expressed through linear-elastic fracture mechanics (LEFM). In determining the cumulative fatigue damage, Palmgren-Miner's rule is applied. Element reliability, as well as systems reliability, is estimated using first-order reliability methods (FORM). The sensitivity of the systems reliability to various parameters, including the natural period, damping ratio, current, stress spectrum and parameters describing the fatigue strength, is examined. Further, soil damping is shown to be significant for the Mono-tower.
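
    For reference, the Palmgren-Miner rule applied above accumulates fatigue damage linearly over stress ranges:

```latex
% Palmgren-Miner linear damage accumulation:
% n_i = number of stress cycles experienced in stress range i,
% N_i = number of cycles to failure at that range (from the SN curve).
\[
  D \;=\; \sum_{i} \frac{n_i}{N_i}, \qquad \text{failure is predicted when } D \ge 1 .
\]
```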

  14. CLUSTER ANALYSIS OF UKRAINIAN REGIONAL DISTRIBUTION BY LEVEL OF INNOVATION

    Directory of Open Access Journals (Sweden)

    Roman Shchur

    2016-07-01

    Full Text Available A SWOT analysis of the threats and benefits of the innovation development strategy of the Ivano-Frankivsk region in the context of financial support was conducted. A methodical approach to determining the potential of public-private partnerships, as a tool for financing innovative economic development, was identified. A cluster analysis of the possibilities of forming public-private partnerships in a particular region was carried out. An optimal set of problem areas that require urgent solutions and financial security is defined on the basis of the cluster approach. This will help to form practical recommendations for the creation of an effective financial mechanism in the regions of Ukraine. Key words: the mechanism of financial provision for innovation development, innovation development, public-private partnerships, cluster analysis, innovative development strategy.

  15. Multiscale visual quality assessment for cluster analysis with self-organizing maps

    Science.gov (United States)

    Bernard, Jürgen; von Landesberger, Tatiana; Bremm, Sebastian; Schreck, Tobias

    2011-01-01

    Cluster analysis is an important data mining technique for analyzing large amounts of data, reducing many objects to a limited number of clusters. Cluster visualization techniques aim at supporting the user in better understanding the characteristics and relationships among the found clusters. While promising approaches to visual cluster analysis already exist, these usually fall short of incorporating the quality of the obtained clustering results. However, due to the nature of the clustering process, quality is an important aspect, as for most practical data sets many different clusterings are typically possible. Being aware of clustering quality is important to judge the expressiveness of a given cluster visualization, or to adjust the clustering process with refined parameters, among others. In this work, we present an encompassing suite of visual tools for quality assessment of an important visual clustering algorithm, namely, the Self-Organizing Map (SOM) technique. We define, measure, and visualize the notion of SOM cluster quality along a hierarchy of cluster abstractions. The quality abstractions range from simple scalar-valued quality scores up to the structural comparison of a given SOM clustering with the output of additional supportive clustering methods. The suite of methods allows the user to assess the SOM quality on the appropriate abstraction level, and arrive at improved clustering results. We implement our tools in an integrated system, apply it to experimental data sets, and show its applicability.
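
    A minimal sketch of scalar SOM quality scores of the kind such a suite would visualize, assuming the third-party minisom package is available; the data and map size are placeholders, and the original work may compute different measures.

```python
# Train a small SOM and report simple quality indicators.
import numpy as np
from minisom import MiniSom  # assumed third-party dependency

rng = np.random.default_rng(6)
data = rng.normal(size=(500, 8))

som = MiniSom(x=10, y=10, input_len=8, sigma=1.0, learning_rate=0.5, random_seed=0)
som.train_random(data, num_iteration=5000)

# scalar quality score: average distance between samples and their best-matching unit
print("quantization error:", som.quantization_error(data))

# per-node hit counts, a first step toward per-cluster quality views
hits = np.zeros((10, 10), dtype=int)
for sample in data:
    i, j = som.winner(sample)
    hits[i, j] += 1
print("occupied map nodes:", int((hits > 0).sum()), "of 100")
```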

  16. Validation of tumor protein marker quantification by two independent automated immunofluorescence image analysis platforms

    Science.gov (United States)

    Peck, Amy R; Girondo, Melanie A; Liu, Chengbao; Kovatich, Albert J; Hooke, Jeffrey A; Shriver, Craig D; Hu, Hai; Mitchell, Edith P; Freydin, Boris; Hyslop, Terry; Chervoneva, Inna; Rui, Hallgeir

    2016-01-01

    Protein marker levels in formalin-fixed, paraffin-embedded tissue sections traditionally have been assayed by chromogenic immunohistochemistry and evaluated visually by pathologists. Pathologist scoring of chromogen staining intensity is subjective and generates low-resolution ordinal or nominal data rather than continuous data. Emerging digital pathology platforms now allow quantification of chromogen or fluorescence signals by computer-assisted image analysis, providing continuous immunohistochemistry values. Fluorescence immunohistochemistry offers greater dynamic signal range than chromogen immunohistochemistry, and combined with image analysis holds the promise of enhanced sensitivity and analytic resolution, and consequently more robust quantification. However, commercial fluorescence scanners and image analysis software differ in features and capabilities, and claims of objective quantitative immunohistochemistry are difficult to validate as pathologist scoring is subjective and there is no accepted gold standard. Here we provide the first side-by-side validation of two technologically distinct commercial fluorescence immunohistochemistry analysis platforms. We document highly consistent results by (1) concordance analysis of fluorescence immunohistochemistry values and (2) agreement in outcome predictions both for objective, data-driven cutpoint dichotomization with Kaplan–Meier analyses or employment of continuous marker values to compute receiver-operating curves. The two platforms examined rely on distinct fluorescence immunohistochemistry imaging hardware, microscopy vs line scanning, and functionally distinct image analysis software. Fluorescence immunohistochemistry values for nuclear-localized and tyrosine-phosphorylated Stat5a/b computed by each platform on a cohort of 323 breast cancer cases revealed high concordance after linear calibration, a finding confirmed on an independent 382 case cohort, with concordance correlation coefficients >0

  17. The design and verification of probabilistic safety analysis platform NFRisk

    International Nuclear Information System (INIS)

    Hu Wenjun; Song Wei; Ren Lixia; Qian Hongtao

    2010-01-01

    To increase the technical capability in the Probabilistic Safety Analysis (PSA) field in China, it is necessary and important to study and develop an indigenous professional PSA platform. Following the principle of 'from structure simplification to modularization to generation of cut sets to minimal cut sets', the algorithms, including the simplification algorithm, the modularization algorithm, the algorithm for converting a fault tree to a binary decision diagram (BDD), the cut-set solving algorithm, the minimal cut-set algorithm, and so on, were designed and developed independently; the design of the data management and operation platform was completed in-house; and the verification and validation of the NFRisk platform on 3 typical fault trees was carried out independently. (authors)

  18. Cluster Analysis of Clinical Data Identifies Fibromyalgia Subgroups

    Science.gov (United States)

    Docampo, Elisa; Collado, Antonio; Escaramís, Geòrgia; Carbonell, Jordi; Rivera, Javier; Vidal, Javier; Alegre, José

    2013-01-01

    Introduction Fibromyalgia (FM) is mainly characterized by widespread pain and multiple accompanying symptoms, which hinder FM assessment and management. In order to reduce FM heterogeneity we classified clinical data into simplified dimensions that were used to define FM subgroups. Material and Methods 48 variables were evaluated in 1,446 Spanish FM cases fulfilling 1990 ACR FM criteria. A partitioning analysis was performed to find groups of variables similar to each other. Similarities between variables were identified and the variables were grouped into dimensions. This was performed in a subset of 559 patients, and cross-validated in the remaining 887 patients. For each sample and dimension, a composite index was obtained based on the weights of the variables included in the dimension. Finally, a clustering procedure was applied to the indexes, resulting in FM subgroups. Results Variables clustered into three independent dimensions: “symptomatology”, “comorbidities” and “clinical scales”. Only the two first dimensions were considered for the construction of FM subgroups. Resulting scores classified FM samples into three subgroups: low symptomatology and comorbidities (Cluster 1), high symptomatology and comorbidities (Cluster 2), and high symptomatology but low comorbidities (Cluster 3), showing differences in measures of disease severity. Conclusions We have identified three subgroups of FM samples in a large cohort of FM by clustering clinical data. Our analysis stresses the importance of family and personal history of FM comorbidities. Also, the resulting patient clusters could indicate different forms of the disease, relevant to future research, and might have an impact on clinical assessment. PMID:24098674

  19. Cluster analysis of clinical data identifies fibromyalgia subgroups.

    Directory of Open Access Journals (Sweden)

    Elisa Docampo

    Full Text Available INTRODUCTION: Fibromyalgia (FM) is mainly characterized by widespread pain and multiple accompanying symptoms, which hinder FM assessment and management. In order to reduce FM heterogeneity we classified clinical data into simplified dimensions that were used to define FM subgroups. MATERIAL AND METHODS: 48 variables were evaluated in 1,446 Spanish FM cases fulfilling 1990 ACR FM criteria. A partitioning analysis was performed to find groups of variables similar to each other. Similarities between variables were identified and the variables were grouped into dimensions. This was performed in a subset of 559 patients, and cross-validated in the remaining 887 patients. For each sample and dimension, a composite index was obtained based on the weights of the variables included in the dimension. Finally, a clustering procedure was applied to the indexes, resulting in FM subgroups. RESULTS: Variables clustered into three independent dimensions: "symptomatology", "comorbidities" and "clinical scales". Only the first two dimensions were considered for the construction of FM subgroups. The resulting scores classified FM samples into three subgroups: low symptomatology and comorbidities (Cluster 1), high symptomatology and comorbidities (Cluster 2), and high symptomatology but low comorbidities (Cluster 3), showing differences in measures of disease severity. CONCLUSIONS: We have identified three subgroups of FM samples in a large cohort of FM by clustering clinical data. Our analysis stresses the importance of family and personal history of FM comorbidities. Also, the resulting patient clusters could indicate different forms of the disease, relevant to future research, and might have an impact on clinical assessment.

  20. Microneedle Platforms for Cell Analysis

    KAUST Repository

    Kavaldzhiev, Mincho

    2017-11-01

    Micro-needle platforms are the core components of many recent drug delivery and gene-editing techniques, which allow for intracellular access, controlled cell membrane stress or mechanical trapping of the nucleus. This dissertation work is devoted to the development of micro-needle platforms that offer customized fabrication and new capabilities for enhanced cell analyses. The highest degree of geometrical flexibility is achieved with 3D-printed micro-needles, which enable optimizing the topographical stress environment for cells and cell populations of any size. A fabrication process for 3D-printed micro-needles has been developed, as well as a metal coating technique based on standard sputter deposition, which extends the functionalities of the platforms with electrical as well as magnetic features. The micro-needles have been tested on human colon cancer cells (HCT116), showing a high degree of biocompatibility of the platform. Moreover, the capabilities of the 3D-printed micro-needles have been explored for drug delivery via the well-established electroporation technique, by coating the micro-needles with gold. Antibodies and fluorescent dyes have been delivered to HCT116 cells and human embryonic kidney cells with a very high transfection rate of up to 90%. In addition, the 3D-printed electroporation platform enables delivery of molecules to suspended or adherent cells, with or without electroporation buffer solution, and at ultra-low voltages of 2 V. In order to provide a micro-needle platform that exploits existing methods for mass fabrication, a custom-designed template-based process has been developed. It has been used for the production of gold, iron, nickel and polypyrrole micro-needles on silicon and glass substrates. A novel delivery method is introduced that activates the micro-needles by electromagnetic induction, which enables intracellular access to be gained wirelessly. The method has been successfully tested on HCT116 cells in culture, where a time

  1. jClustering, an open framework for the development of 4D clustering algorithms.

    Directory of Open Access Journals (Sweden)

    José María Mateos-Pérez

    Full Text Available We present jClustering, an open framework for the design of clustering algorithms in dynamic medical imaging. We developed this tool because of the difficulty involved in manually segmenting dynamic PET images and the lack of availability of source code for published segmentation algorithms. Providing an easily extensible open tool encourages publication of source code to facilitate the process of comparing algorithms and provide interested third parties with the opportunity to review code. The internal structure of the framework allows an external developer to implement new algorithms easily and quickly, focusing only on the particulars of the method being implemented and not on image data handling and preprocessing. This tool has been coded in Java and is presented as an ImageJ plugin in order to take advantage of all the functionalities offered by this imaging analysis platform. Both binary packages and source code have been published, the latter under a free software license (GNU General Public License to allow modification if necessary.

  2. Development of small scale cluster computer for numerical analysis

    Science.gov (United States)

    Zulkifli, N. H. N.; Sapit, A.; Mohammed, A. N.

    2017-09-01

    In this study, two units of personal computer were successfully networked together to form a small-scale cluster. Each of the processors involved is a multicore processor with four cores, so the cluster has eight processor cores in total. The cluster runs the Ubuntu 14.04 LINUX environment with an MPI implementation (MPICH2). Two main tests were conducted on the cluster: a communication test and a performance test. The communication test was done to make sure that the computers were able to pass the required information without any problem, using a simple MPI Hello World program written in C. Additionally, a performance test was done to show that the cluster's computational performance is much better than that of a single-CPU computer. In this performance test, the same code was run four times, using a single node, 2 processors, 4 processors, and 8 processors. The results show that with additional processors, the time required to solve the problem decreases; the time required for the calculation is roughly halved when the number of processors is doubled. To conclude, we successfully developed a small-scale cluster computer using common hardware, capable of higher computing power than a single-CPU machine, which can be beneficial for research that requires high computing power, especially numerical analysis such as finite element analysis, computational fluid dynamics, and computational physics analysis.
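
    The study's test program was written in C; an equivalent sketch using mpi4py is shown below, assuming an MPI runtime (e.g. MPICH) and mpi4py are installed. The file name in the run command is illustrative.

```python
# Communication test (hello) plus a toy performance test (parallel reduction).
# Run with: mpiexec -n 8 python hello_sum.py
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

# communication test: every process reports in
print(f"Hello from rank {rank} of {size}")

# toy performance test: each process sums a slice of a large range,
# partial results are combined with a reduction on rank 0
n = 10_000_000
chunk = np.arange(rank, n, size, dtype=np.float64)
local_sum = chunk.sum()
total = comm.reduce(local_sum, op=MPI.SUM, root=0)
if rank == 0:
    print("total:", total)
```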

  3. Systems Biology Modeling of the Radiation Sensitivity Network: A Biomarker Discovery Platform

    International Nuclear Information System (INIS)

    Eschrich, Steven; Zhang Hongling; Zhao Haiyan; Boulware, David; Lee, Ji-Hyun; Bloom, Gregory; Torres-Roca, Javier F.

    2009-01-01

    Purpose: The discovery of effective biomarkers is a fundamental goal of molecular medicine. Developing a systems-biology understanding of radiosensitivity can enhance our ability to identify radiation-specific biomarkers. Methods and Materials: Radiosensitivity, as represented by the survival fraction at 2 Gy, was modeled in 48 human cancer cell lines. We applied a linear regression algorithm that integrates gene expression with biological variables, including ras status (mut/wt), tissue of origin and p53 status (mut/wt). Results: The biomarker discovery platform is a network representation of the top 500 genes identified by linear regression analysis. This network was reduced to a 10-hub network that includes c-Jun, HDAC1, RELA (p65 subunit of NFKB), PKC-beta, SUMO-1, c-Abl, STAT1, AR, CDK1, and IRF1. Nine targets associated with radiosensitization drugs are linked to the network, demonstrating clinical relevance. Furthermore, the model identified four significant radiosensitivity clusters of terms and genes. Ras was a dominant variable in the analysis, as was the tissue of origin and their interaction with gene expression, but not p53. Overrepresented biological pathways differed between clusters but included DNA repair, cell cycle, apoptosis, and metabolism. The c-Jun network hub was validated using a knockdown approach in 8 human cell lines representing lung, colon, and breast cancers. Conclusion: We have developed a novel radiation-biomarker discovery platform using a systems biology modeling approach. We believe this platform will play a central role in the integration of biology into clinical radiation oncology practice.

  4. Using Cluster Analysis for Data Mining in Educational Technology Research

    Science.gov (United States)

    Antonenko, Pavlo D.; Toy, Serkan; Niederhauser, Dale S.

    2012-01-01

    Cluster analysis is a group of statistical methods that has great potential for analyzing the vast amounts of web server-log data to understand student learning from hyperlinked information resources. In this methodological paper we provide an introduction to cluster analysis for educational technology researchers and illustrate its use through…

  5. [Typologies of Madrid's citizens (Spain) at the end-of-life: cluster analysis].

    Science.gov (United States)

    Ortiz-Gonçalves, Belén; Perea-Pérez, Bernardo; Labajo González, Elena; Albarrán Juan, Elena; Santiago-Sáez, Andrés

    2018-03-06

    The aim was to establish typologies among Madrid's citizens (Spain) with regard to end-of-life using cluster analysis. The SPAD 8 program was applied to a sample from a health care centre in the autonomous region of Madrid (Spain). A multiple correspondence analysis technique was used, followed by a cluster analysis to create a dendrogram. A cross-sectional study was made beforehand with the results of the questionnaire. Five clusters stood out. Cluster 1: a group who preferred not to answer numerous questions (5%). Cluster 2: in favour of receiving palliative care and euthanasia (40%). Cluster 3: would oppose assisted suicide and would not ask for spiritual assistance (15%). Cluster 4: would like to receive palliative care and assisted suicide (16%). Cluster 5: would oppose assisted suicide and would ask for spiritual assistance (24%). Four of these clusters stood out in the comparison. Clusters 2 and 4 would like to receive palliative care, euthanasia (Cluster 2) and assisted suicide (Cluster 4). Clusters 4 and 5 regularly practiced their faith and their family members did not receive palliative care. Clusters 3 and 5 would be opposed to euthanasia and assisted suicide in particular. Clusters 2, 4 and 5 had not completed an advance directive document. Clusters 2 and 3 seldom practiced their faith. This study could be taken into consideration to improve the quality of end-of-life care choices. Copyright © 2017 SESPAS. Published by Elsevier España, S.L.U. All rights reserved.

  6. How to detect trap cluster systems?

    International Nuclear Information System (INIS)

    Mandowski, Arkadiusz

    2008-01-01

    Spatially correlated traps and recombination centres (trap-recombination centre pairs and larger clusters) are responsible for many anomalous phenomena that are difficult to explain in the framework of both classical models, i.e. model of localized transitions (LT) and the simple trap model (STM), even with a number of discrete energy levels. However, these 'anomalous' effects may provide a good platform for identifying trap cluster systems. This paper considers selected cluster-type effects, mainly relating to an anomalous dependence of TL on absorbed dose in the system of isolated clusters (ICs). Some consequences for interacting cluster (IAC) systems, involving both localized and delocalized transitions occurring simultaneously, are also discussed

  7. ASAP: An Extensible Platform for State Space Analysis

    DEFF Research Database (Denmark)

    Westergaard, Michael; Evangelista, Sami; Kristensen, Lars Michael

    2009-01-01

    The ASCoVeCo State space Analysis Platform (ASAP) is a tool for performing explicit state space analysis of coloured Petri nets (CPNs) and other formalisms. ASAP supports a wide range of state space reduction techniques and is intended to be easy to extend and to use, making it a suitable tool for students, researchers, and industrial users that would like to analyze protocols and/or experiment with different algorithms. This paper presents ASAP from these two perspectives.

  8. Analysis and experiments of a novel and compact 3-DOF precision positioning platform

    International Nuclear Information System (INIS)

    Huang, Hu; Zhao, Hongwei; Fan, Zunqiang; Zhang, Hui; Ma, Zhichao; Yang, Zhaojun

    2013-01-01

    A novel 3-DOF precision positioning platform with dimensions of 48 mm × 50 mm × 35 mm was designed by integrating piezo actuators and flexure hinges. The platform has a compact structure but can perform high-precision positioning along three axes. The dynamic model of the platform in a single direction was established. The stiffness of the flexure hinges and the modal characteristics of the flexure hinge mechanism were analyzed by the finite element method. The output displacements of the platform along the three axes were forecast via stiffness analysis. The output performance of the platform in the x and y axes with open-loop control, as well as in the z axis with closed-loop control, was tested and discussed. The preliminary application of the platform in the field of nanoindentation indicates that the designed platform works well during nanoindentation tests, and the closed-loop control ensures linear displacement output. With suitable control, the platform has the potential to realize different positioning functions under various working conditions.

  9. Using cluster analysis to organize and explore regional GPS velocities

    Science.gov (United States)

    Simpson, Robert W.; Thatcher, Wayne; Savage, James C.

    2012-01-01

    Cluster analysis offers a simple visual exploratory tool for the initial investigation of regional Global Positioning System (GPS) velocity observations, which are providing increasingly precise mappings of actively deforming continental lithosphere. The deformation fields from dense regional GPS networks can often be concisely described in terms of relatively coherent blocks bounded by active faults, although the choice of blocks, their number and size, can be subjective and is often guided by the distribution of known faults. To illustrate our method, we apply cluster analysis to GPS velocities from the San Francisco Bay Region, California, to search for spatially coherent patterns of deformation, including evidence of block-like behavior. The clustering process identifies four robust groupings of velocities that we identify with four crustal blocks. Although the analysis uses no prior geologic information other than the GPS velocities, the cluster/block boundaries track three major faults, both locked and creeping.
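
    A minimal sketch of the idea described above: cluster the GPS velocity vectors themselves (not station coordinates) and inspect the groupings for block-like behaviour. The velocities are synthetic, and k-means here merely stands in for whatever clustering variant the authors used.

```python
# Sketch (not the authors' code): grouping horizontal GPS velocity vectors
# with k-means to look for coherent, block-like deformation.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Four synthetic "blocks", each moving coherently with small scatter (mm/yr).
block_means = np.array([[10.0, 5.0], [2.0, 8.0], [-4.0, 1.0], [6.0, -3.0]])
velocities = np.vstack([m + rng.normal(0, 0.5, size=(30, 2)) for m in block_means])

# Cluster the velocity vectors; stations in the same block share a centroid.
km = KMeans(n_clusters=4, n_init=10, random_state=0).fit(velocities)
print(km.cluster_centers_)        # mean velocity of each inferred block
print(np.bincount(km.labels_))    # number of stations per block
```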

  10. Methodology of comparative statistical analysis of Russian industry based on cluster analysis

    Directory of Open Access Journals (Sweden)

    Sergey S. Shishulin

    2017-01-01

    Full Text Available The article is devoted to researching the possibilities of applying multidimensional statistical analysis in the study of industrial production on the basis of comparing its growth rates and structure with other developed and developing countries of the world. The purpose of this article is to determine the optimal set of statistical methods and the results of their application to industrial production data that would best support analysis of the results. The data include such indicators as output, gross value added, the number of employed and other indicators from the system of national accounts and operational business statistics. The objects of observation are the industries of the countries of the Customs Union, the United States, Japan and Europe in 2005-2015. The research tools range from simple methods of transformation and graphical and tabular visualization of data to methods of statistical analysis. In particular, based on a specialized software package (SPSS), the principal components method, discriminant analysis, hierarchical methods of cluster analysis, Ward's method and k-means were applied. The application of the principal components method to the initial data makes it possible to substantially and effectively reduce the initial space of industrial production data. Thus, for example, in analyzing the structure of industrial production, the reduction was from fifteen industries to three basic, well-interpreted factors: relatively extractive industries (with a low degree of processing), high-tech industries and consumer goods (medium-technology sectors). At the same time, a comparison of the results of applying cluster analysis to the initial data and to the data obtained with the principal components method established that clustering industrial production data on the basis of the new factors significantly improves the clustering results. As a result of analyzing the parameters of
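
    A hedged sketch of the pipeline the abstract describes (dimension reduction with principal components, then k-means and Ward clustering on the reduced factors). The data, the number of components and the number of clusters are illustrative rather than taken from the study.

```python
# Sketch: PCA to a few interpretable factors, then cluster the factor scores
# with both k-means and Ward (as in the abstract's SPSS workflow).
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans, AgglomerativeClustering

rng = np.random.default_rng(1)
X = rng.normal(size=(40, 15))          # 40 countries/regions x 15 industry indicators (toy)

X_std = StandardScaler().fit_transform(X)
pcs = PCA(n_components=3).fit_transform(X_std)   # reduce 15 industries to 3 factors

kmeans_labels = KMeans(n_clusters=4, n_init=10, random_state=1).fit_predict(pcs)
ward_labels = AgglomerativeClustering(n_clusters=4, linkage="ward").fit_predict(pcs)
print(kmeans_labels)
print(ward_labels)
```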

  11. STRATEGIES FOR DEVELOPING SUSTAINABLE AND COMPETITIVE CLUSTER FOR SHRIMP INDUSTRY

    Directory of Open Access Journals (Sweden)

    Anas M. Fauzi

    2012-09-01

    Full Text Available Kampung Vannamei, a shrimp cluster, has been developed since 2004 by PT CP Prima, tbk Surabaya through Shrimp Culture Health Management transformation technology transferred to traditional farmers in the Gresik, Lamongan, Tuban, and Madura areas. The research objectives are to identify and map stakeholders, to analyze stakeholder interactions, to formulate strategies from internal and external environmental factors, and to set priorities among strategies to develop a sustainable and competitive shrimp cluster in Kampung Vannamei. Primary data were collected through stakeholder discussion forums, questionnaires, and interviews with relevant actors. Observations of the business units were also performed to determine production and business conditions, particularly to capture information about threats and challenges. Secondary data were drawn from policy documents, national and local statistics, and relevant literature. Analyses were performed using the SRI International cluster pyramid, Porter's Diamond analysis, SWOT and TOWS matrix analysis, and the analytic hierarchy process, and the results were discussed qualitatively and descriptively. Seven strategies could be implemented to develop a sustainable and competitive shrimp cluster. However, it is recommended to implement the strategies by priority, the first of which is to improve linkages between businesses in the upstream and downstream industries into a multi-stakeholder platform in the shrimp industry. Keywords: Shrimp, Cluster, Competitiveness, Diamond Porter, SWOT Analysis, AHP

  12. ValWorkBench: an open source Java library for cluster validation, with applications to microarray data analysis.

    Science.gov (United States)

    Giancarlo, R; Scaturro, D; Utro, F

    2015-02-01

    The prediction of the number of clusters in a dataset, in particular microarrays, is a fundamental task in biological data analysis, usually performed via validation measures. Unfortunately, it has received very little attention and in fact there is a growing need for software tools/libraries dedicated to it. Here we present ValWorkBench, a software library consisting of eleven well known validation measures, together with novel heuristic approximations for some of them. The main objective of this paper is to provide the interested researcher with the full software documentation of an open source cluster validation platform having the main features of being easily extendible in a homogeneous way and of offering software components that can be readily re-used. Consequently, the focus of the presentation is on the architecture of the library, since it provides an essential map that can be used to access the full software documentation, which is available at the supplementary material website [1]. The mentioned main features of ValWorkBench are also discussed and exemplified, with emphasis on software abstraction design and re-usability. A comparison with existing cluster validation software libraries, mainly in terms of the mentioned features, is also offered. It suggests that ValWorkBench is a much needed contribution to the microarray software development/algorithm engineering community. For completeness, it is important to mention that previous accurate algorithmic experimental analysis of the relative merits of each of the implemented measures [19,23,25], carried out specifically on microarray data, gives useful insights on the effectiveness of ValWorkBench for cluster validation to researchers in the microarray community interested in its use for the mentioned task. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.

  13. Genome-scale analysis of positional clustering of mouse testis-specific genes

    Directory of Open Access Journals (Sweden)

    Lee Bernett TK

    2005-01-01

    Full Text Available Abstract Background Genes are not randomly distributed on a chromosome, as was once thought, even after removal of tandem repeats. The positional clustering of co-expressed genes is known in prokaryotes and has recently been reported in several eukaryotic organisms such as Caenorhabditis elegans, Drosophila melanogaster, and Homo sapiens. In order to further investigate the mode of tissue-specific gene clustering in higher eukaryotes, we have performed a genome-scale analysis of positional clustering of the mouse testis-specific genes. Results Our computational analysis shows that a large proportion of testis-specific genes are clustered in groups of 2 to 5 genes in the mouse genome. The number of clusters is much higher than expected by chance even after removal of tandem repeats. Conclusion Our result suggests that testis-specific genes tend to cluster on the mouse chromosomes. This provides another piece of evidence for the hypothesis that clusters of tissue-specific genes do exist.

  14. Pattern recognition in menstrual bleeding diaries by statistical cluster analysis

    Directory of Open Access Journals (Sweden)

    Wessel Jens

    2009-07-01

    Full Text Available Abstract Background The aim of this paper is to empirically identify a treatment-independent statistical method to describe clinically relevant bleeding patterns by using bleeding diaries of clinical studies on various sex hormone containing drugs. Methods We used four cluster analysis methods (single, average and complete linkage, and the method of Ward) for pattern recognition in menstrual bleeding diaries. The optimal number of clusters was determined using the semi-partial R2, the cubic cluster criterion, the pseudo-F and the pseudo-t2 statistic. Finally, the interpretability of the results from a gynecological point of view was assessed. Results The method of Ward yielded distinct clusters of the bleeding diaries. The other methods successively chained the observations into one cluster. The optimal number of distinctive bleeding patterns was six. We found two desirable and four undesirable bleeding patterns. Cyclic and non-cyclic bleeding patterns were well separated. Conclusion Using this cluster analysis with the method of Ward, medications and devices having an impact on bleeding can be easily compared and categorized.
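
    The comparison of linkage methods described above can be reproduced in outline with SciPy. This is an illustrative sketch on synthetic diary features, not the study's implementation; the six-cluster cut is taken from the abstract rather than re-derived from the cubic clustering criterion or pseudo statistics.

```python
# Compare the four linkage methods named in the abstract on toy diary data
# and cut each tree into six clusters.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(2)
X = rng.random((60, 28))   # 60 diaries x 28 daily bleeding indicators (toy data)

for method in ("single", "average", "complete", "ward"):
    Z = linkage(X, method=method)
    labels = fcluster(Z, t=6, criterion="maxclust")
    sizes = np.bincount(labels)[1:]
    # Ward tends to give balanced groups; single linkage tends to chain.
    print(method, sizes)
```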

  15. Comparative analysis of clustering methods for gene expression time course data

    Directory of Open Access Journals (Sweden)

    Ivan G. Costa

    2004-01-01

    Full Text Available This work performs a data driven comparative study of clustering methods used in the analysis of gene expression time courses (or time series. Five clustering methods found in the literature of gene expression analysis are compared: agglomerative hierarchical clustering, CLICK, dynamical clustering, k-means and self-organizing maps. In order to evaluate the methods, a k-fold cross-validation procedure adapted to unsupervised methods is applied. The accuracy of the results is assessed by the comparison of the partitions obtained in these experiments with gene annotation, such as protein function and series classification.
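
    One way to run the kind of annotation-based evaluation described above is to compare cluster labels against known classes with the adjusted Rand index. The sketch below uses synthetic expression profiles and only two of the five methods mentioned; it illustrates the call pattern, not the study's protocol.

```python
# Evaluate clusterings of expression time courses against external annotation.
import numpy as np
from sklearn.cluster import KMeans, AgglomerativeClustering
from sklearn.metrics import adjusted_rand_score

rng = np.random.default_rng(3)
X = rng.normal(size=(100, 12))                # 100 genes x 12 time points (toy)
annotation = rng.integers(0, 4, size=100)     # known functional classes (toy)

for name, model in [
    ("k-means", KMeans(n_clusters=4, n_init=10, random_state=3)),
    ("hierarchical", AgglomerativeClustering(n_clusters=4, linkage="average")),
]:
    labels = model.fit_predict(X)
    print(name, adjusted_rand_score(annotation, labels))
```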

  16. Scalable Algorithms for Clustering Large Geospatiotemporal Data Sets on Manycore Architectures

    Science.gov (United States)

    Mills, R. T.; Hoffman, F. M.; Kumar, J.; Sreepathi, S.; Sripathi, V.

    2016-12-01

    The increasing availability of high-resolution geospatiotemporal data sets from sources such as observatory networks, remote sensing platforms, and computational Earth system models has opened new possibilities for knowledge discovery using data sets fused from disparate sources. Traditional algorithms and computing platforms are impractical for the analysis and synthesis of data sets of this size; however, new algorithmic approaches that can effectively utilize the complex memory hierarchies and the extremely high levels of available parallelism in state-of-the-art high-performance computing platforms can enable such analysis. We describe a massively parallel implementation of accelerated k-means clustering and some optimizations to boost computational intensity and utilization of wide SIMD lanes on state-of-the-art multi- and manycore processors, including the second-generation Intel Xeon Phi ("Knights Landing") processor based on the Intel Many Integrated Core (MIC) architecture, which includes several new features, among them an on-package high-bandwidth memory. We also analyze the code in the context of a few practical applications to the analysis of climatic and remotely-sensed vegetation phenology data sets, and speculate on some of the new applications that such scalable analysis methods may enable.
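
    The compute-heavy step in an accelerated k-means pass is the point-to-centroid distance computation, which can be phrased as dense matrix arithmetic so that most of the work lands in one large matrix multiply, the kind of kernel that benefits from wide SIMD lanes and high-bandwidth memory. The NumPy sketch below illustrates the idea only; it is not the massively parallel implementation described in the abstract.

```python
# Vectorized k-means assignment step expressed as dense matrix arithmetic.
import numpy as np

def assign_points(X, centroids):
    # Expand ||x - c||^2 = ||x||^2 - 2 x.c + ||c||^2 so the dominant cost is a
    # single large matrix multiply (X @ centroids.T).
    x_sq = np.einsum("ij,ij->i", X, X)[:, None]
    c_sq = np.einsum("ij,ij->i", centroids, centroids)[None, :]
    d2 = x_sq - 2.0 * X @ centroids.T + c_sq
    return d2.argmin(axis=1)

rng = np.random.default_rng(4)
X = rng.random((100_000, 16)).astype(np.float32)       # toy geospatiotemporal records
centroids = rng.random((64, 16)).astype(np.float32)
labels = assign_points(X, centroids)
print(np.bincount(labels).min(), np.bincount(labels).max())
```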

  17. Universal platform for quantitative analysis of DNA transposition

    Directory of Open Access Journals (Sweden)

    Pajunen Maria I

    2010-11-01

    Full Text Available Abstract Background Completed genome projects have revealed an astonishing diversity of transposable genetic elements, implying the existence of novel element families yet to be discovered from diverse life forms. Concurrently, several better understood transposon systems have been exploited as efficient tools in molecular biology and genomics applications. Characterization of new mobile elements and improvement of the existing transposition technology platforms warrant easy-to-use assays for the quantitative analysis of DNA transposition. Results Here we developed a universal in vivo platform for the analysis of transposition frequency with class II mobile elements, i.e., DNA transposons. For each particular transposon system, cloning of the transposon ends and the cognate transposase gene, in three consecutive steps, generates a multifunctional plasmid, which drives inducible expression of the transposase gene and includes a mobilisable lacZ-containing reporter transposon. The assay scores transposition events as blue microcolonies, papillae, growing within otherwise whitish Escherichia coli colonies on indicator plates. We developed the assay using phage Mu transposition as a test model and validated the platform using various MuA transposase mutants. For further validation and to illustrate universality, we introduced IS903 transposition system components into the assay. The developed assay is adjustable to a desired level of initial transposition via the control of a plasmid-borne E. coli arabinose promoter. In practice, the transposition frequency is modulated by varying the concentration of arabinose or glucose in the growth medium. We show that variable levels of transpositional activity can be analysed, thus enabling straightforward screens for hyper- or hypoactive transposase mutants, regardless of the original wild-type activity level. Conclusions The established universal papillation assay platform should be widely applicable to a

  18. The Productivity Analysis of Chennai Automotive Industry Cluster

    Science.gov (United States)

    Bhaskaran, E.

    2014-07-01

    Chennai, also called the Detroit of India, is India's second fastest growing auto market and exports auto components and vehicles to the US, Germany, Japan and Brazil. For inclusive growth and sustainable development, 250 auto component industries in the Ambattur, Thirumalisai and Thirumudivakkam Industrial Estates located in Chennai have adopted the Cluster Development Approach, called the Automotive Component Cluster. The objective is to study the value chain, correlation and Data Envelopment Analysis by determining the technical efficiency, peer weights, and input and output slacks of 100 auto component industries in the three estates. The methodology adopted is Data Envelopment Analysis with the output-oriented Banker-Charnes-Cooper model, taking net worth, fixed assets and employment as inputs and gross output as the output. The non-zero values represent the weights for the efficient clusters. The higher slacks obtained reveal excess net worth, fixed assets and employment, and a shortage in gross output. To conclude, the variables are highly correlated and the inefficient industries should increase their gross output or decrease their fixed assets or employment. Moreover, for sustainable development, the cluster should strengthen infrastructure, technology, procurement, production and marketing interrelationships to decrease costs and to increase productivity and efficiency to compete in the indigenous and export markets.

  19. MMPI profiles of males accused of severe crimes: a cluster analysis

    NARCIS (Netherlands)

    Spaans, M.; Barendregt, M.; Muller, E.; Beurs, E. de; Nijman, H.L.I.; Rinne, T.

    2009-01-01

    In studies attempting to classify criminal offenders by cluster analysis of Minnesota Multiphasic Personality Inventory-2 (MMPI-2) data, the number of clusters found varied between 10 (the Megargee System) and two (one cluster indicating no psychopathology and one exhibiting serious

  20. A DESIGN OF SOCIAL MEDIA ANALYSIS SYSTEM BASED ON MOBILE PLATFORMS

    OpenAIRE

    Alaybeyoglu, Aysegul; Yavuz, Levent

    2017-01-01

    Along with the developing technology, social media technologies have become widespread and the number of internet users has increased rapidly. In addition, social media platforms have become very popular and the number of active social media users has increased considerably. As a result of the increased use of social media, there has been a trend towards mobile platforms. In this paper, a design of a social media analysis system is developed using mobile platforms based on Android. By this way, impo...

  1. Demand Analysis of Logistics Information Matching Platform: A Survey from Highway Freight Market in Zhejiang Province

    Science.gov (United States)

    Chen, Daqiang; Shen, Xiahong; Tong, Bing; Zhu, Xiaoxiao; Feng, Tao

    With increasing competition in the logistics industry and the push to lower logistics costs, the construction of a logistics information matching platform for highway transportation plays an important role, and the accuracy of the platform design is key to successful operation. Based on survey results from logistics service providers, customers and regulatory authorities on access to information, and on an in-depth analysis of information demand for a logistics information matching platform for highway transportation in Zhejiang province, a survey-based analysis of the framework for such a platform is provided.

  2. ANALYSIS OF DEVELOPING BATIK INDUSTRY CLUSTER IN BAKARAN VILLAGE CENTRAL JAVA PROVINCE

    Directory of Open Access Journals (Sweden)

    Hermanto Hermanto

    2017-06-01

    Full Text Available SMEs grow in clusters in certain geographical areas. Entrepreneurs grow and thrive through business clusters. Central Java Province has many business clusters contributing to the regional economy, one of which is the batik industry cluster. Pati Regency is one of the regencies/cities in Central Java with the lowest turnover. The batik industry cluster in Pati is developing quite well, which can be seen from the increasing number of batik businesses incorporated in the cluster. This research examines the strategy of developing the batik industry cluster in Pati Regency. The purpose of this research is to determine the proper strategy for developing the batik industry cluster in Pati. The research method is quantitative. The analysis tool of this research is the Strengths, Weaknesses, Opportunities, Threats (SWOT) analysis. The SWOT analysis in this research shows that the proper strategy for developing the batik industry cluster in Pati is to optimize the management of the batik business cluster in Bakaran Village; to have the local government provide information on business capital loan facilities; to employ workers from Bakaran Village while improving their skills through training; and to market Bakaran batik to broader markets while maintaining its quality. The advice from this research is addressed to the parties who play a role in developing the batik industry cluster in Bakaran Village, Pati Regency, such as the local government.

  3. Water quality assessment with hierarchical cluster analysis based on Mahalanobis distance.

    Science.gov (United States)

    Du, Xiangjun; Shao, Fengjing; Wu, Shunyao; Zhang, Hanlin; Xu, Si

    2017-07-01

    Water quality assessment is crucial for assessment of marine eutrophication, prediction of harmful algal blooms, and environment protection. Previous studies have developed many numeric modeling methods and data driven approaches for water quality assessment. Cluster analysis, an approach widely used for grouping data, has also been employed. However, there are complex correlations between water quality variables, which play important roles in water quality assessment but have always been overlooked. In this paper, we analyze correlations between water quality variables and propose an alternative method for water quality assessment with hierarchical cluster analysis based on Mahalanobis distance. Further, we cluster water quality data collected from coastal water of the Bohai Sea and North Yellow Sea of China, and apply the clustering results to evaluate its water quality. To evaluate the validity, we also cluster the water quality data with cluster analysis based on Euclidean distance, which is widely adopted by previous studies. The results show that our method is more suitable for water quality assessment with many correlated water quality variables. To our knowledge, it is the first attempt to apply Mahalanobis distance for coastal water quality assessment.
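
    A hedged sketch of the essential ingredient described above: hierarchical clustering under the Mahalanobis metric, where pairwise distances use the inverse covariance of the water quality variables so that correlated variables are not double counted. The stations and measurements here are synthetic.

```python
# Hierarchical clustering with Mahalanobis distance on toy water quality data.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(5)
X = rng.normal(size=(50, 6))   # 50 stations x 6 water quality variables (toy)
X[:, 1] += 0.8 * X[:, 0]       # induce correlation between two variables

VI = np.linalg.inv(np.cov(X, rowvar=False))       # inverse covariance matrix
d = pdist(X, metric="mahalanobis", VI=VI)         # pairwise Mahalanobis distances
Z = linkage(d, method="average")
labels = fcluster(Z, t=3, criterion="maxclust")
print(np.bincount(labels)[1:])
```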

  4. A SURVEY ON DOCUMENT CLUSTERING APPROACH FOR COMPUTER FORENSIC ANALYSIS

    OpenAIRE

    Monika Raghuvanshi*, Rahul Patel

    2016-01-01

    In a forensic analysis, large numbers of files are examined. Much of the information is in unstructured format, so it is quite a difficult task for computer forensics to perform such analysis. That is why performing forensic analysis of documents within a limited period of time requires a special approach such as document clustering. This paper reviews different document clustering algorithms and methodologies, for example k-means, k-medoid, single link, complete link and average link, in accordance...
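
    A small illustration of document clustering of the kind surveyed above: documents are turned into TF-IDF vectors and grouped with k-means. The corpus is a toy placeholder, not forensic case data, and k-means stands in for the various algorithms the survey reviews.

```python
# Toy document clustering: TF-IDF vectors grouped with k-means.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

docs = [
    "invoice payment transfer account",
    "payment wire transfer bank account",
    "meeting schedule agenda minutes",
    "project meeting agenda notes",
    "holiday photos beach trip",
    "trip photos camera beach",
]

X = TfidfVectorizer().fit_transform(docs)          # sparse TF-IDF matrix
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
for label, doc in zip(km.labels_, docs):
    print(label, doc)
```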

  5. Cluster Analysis in Rapeseed (Brassica Napus L.)

    International Nuclear Information System (INIS)

    Mahasi, J.M

    2002-01-01

    With a widening edible oil deficit, Kenya has become increasingly dependent on imported edible oils. Many oilseed crops (e.g. sunflower, soya beans, rapeseed/mustard, sesame, groundnuts etc.) can be grown in Kenya. But oilseed rape is preferred because it is very high yielding (1.5-4.0 tons/ha) with an oil content of 42-46%. Other uses include fitting in various cropping systems as relay/inter crops, rotational crops, trap crops and fodder. It is soft seeded, hence oil extraction is relatively easy. The meal is high in protein and very useful in livestock supplementation. Rapeseed can be straight combined using adjusted wheat combines. The priority is to expand domestic oilseed production, hence the need to introduce improved rapeseed germplasm from other countries. The success of any crop improvement programme depends on the extent of genetic diversity in the material. Hence, it is essential to understand the adaptation of introduced genotypes and the similarities, if any, among them. Evaluation trials were carried out on 17 rapeseed genotypes (nine of Canadian origin and eight of European origin) grown at 4 locations, namely Endebess, Njoro, Timau and Mau Narok, in three years (1992, 1993 and 1994). Results for 1993 were discarded due to severe drought. An analysis of variance was carried out only on seed yields and the treatments were found to be significantly different. Cluster analysis was then carried out on mean seed yields and, based on this analysis, only one major group exists within the material. In 1992, varieties 2, 3, 8 and 9 did not fall in the same cluster as the rest. Variety 8 was the only one not classified with the rest of the Canadian varieties. Three European varieties (2, 3 and 9) were however not classified with the others. In 1994, varieties 10 and 6 did not fall in the major cluster. Of these two, variety 10 is of Canadian origin. Varieties were more similar in 1994 than in 1992 due to favorable weather. It is evident that genotypes from different geographical

  6. Emulation Platform for Cyber Analysis of Wireless Communication Network Protocols

    Energy Technology Data Exchange (ETDEWEB)

    Van Leeuwen, Brian P. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Eldridge, John M. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

    2017-11-01

    Wireless networking and mobile communications are increasing around the world and in all sectors of our lives. With increasing use, the density and complexity of the systems increase, with more base stations and advanced protocols to enable higher data throughputs. The security of data transported over wireless networks must also evolve with the advances in technologies enabling more capable wireless networks. However, means for analysis of the effectiveness of security approaches and implementations used on wireless networks are lacking. More specifically, a capability to analyze the lower-layer protocols (i.e., Link and Physical layers) is a major challenge. An analysis approach that incorporates protocol implementations without the need for RF emissions is necessary. In this research paper, several emulation tools and custom extensions that enable an analysis platform to perform cyber security analysis of lower-layer wireless networks are presented. A use case of a published exploit in the 802.11 (i.e., WiFi) protocol family is provided to demonstrate the effectiveness of the described emulation platform.

  7. The Quantitative Analysis of Chennai Automotive Industry Cluster

    Science.gov (United States)

    Bhaskaran, Ethirajan

    2016-07-01

    Chennai is also called the Detroit of India due to the presence of an automotive industry producing over 40% of India's vehicles and components. During 2001-2002, the Automotive Component Industries (ACI) in the Ambattur, Thirumalizai and Thirumudivakkam Industrial Estates, Chennai faced problems with infrastructure, technology, procurement, production and marketing. The objective is to study the quantitative performance of the Chennai automotive industry cluster before (2001-2002) and after the CDA (2008-2009). The methodology adopted is collection of primary data from 100 ACI using a quantitative questionnaire and analysis using Correlation Analysis (CA), Regression Analysis (RA), the Friedman Test (FMT), and the Kruskal-Wallis Test (KWT). The CA computed for the different sets of variables reveals that there is a high degree of relationship between the variables studied. The RA models constructed establish the strong relationship between the dependent variable and a host of independent variables. The models proposed here reveal the approximate relationship in a closer form. The KWT proves that there is no significant difference between the three location clusters with respect to net profit, production cost, marketing costs, procurement costs and gross output. This supports that each location has contributed to the development of the automobile component cluster uniformly. The FMT proves that there is no significant difference between industrial units in respect of costs such as production, infrastructure, technology, marketing and net profit. To conclude, the automotive industries have fully utilized the physical infrastructure and centralised facilities by adopting the CDA and are now exporting their products to North America, South America, Europe, Australia, Africa and Asia. The value chain analysis models have been implemented in all the cluster units. This Cluster Development Approach (CDA) model can be implemented in industries of underdeveloped and developing countries for cost reduction and productivity
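
    The Kruskal-Wallis and Friedman tests named in the abstract are available in SciPy; the sketch below applies them to invented performance figures simply to show the call pattern, not to reproduce the study's results.

```python
# Nonparametric comparisons on toy cluster performance data.
import numpy as np
from scipy.stats import kruskal, friedmanchisquare

rng = np.random.default_rng(6)

# Kruskal-Wallis: does a metric (e.g. net profit) differ between the three
# location clusters? Groups are independent samples of industrial units.
ambattur, thirumalisai, thirumudivakkam = (rng.normal(10, 2, 30) for _ in range(3))
print(kruskal(ambattur, thirumalisai, thirumudivakkam))

# Friedman: do related cost measures differ across the same industrial units?
production, infrastructure, technology, marketing = (rng.normal(5, 1, 30) for _ in range(4))
print(friedmanchisquare(production, infrastructure, technology, marketing))
```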

  8. Performance Analysis of Memory Transfers and GEMM Subroutines on NVIDIA Tesla GPU Cluster

    Energy Technology Data Exchange (ETDEWEB)

    Allada, Veerendra, Benjegerdes, Troy; Bode, Brett

    2009-08-31

    Commodity clusters augmented with application accelerators are evolving as competitive high performance computing systems. The Graphical Processing Unit (GPU), with a very high arithmetic density and performance per price ratio, is a good platform for scientific application acceleration. In addition to the interconnect bottlenecks among the cluster compute nodes, the cost of memory copies between the host and the GPU device has to be carefully amortized to improve the overall efficiency of the application. Scientific applications also rely on efficient implementation of the Basic Linear Algebra Subroutines (BLAS), among which the General Matrix Multiply (GEMM) is considered the workhorse subroutine. In this paper, they study the performance of the memory copies and GEMM subroutines that are critical to porting computational chemistry algorithms to GPU clusters. To that end, a benchmark based on the NetPIPE framework is developed to evaluate the latency and bandwidth of the memory copies between the host and the GPU device. The performance of the single and double precision GEMM subroutines from the NVIDIA CUBLAS 2.0 library is studied. The results have been compared with that of the BLAS routines from the Intel Math Kernel Library (MKL) to understand the computational trade-offs. The test bed is an Intel Xeon cluster equipped with NVIDIA Tesla GPUs.

  9. Performance Analysis of Memory Transfers and GEMM Subroutines on NVIDIA Tesla GPU Cluster

    International Nuclear Information System (INIS)

    Allada, Veerendra; Benjegerdes, Troy; Bode, Brett

    2009-01-01

    Commodity clusters augmented with application accelerators are evolving as competitive high performance computing systems. The Graphical Processing Unit (GPU), with a very high arithmetic density and performance per price ratio, is a good platform for scientific application acceleration. In addition to the interconnect bottlenecks among the cluster compute nodes, the cost of memory copies between the host and the GPU device has to be carefully amortized to improve the overall efficiency of the application. Scientific applications also rely on efficient implementation of the Basic Linear Algebra Subroutines (BLAS), among which the General Matrix Multiply (GEMM) is considered the workhorse subroutine. In this paper, they study the performance of the memory copies and GEMM subroutines that are critical to porting computational chemistry algorithms to GPU clusters. To that end, a benchmark based on the NetPIPE framework is developed to evaluate the latency and bandwidth of the memory copies between the host and the GPU device. The performance of the single and double precision GEMM subroutines from the NVIDIA CUBLAS 2.0 library is studied. The results have been compared with that of the BLAS routines from the Intel Math Kernel Library (MKL) to understand the computational trade-offs. The test bed is an Intel Xeon cluster equipped with NVIDIA Tesla GPUs.

  10. Clusters of Insomnia Disorder: An Exploratory Cluster Analysis of Objective Sleep Parameters Reveals Differences in Neurocognitive Functioning, Quantitative EEG, and Heart Rate Variability.

    Science.gov (United States)

    Miller, Christopher B; Bartlett, Delwyn J; Mullins, Anna E; Dodds, Kirsty L; Gordon, Christopher J; Kyle, Simon D; Kim, Jong Won; D'Rozario, Angela L; Lee, Rico S C; Comas, Maria; Marshall, Nathaniel S; Yee, Brendon J; Espie, Colin A; Grunstein, Ronald R

    2016-11-01

    To empirically derive and evaluate potential clusters of Insomnia Disorder through cluster analysis from polysomnography (PSG). We hypothesized that clusters would differ on neurocognitive performance, sleep-onset measures of quantitative ( q )-EEG and heart rate variability (HRV). Research volunteers with Insomnia Disorder (DSM-5) completed a neurocognitive assessment and overnight PSG measures of total sleep time (TST), wake time after sleep onset (WASO), and sleep onset latency (SOL) were used to determine clusters. From 96 volunteers with Insomnia Disorder, cluster analysis derived at least two clusters from objective sleep parameters: Insomnia with normal objective sleep duration (I-NSD: n = 53) and Insomnia with short sleep duration (I-SSD: n = 43). At sleep onset, differences in HRV between I-NSD and I-SSD clusters suggest attenuated parasympathetic activity in I-SSD (P insomnia clusters derived from cluster analysis differ in sleep onset HRV. Preliminary data suggest evidence for three clusters in insomnia with differences for sustained attention and sleep-onset q -EEG. Insomnia 100 sleep study: Australia New Zealand Clinical Trials Registry (ANZCTR) identification number 12612000049875. URL: https://www.anzctr.org.au/Trial/Registration/TrialReview.aspx?id=347742. © 2016 Associated Professional Sleep Societies, LLC.

  11. Clusters of Insomnia Disorder: An Exploratory Cluster Analysis of Objective Sleep Parameters Reveals Differences in Neurocognitive Functioning, Quantitative EEG, and Heart Rate Variability

    Science.gov (United States)

    Miller, Christopher B.; Bartlett, Delwyn J.; Mullins, Anna E.; Dodds, Kirsty L.; Gordon, Christopher J.; Kyle, Simon D.; Kim, Jong Won; D'Rozario, Angela L.; Lee, Rico S.C.; Comas, Maria; Marshall, Nathaniel S.; Yee, Brendon J.; Espie, Colin A.; Grunstein, Ronald R.

    2016-01-01

    Study Objectives: To empirically derive and evaluate potential clusters of Insomnia Disorder through cluster analysis from polysomnography (PSG). We hypothesized that clusters would differ on neurocognitive performance, sleep-onset measures of quantitative (q)-EEG and heart rate variability (HRV). Methods: Research volunteers with Insomnia Disorder (DSM-5) completed a neurocognitive assessment and overnight PSG measures of total sleep time (TST), wake time after sleep onset (WASO), and sleep onset latency (SOL) were used to determine clusters. Results: From 96 volunteers with Insomnia Disorder, cluster analysis derived at least two clusters from objective sleep parameters: Insomnia with normal objective sleep duration (I-NSD: n = 53) and Insomnia with short sleep duration (I-SSD: n = 43). At sleep onset, differences in HRV between I-NSD and I-SSD clusters suggest attenuated parasympathetic activity in I-SSD (P insomnia clusters derived from cluster analysis differ in sleep onset HRV. Preliminary data suggest evidence for three clusters in insomnia with differences for sustained attention and sleep-onset q-EEG. Clinical Trial Registration: Insomnia 100 sleep study: Australia New Zealand Clinical Trials Registry (ANZCTR) identification number 12612000049875. URL: https://www.anzctr.org.au/Trial/Registration/TrialReview.aspx?id=347742. Citation: Miller CB, Bartlett DJ, Mullins AE, Dodds KL, Gordon CJ, Kyle SD, Kim JW, D'Rozario AL, Lee RS, Comas M, Marshall NS, Yee BJ, Espie CA, Grunstein RR. Clusters of Insomnia Disorder: an exploratory cluster analysis of objective sleep parameters reveals differences in neurocognitive functioning, quantitative EEG, and heart rate variability. SLEEP 2016;39(11):1993–2004. PMID:27568796

  12. Assessment of genetic divergence in tomato through agglomerative hierarchical clustering and principal component analysis

    International Nuclear Information System (INIS)

    Iqbal, Q.; Saleem, M.Y.; Hameed, A.; Asghar, M.

    2014-01-01

    For the improvement of qualitative and quantitative traits, existence of variability has prime importance in plant breeding. Data on different morphological and reproductive traits of 47 tomato genotypes were analyzed for correlation,agglomerative hierarchical clustering and principal component analysis (PCA) to select genotypes and traits for future breeding program. Correlation analysis revealed significant positive association between yield and yield components like fruit diameter, single fruit weight and number of fruits plant-1. Principal component (PC) analysis depicted first three PCs with Eigen-value higher than 1 contributing 81.72% of total variability for different traits. The PC-I showed positive factor loadings for all the traits except number of fruits plant-1. The contribution of single fruit weight and fruit diameter was highest in PC-1. Cluster analysis grouped all genotypes into five divergent clusters. The genotypes in cluster-II and cluster-V exhibited uniform maturity and higher yield. The D2 statistics confirmed highest distance between cluster- III and cluster-V while maximum similarity was observed in cluster-II and cluster-III. It is therefore suggested that crosses between genotypes of cluster-II and cluster-V with those of cluster-I and cluster-III may exhibit heterosis in F1 for hybrid breeding and for selection of superior genotypes in succeeding generations for cross breeding programme. (author)

  13. A Distributed Flocking Approach for Information Stream Clustering Analysis

    Energy Technology Data Exchange (ETDEWEB)

    Cui, Xiaohui [ORNL; Potok, Thomas E [ORNL

    2006-01-01

    Intelligence analysts are currently overwhelmed with the amount of information streams generated every day. There is a lack of a comprehensive tool that can analyze the information streams in real time. Document clustering analysis plays an important role in improving the accuracy of information retrieval. However, most clustering technologies can only be applied to analyzing static document collections because they normally require a large amount of computation resources and a long time to get an accurate result. It is very difficult to cluster dynamically changing text information streams on an individual computer. Our early research has resulted in a dynamic reactive flock clustering algorithm which can continually refine the clustering result and quickly react to changes in document content. This characteristic makes the algorithm suitable for cluster analysis of dynamically changing document information, such as text information streams. Because of the decentralized character of this algorithm, a distributed approach is a very natural way to increase the clustering speed of the algorithm. In this paper, we present a distributed multi-agent flocking approach for text information stream clustering and discuss the decentralized architectures and communication schemes for load balance and status information synchronization in this approach.

  14. YersiniaBase: a genomic resource and analysis platform for comparative analysis of Yersinia.

    Science.gov (United States)

    Tan, Shi Yang; Dutta, Avirup; Jakubovics, Nicholas S; Ang, Mia Yang; Siow, Cheuk Chuen; Mutha, Naresh Vr; Heydari, Hamed; Wee, Wei Yee; Wong, Guat Jah; Choo, Siew Woh

    2015-01-16

    Yersinia is a genus of Gram-negative bacteria that includes serious pathogens such as Yersinia pestis, which causes plague, Yersinia pseudotuberculosis, and Yersinia enterocolitica. The remaining species are generally considered non-pathogenic to humans, although there is evidence that at least some of these species can cause occasional infections using distinct mechanisms from the more pathogenic species. With the advances in sequencing technologies, many genomes of Yersinia have been sequenced. However, there is currently no specialized platform to hold the rapidly-growing Yersinia genomic data and to provide analysis tools, particularly for comparative analyses, which are required to provide improved insights into their biology, evolution and pathogenicity. To facilitate ongoing and future research on Yersinia, especially the species generally considered non-pathogenic, a well-defined repository and analysis platform is needed to hold the Yersinia genomic data and analysis tools for the Yersinia research community. Hence, we have developed YersiniaBase, a robust and user-friendly Yersinia resource and analysis platform for the analysis of Yersinia genomic data. YersiniaBase has a total of twelve species and 232 genome sequences, of which the majority are Yersinia pestis. In order to smooth the process of searching genomic data in a large database, we implemented an Asynchronous JavaScript and XML (AJAX)-based real-time searching system in YersiniaBase. Besides incorporating existing tools, which include the JavaScript-based genome browser (JBrowse) and the Basic Local Alignment Search Tool (BLAST), YersiniaBase also has in-house developed tools: (1) Pairwise Genome Comparison tool (PGC) for comparing two user-selected genomes; (2) Pathogenomics Profiling Tool (PathoProT) for comparative pathogenomics analysis of Yersinia genomes; (3) YersiniaTree for constructing phylogenetic trees of Yersinia. We ran analyses based on the tools and genomic data in YersiniaBase and the

  15. Cluster analysis of obesity and asthma phenotypes.

    Directory of Open Access Journals (Sweden)

    E Rand Sutherland

    Full Text Available Asthma is a heterogeneous disease with variability among patients in characteristics such as lung function, symptoms and control, body weight, markers of inflammation, and responsiveness to glucocorticoids (GC). Cluster analysis of well-characterized cohorts can advance understanding of disease subgroups in asthma and point to unsuspected disease mechanisms. We utilized an hypothesis-free cluster analytical approach to define the contribution of obesity and related variables to asthma phenotype. In a cohort of clinical trial participants (n = 250), minimum-variance hierarchical clustering was used to identify clinical and inflammatory biomarkers important in determining disease cluster membership in mild and moderate persistent asthmatics. In a subset of participants, GC sensitivity was assessed via expression of GC receptor alpha (GCRα) and induction of MAP kinase phosphatase-1 (MKP-1) expression by dexamethasone. Four asthma clusters were identified, with body mass index (BMI, kg/m2) and severity of asthma symptoms (AEQ score) the most significant determinants of cluster membership (F = 57.1, p<0.0001 and F = 44.8, p<0.0001, respectively). Two clusters were composed of predominantly obese individuals; these two obese asthma clusters differed from one another with regard to age of asthma onset, measures of asthma symptoms (AEQ) and control (ACQ), exhaled nitric oxide concentration (FENO) and airway hyperresponsiveness (methacholine PC20), but were similar with regard to measures of lung function (FEV1 (%) and FEV1/FVC), airway eosinophilia, IgE, leptin, adiponectin and C-reactive protein (hsCRP). Members of obese clusters demonstrated evidence of reduced expression of GCRα, a finding which was correlated with a reduced induction of MKP-1 expression by dexamethasone. Obesity is an important determinant of asthma phenotype in adults. There is heterogeneity in expression of clinical and inflammatory biomarkers of asthma across obese individuals.

  16. Cluster: A New Application for Spatial Analysis of Pixelated Data for Epiphytotics.

    Science.gov (United States)

    Nelson, Scot C; Corcoja, Iulian; Pethybridge, Sarah J

    2017-12-01

    Spatial analysis of epiphytotics is essential to develop and test hypotheses about pathogen ecology, disease dynamics, and to optimize plant disease management strategies. Data collection for spatial analysis requires substantial investment in time to depict patterns in various frames and hierarchies. We developed a new approach for spatial analysis of pixelated data in digital imagery and incorporated the method in a stand-alone desktop application called Cluster. The user isolates target entities (clusters) by designating up to 24 pixel colors as nontargets and moves a threshold slider to visualize the targets. The app calculates the percent area occupied by targeted pixels, identifies the centroids of targeted clusters, and computes the relative compass angle of orientation for each cluster. Users can deselect anomalous clusters manually and/or automatically by specifying a size threshold value to exclude smaller targets from the analysis. Up to 1,000 stochastic simulations randomly place the centroids of each cluster in ranked order of size (largest to smallest) within each matrix while preserving their calculated angles of orientation for the long axes. A two-tailed probability t test compares the mean inter-cluster distances for the observed versus the values derived from randomly simulated maps. This is the basis for statistical testing of the null hypothesis that the clusters are randomly distributed within the frame of interest. These frames can assume any shape, from natural (e.g., leaf) to arbitrary (e.g., a rectangular or polygonal field). Cluster summarizes normalized attributes of clusters, including pixel number, axis length, axis width, compass orientation, and the length/width ratio, available to the user as a downloadable spreadsheet. Each simulated map may be saved as an image and inspected. Provided examples demonstrate the utility of Cluster to analyze patterns at various spatial scales in plant pathology and ecology and highlight the
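
    A rough open-source approximation of the workflow described for Cluster: label connected groups of target pixels, compute their centroids, and compare the observed mean inter-centroid distance with distances from randomly placed centroids. All names and the test setup below are assumptions for illustration, not the application's code.

```python
# Label pixel clusters, compute centroids, and run a randomization-style test
# on the mean inter-centroid distance (toy analogue of the Cluster workflow).
import numpy as np
from scipy import ndimage
from scipy.spatial.distance import pdist
from scipy.stats import ttest_1samp

rng = np.random.default_rng(7)
mask = rng.random((200, 200)) > 0.995          # toy binary image of target pixels
mask = ndimage.binary_dilation(mask, iterations=2)

labels, n = ndimage.label(mask)                 # connected clusters of target pixels
centroids = np.array(ndimage.center_of_mass(mask, labels, range(1, n + 1)))
observed = pdist(centroids).mean()              # observed mean inter-centroid distance

# Randomly re-place the same number of centroids 1000 times within the frame.
sims = [pdist(rng.uniform(0, 200, size=centroids.shape)).mean() for _ in range(1000)]
res = ttest_1samp(sims, observed)               # two-tailed test against the observed value
print(n, observed, res.pvalue)
```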

  17. Cluster analysis as a prediction tool for pregnancy outcomes.

    Science.gov (United States)

    Banjari, Ines; Kenjerić, Daniela; Šolić, Krešimir; Mandić, Milena L

    2015-03-01

    Considering specific physiological changes during gestation and thinking of pregnancy as a "critical window", classification of pregnant women in early pregnancy can be considered crucial. The paper demonstrates the use of a method based on an approach from intelligent data mining, cluster analysis. Cluster analysis is a statistical method which makes it possible to group individuals based on sets of identifying variables. The method was chosen in order to determine the possibility of classifying pregnant women in early pregnancy and to analyze unknown correlations between different variables so that certain outcomes could be predicted. 222 pregnant women from two general obstetric offices were recruited. The main focus was on characteristics of these pregnant women: their age, pre-pregnancy body mass index (BMI) and haemoglobin value. Cluster analysis achieved a 94.1% classification accuracy rate with three branches or groups of pregnant women showing statistically significant correlations with pregnancy outcomes. The results show that pregnant women of both older age and higher pre-pregnancy BMI have a significantly higher incidence of delivering a baby of higher birth weight but gain significantly less weight during pregnancy. Their babies are also longer, and these women have a significantly higher probability of complications during pregnancy (gestosis) and a higher probability of induced or caesarean delivery. We can conclude that the cluster analysis method can appropriately classify pregnant women in early pregnancy to predict certain outcomes.

  18. Phenotypes of asthma in low-income children and adolescents: cluster analysis

    Directory of Open Access Journals (Sweden)

    Anna Lucia Barros Cabral

    Full Text Available ABSTRACT Objective: Studies characterizing asthma phenotypes have predominantly included adults or have involved children and adolescents in developed countries. Therefore, their applicability in other populations, such as those of developing countries, remains indeterminate. Our objective was to determine how low-income children and adolescents with asthma in Brazil are distributed across a cluster analysis. Methods: We included 306 children and adolescents (6-18 years of age) with a clinical diagnosis of asthma and under medical treatment for at least one year of follow-up. At enrollment, all the patients were clinically stable. For the cluster analysis, we selected 20 variables commonly measured in clinical practice and considered important in defining asthma phenotypes. Variables with high multicollinearity were excluded. A cluster analysis was applied using a two-step agglomerative test and log-likelihood distance measure. Results: Three clusters were defined for our population. Cluster 1 (n = 94) included subjects with normal pulmonary function, mild eosinophil inflammation, few exacerbations, later age at asthma onset, and mild atopy. Cluster 2 (n = 87) included those with normal pulmonary function, a moderate number of exacerbations, early age at asthma onset, more severe eosinophil inflammation, and moderate atopy. Cluster 3 (n = 108) included those with poor pulmonary function, frequent exacerbations, severe eosinophil inflammation, and severe atopy. Conclusions: Asthma was characterized by the presence of atopy, number of exacerbations, and lung function in low-income children and adolescents in Brazil. The many similarities with previous cluster analyses of phenotypes indicate that this approach shows good generalizability.

  19. AZTLAN platform: Mexican platform for analysis and design of nuclear reactors; AZTLAN platform: plataforma mexicana para el analisis y diseno de reactores nucleares

    Energy Technology Data Exchange (ETDEWEB)

    Gomez T, A. M.; Puente E, F. [ININ, Carretera Mexico-Toluca s/n, 52750 Ocoyoacac, Estado de Mexico (Mexico); Del Valle G, E. [IPN, Escuela Superior de Fisica y Matematicas, Av. IPN s/n, Edif. 9, Col. San Pedro Zacatenco, 07738 Mexico D. F. (Mexico); Francois L, J. L.; Martin del Campo M, C. [UNAM, Facultad de Ingenieria, Departamento de Sistemas Energeticos, Paseo Cuauhnahuac 8532, 62550 Jiutepec, Morelos (Mexico); Espinosa P, G., E-mail: armando.gomez@inin.gob.mx [Universidad Autonoma Metropolitana, Unidad Iztapalapa, Av. San Rafael Atlixco No. 186, Col. Vicentina, 09340 Mexico D. F. (Mexico)

    2014-10-15

    The AZTLAN platform project is a national initiative led by the Instituto Nacional de Investigaciones Nucleares (ININ) which brings together the main public institutions of higher education in Mexico, namely the Instituto Politecnico Nacional, the Universidad Nacional Autonoma de Mexico and the Universidad Autonoma Metropolitana, in an effort to take a significant step toward autonomy in calculation and analysis, seeking to place Mexico, in the medium term, at a competitive international level in software for the analysis of nuclear reactors. This project aims to modernize, improve and integrate the neutron, thermal-hydraulic and thermo-mechanical codes developed in Mexican institutions within an integrated platform, developed and maintained by Mexican experts for the benefit of those same institutions. The project is financed by the SENER-CONACYT mixed fund for Energy Sustainability, and aims to substantially strengthen research and educational institutions, contributing to the formation of highly qualified human resources in the area of analysis and design of nuclear reactors. As an innovative part, the project includes the creation of a user group made up of members of the project institutions as well as the Comision Nacional de Seguridad Nuclear y Salvaguardias, the Central Nucleoelectrica de Laguna Verde (CNLV), the Secretaria de Energia (Mexico) and the Karlsruhe Institute of Technology (Germany), among others. This user group will be responsible for using the software and providing feedback to the development team so that progress meets the needs of the regulator and industry, in this case the CNLV. Finally, in order to bridge the gap with similar developments globally, the project will make use of the latest supercomputing technology to speed up calculation times. This work intends to present the project to the national nuclear community, so a description of the proposed methodology is given, as well as the goals and objectives to be pursued for the development of the

  20. Reproducibility of Cognitive Profiles in Psychosis Using Cluster Analysis.

    Science.gov (United States)

    Lewandowski, Kathryn E; Baker, Justin T; McCarthy, Julie M; Norris, Lesley A; Öngür, Dost

    2018-04-01

    Cognitive dysfunction is a core symptom dimension that cuts across the psychoses. Recent findings support classification of patients along the cognitive dimension using cluster analysis; however, data-derived groupings may be highly determined by sampling characteristics and the measures used to derive the clusters, and so their interpretability must be established. We examined cognitive clusters in a cross-diagnostic sample of patients with psychosis and associations with clinical and functional outcomes. We then compared our findings to a previous report of cognitive clusters in a separate sample using a different cognitive battery. Participants with affective or non-affective psychosis (n=120) and healthy controls (n=31) were administered the MATRICS Consensus Cognitive Battery, and clinical and community functioning assessments. Cluster analyses were performed on cognitive variables, and clusters were compared on demographic, cognitive, and clinical measures. Results were compared to findings from our previous report. A four-cluster solution provided a good fit to the data; profiles included a neuropsychologically normal cluster, a globally impaired cluster, and two clusters of mixed profiles. Cognitive burden was associated with symptom severity and poorer community functioning. The patterns of cognitive performance by cluster were highly consistent with our previous findings. We found evidence of four cognitive subgroups of patients with psychosis, with cognitive profiles that map closely to those produced in our previous work. Clusters were associated with clinical and community variables and a measure of premorbid functioning, suggesting that they reflect meaningful groupings: replicable, and related to clinical presentation and functional outcomes. (JINS, 2018, 24, 382-390).

  1. Identifying novel phenotypes of acute heart failure using cluster analysis of clinical variables.

    Science.gov (United States)

    Horiuchi, Yu; Tanimoto, Shuzou; Latif, A H M Mahbub; Urayama, Kevin Y; Aoki, Jiro; Yahagi, Kazuyuki; Okuno, Taishi; Sato, Yu; Tanaka, Tetsu; Koseki, Keita; Komiyama, Kota; Nakajima, Hiroyoshi; Hara, Kazuhiro; Tanabe, Kengo

    2018-07-01

    Acute heart failure (AHF) is a heterogeneous disease caused by various cardiovascular (CV) pathophysiology and multiple non-CV comorbidities. We aimed to identify clinically important subgroups to improve our understanding of the pathophysiology of AHF and inform clinical decision-making. We evaluated detailed clinical data of 345 consecutive AHF patients using non-hierarchical cluster analysis of 77 variables, including age, sex, HF etiology, comorbidities, physical findings, laboratory data, electrocardiogram, echocardiogram and treatment during hospitalization. Cox proportional hazards regression analysis was performed to estimate the association between the clusters and clinical outcomes. Three clusters were identified. Cluster 1 (n=108) represented "vascular failure". This cluster had the highest average systolic blood pressure at admission and lung congestion with type 2 respiratory failure. Cluster 2 (n=89) represented "cardiac and renal failure". They had the lowest ejection fraction (EF) and worst renal function. Cluster 3 (n=148) comprised mostly older patients and had the highest prevalence of atrial fibrillation and preserved EF. Death or HF hospitalization within 12-month occurred in 23% of Cluster 1, 36% of Cluster 2 and 36% of Cluster 3 (p=0.034). Compared with Cluster 1, risk of death or HF hospitalization was 1.74 (95% CI, 1.03-2.95, p=0.037) for Cluster 2 and 1.82 (95% CI, 1.13-2.93, p=0.014) for Cluster 3. Cluster analysis may be effective in producing clinically relevant categories of AHF, and may suggest underlying pathophysiology and potential utility in predicting clinical outcomes. Copyright © 2018 Elsevier B.V. All rights reserved.

  2. Identification and validation of asthma phenotypes in Chinese population using cluster analysis.

    Science.gov (United States)

    Wang, Lei; Liang, Rui; Zhou, Ting; Zheng, Jing; Liang, Bing Miao; Zhang, Hong Ping; Luo, Feng Ming; Gibson, Peter G; Wang, Gang

    2017-10-01

    Asthma is a heterogeneous airway disease, so it is crucial to clearly identify clinical phenotypes to achieve better asthma management. To identify and prospectively validate asthma clusters in a Chinese population. Two hundred eighty-four patients were consecutively recruited and 18 sociodemographic and clinical variables were collected. Hierarchical cluster analysis was performed by the Ward method followed by k-means cluster analysis. Then, a prospective 12-month cohort study was used to validate the identified clusters. Five clusters were successfully identified. Clusters 1 (n = 71) and 3 (n = 81) were mild asthma phenotypes with slight airway obstruction and low exacerbation risk, but with a sex differential. Cluster 2 (n = 65) described an "allergic" phenotype, cluster 4 (n = 33) featured a "fixed airflow limitation" phenotype with smoking, and cluster 5 (n = 34) was a "low socioeconomic status" phenotype. Patients in clusters 2, 4, and 5 had distinctly lower socioeconomic status and more psychological symptoms. Cluster 2 had a significantly increased risk of exacerbations (risk ratio [RR] 1.13, 95% confidence interval [CI] 1.03-1.25), unplanned visits for asthma (RR 1.98, 95% CI 1.07-3.66), and emergency visits for asthma (RR 7.17, 95% CI 1.26-40.80). Cluster 4 had an increased risk of unplanned visits (RR 2.22, 95% CI 1.02-4.81), and cluster 5 had increased emergency visits (RR 12.72, 95% CI 1.95-69.78). Kaplan-Meier analysis confirmed that cluster grouping was predictive of time to the first asthma exacerbation, unplanned visit, emergency visit, and hospital admission (P clusters as "allergic asthma," "fixed airflow limitation," and "low socioeconomic status" phenotypes that are at high risk of severe asthma exacerbations and that have management implications for clinical practice in developing countries. Copyright © 2017 American College of Allergy, Asthma & Immunology. Published by Elsevier Inc. All rights reserved.

  3. Comparison of population-averaged and cluster-specific models for the analysis of cluster randomized trials with missing binary outcomes: a simulation study

    Directory of Open Access Journals (Sweden)

    Ma Jinhui

    2013-01-01

    Full Text Available Abstract Background The objective of this simulation study is to compare the accuracy and efficiency of population-averaged (i.e., generalized estimating equations, GEE) and cluster-specific (i.e., random-effects logistic regression, RELR) models for analyzing data from cluster randomized trials (CRTs) with missing binary responses. Methods In this simulation study, clustered responses were generated from a beta-binomial distribution. The number of clusters per trial arm, the number of subjects per cluster, the intra-cluster correlation coefficient, and the percentage of missing data were allowed to vary. Under the assumption of covariate-dependent missingness, missing outcomes were handled by complete case analysis, standard multiple imputation (MI) and within-cluster MI strategies. Data were analyzed using GEE and RELR. Performance of the methods was assessed using standardized bias, empirical standard error, root mean squared error (RMSE), and coverage probability. Results GEE performs well on all four measures, provided that the downward bias of the standard error when the number of clusters per arm is small is adjusted appropriately, under the following scenarios: complete case analysis for CRTs with a small amount of missing data, and standard or within-cluster MI for CRTs with larger amounts of missing data, depending on the variance inflation factor (VIF) and the cluster size. RELR performs well only when a small amount of data is missing and complete case analysis is applied. Conclusion GEE performs well as long as appropriate missing data strategies are adopted based on the design of CRTs and the percentage of missing data. In contrast, RELR does not perform well when either the standard or the within-cluster MI strategy is applied prior to the analysis.
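
    The population-averaged arm of this comparison can be sketched with statsmodels; the toy data below (number of clusters, cluster size, effect size) are invented for illustration and do not reproduce the simulation settings of the study.

```python
# Sketch of a population-averaged (GEE) logistic analysis of a cluster
# randomized trial with a binary outcome. Data and column names are assumed.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(1)
n_clusters, m = 20, 30                                   # clusters per trial, subjects per cluster
df = pd.DataFrame({
    "cluster": np.repeat(np.arange(n_clusters), m),
    "treat":   np.repeat(rng.integers(0, 2, n_clusters), m),   # cluster-level treatment
})
df["y"] = rng.binomial(1, 0.3 + 0.1 * df["treat"])       # toy binary outcome

# GEE logistic model with an exchangeable working correlation to account
# for the intra-cluster correlation.
model = sm.GEE.from_formula("y ~ treat", groups="cluster", data=df,
                            family=sm.families.Binomial(),
                            cov_struct=sm.cov_struct.Exchangeable())
print(model.fit().summary())
```

    The exchangeable working correlation is the usual default when subjects within a cluster are assumed equally correlated; other working structures could be swapped in without changing the rest of the sketch.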

  4. Cluster analysis of radionuclide concentrations in beach sand

    NARCIS (Netherlands)

    de Meijer, R.J.; James, I.; Jennings, P.J.; Keoyers, J.E.

    This paper presents a method in which natural radionuclide concentrations of beach sand minerals are traced along a stretch of coast by cluster analysis. This analysis yields two groups of mineral deposits with different origins. The method deviates from standard methods of following dispersal of

  5. BlockSci: Design and applications of a blockchain analysis platform

    OpenAIRE

    Kalodner, Harry; Goldfeder, Steven; Chator, Alishah; Möser, Malte; Narayanan, Arvind

    2017-01-01

    Analysis of blockchain data is useful for both scientific research and commercial applications. We present BlockSci, an open-source software platform for blockchain analysis. BlockSci is versatile in its support for different blockchains and analysis tasks. It incorporates an in-memory, analytical (rather than transactional) database, making it several hundred times faster than existing tools. We describe BlockSci's design and present four analyses that illustrate its capabilities. This is a ...

  6. Genome-scale cluster analysis of replicated microarrays using shrinkage correlation coefficient.

    Science.gov (United States)

    Yao, Jianchao; Chang, Chunqi; Salmi, Mari L; Hung, Yeung Sam; Loraine, Ann; Roux, Stanley J

    2008-06-18

    Currently, clustering with some form of correlation coefficient as the gene similarity metric has become a popular method for profiling genomic data. The Pearson correlation coefficient and the standard deviation (SD)-weighted correlation coefficient are the two most widely-used correlations as the similarity metrics in clustering microarray data. However, these two correlations are not optimal for analyzing replicated microarray data generated by most laboratories. An effective correlation coefficient is needed to provide statistically sufficient analysis of replicated microarray data. In this study, we describe a novel correlation coefficient, shrinkage correlation coefficient (SCC), that fully exploits the similarity between the replicated microarray experimental samples. The methodology considers both the number of replicates and the variance within each experimental group in clustering expression data, and provides a robust statistical estimation of the error of replicated microarray data. The value of SCC is revealed by its comparison with two other correlation coefficients that are currently the most widely-used (Pearson correlation coefficient and SD-weighted correlation coefficient) using statistical measures on both synthetic expression data as well as real gene expression data from Saccharomyces cerevisiae. Two leading clustering methods, hierarchical and k-means clustering were applied for the comparison. The comparison indicated that using SCC achieves better clustering performance. Applying SCC-based hierarchical clustering to the replicated microarray data obtained from germinating spores of the fern Ceratopteris richardii, we discovered two clusters of genes with shared expression patterns during spore germination. Functional analysis suggested that some of the genetic mechanisms that control germination in such diverse plant lineages as mosses and angiosperms are also conserved among ferns. This study shows that SCC is an alternative to the Pearson
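
    The SCC formula itself is not reproduced in this record, so the sketch below only shows the general pattern it plugs into: a correlation-based distance between gene expression profiles fed to hierarchical clustering. Plain Pearson correlation stands in for SCC, and the expression matrix is synthetic.

```python
# Illustrative sketch only: hierarchical clustering of genes with a
# correlation-based distance (Pearson used as a placeholder for SCC).
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

rng = np.random.default_rng(0)
expr = rng.normal(size=(200, 12))                 # 200 genes x 12 replicated samples (toy data)

corr = np.corrcoef(expr)                          # gene-by-gene similarity
dist = 1.0 - corr                                 # convert similarity to a distance
np.fill_diagonal(dist, 0.0)

Z = linkage(squareform(dist, checks=False), method="average")
gene_clusters = fcluster(Z, t=2, criterion="maxclust")
print(np.bincount(gene_clusters)[1:])             # sizes of the two gene clusters
```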

  7. GLOBULAR CLUSTER ABUNDANCES FROM HIGH-RESOLUTION, INTEGRATED-LIGHT SPECTROSCOPY. II. EXPANDING THE METALLICITY RANGE FOR OLD CLUSTERS AND UPDATED ANALYSIS TECHNIQUES

    Energy Technology Data Exchange (ETDEWEB)

    Colucci, Janet E.; Bernstein, Rebecca A.; McWilliam, Andrew [The Observatories of the Carnegie Institution for Science, 813 Santa Barbara St., Pasadena, CA 91101 (United States)

    2017-01-10

    We present abundances of globular clusters (GCs) in the Milky Way and Fornax from integrated-light (IL) spectra. Our goal is to evaluate the consistency of the IL analysis relative to standard abundance analysis for individual stars in those same clusters. This sample includes an updated analysis of seven clusters from our previous publications and results for five new clusters that expand the metallicity range over which our technique has been tested. We find that the [Fe/H] measured from IL spectra agrees to ∼0.1 dex for GCs with metallicities as high as [Fe/H] = −0.3, but the abundances measured for more metal-rich clusters may be underestimated. In addition we systematically evaluate the accuracy of abundance ratios, [X/Fe], for Na i, Mg i, Al i, Si i, Ca i, Ti i, Ti ii, Sc ii, V i, Cr i, Mn i, Co i, Ni i, Cu i, Y ii, Zr i, Ba ii, La ii, Nd ii, and Eu ii. The elements for which the IL analysis gives results that are most similar to analysis of individual stellar spectra are Fe i, Ca i, Si i, Ni i, and Ba ii. The elements that show the greatest differences include Mg i and Zr i. Some elements show good agreement only over a limited range in metallicity. More stellar abundance data in these clusters would enable more complete evaluation of the IL results for other important elements.

  8. A Novel Divisive Hierarchical Clustering Algorithm for Geospatial Analysis

    Directory of Open Access Journals (Sweden)

    Shaoning Li

    2017-01-01

    Full Text Available In the fields of geographic information systems (GIS) and remote sensing (RS), the clustering algorithm has been widely used for image segmentation, pattern recognition, and cartographic generalization. Although clustering analysis plays a key role in geospatial modelling, traditional clustering methods are limited in terms of computational complexity, resistance to noise and robustness. Furthermore, traditional methods are more focused on the adjacent spatial context, which makes it hard for the clustering methods to be applied to multi-density discrete objects. In this paper, a new method, cell-dividing hierarchical clustering (CDHC), is proposed based on convex hull retraction. The main steps are as follows. First, a convex hull structure is constructed to describe the global spatial context of geospatial objects. Then, the retracting structure of each borderline is established in sequence by setting the initial parameter. The objects are split into two clusters (i.e., “sub-clusters”) if the retracting structure intersects with the borderlines. Finally, clusters are repeatedly split and the initial parameter is updated until the termination condition is satisfied. The experimental results show that CDHC separates the multi-density objects from noise sufficiently and also reduces complexity compared to the traditional agglomerative hierarchical clustering algorithm.
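
    Only the first CDHC step, building the convex hull that describes the global spatial context of the objects, is easy to show in a few lines; the retraction and splitting stages of the published algorithm are not reproduced here, and the point set is synthetic.

```python
# Building-block sketch only: the convex hull of a 2-D point set, whose vertices
# and edges play the role of the "borderlines" in the CDHC description above.
import numpy as np
from scipy.spatial import ConvexHull

rng = np.random.default_rng(0)
points = rng.uniform(0, 100, size=(500, 2))       # toy 2-D geospatial objects

hull = ConvexHull(points)
border_idx = hull.vertices                        # indices of the borderline points
borderlines = points[hull.simplices]              # each simplex is one hull edge (2 points in 2-D)
print(f"{len(border_idx)} border points, {len(borderlines)} borderline segments")
```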

  9. Babelomics: an integrative platform for the analysis of transcriptomics, proteomics and genomic data with advanced functional profiling

    Science.gov (United States)

    Medina, Ignacio; Carbonell, José; Pulido, Luis; Madeira, Sara C.; Goetz, Stefan; Conesa, Ana; Tárraga, Joaquín; Pascual-Montano, Alberto; Nogales-Cadenas, Ruben; Santoyo, Javier; García, Francisco; Marbà, Martina; Montaner, David; Dopazo, Joaquín

    2010-01-01

    Babelomics is a response to the growing necessity of integrating and analyzing different types of genomic data in an environment that allows an easy functional interpretation of the results. Babelomics includes a complete suite of methods for the analysis of gene expression data that include normalization (covering most commercial platforms), pre-processing, differential gene expression (case-controls, multiclass, survival or continuous values), predictors, clustering; large-scale genotyping assays (case controls and TDTs, and allows population stratification analysis and correction). All these genomic data analysis facilities are integrated and connected to multiple options for the functional interpretation of the experiments. Different methods of functional enrichment or gene set enrichment can be used to understand the functional basis of the experiment analyzed. Many sources of biological information, which include functional (GO, KEGG, Biocarta, Reactome, etc.), regulatory (Transfac, Jaspar, ORegAnno, miRNAs, etc.), text-mining or protein–protein interaction modules can be used for this purpose. Finally a tool for the de novo functional annotation of sequences has been included in the system. This provides support for the functional analysis of non-model species. Mirrors of Babelomics or command line execution of their individual components are now possible. Babelomics is available at http://www.babelomics.org. PMID:20478823

  10. Clustering of users of digital libraries through log file analysis

    Directory of Open Access Journals (Sweden)

    Juan Antonio Martínez-Comeche

    2017-09-01

    Full Text Available This study analyzes how users perform information retrieval tasks when introducing queries to the Hispanic Digital Library. Clusters of users are differentiated based on their distinct information behavior. The study used the log files collected by the server over a year and different possible clustering algorithms are compared. The k-means algorithm is found to be a suitable clustering method for the analysis of large log files from digital libraries. In the case of the Hispanic Digital Library the results show three clusters of users and the characteristic information behavior of each group is described.

  11. Enforcing Resource Sharing Agreements Among Distributed Server Clusters

    National Research Council Canada - National Science Library

    Zhao, Tao; Karamcheti, Vijay

    2001-01-01

    Future scalable, high throughput, and high performance applications are likely to execute on platforms constructed by clustering multiple autonomous distributed servers, with resource access governed...

  12. Feasibility study of BES data processing and physics analysis on a PC/Linux platform

    International Nuclear Information System (INIS)

    Rong Gang; He Kanglin; Zhao Jiawei; Heng Yuekun; Zhang Chun

    1999-01-01

    The authors report a feasibility study of off-line BES data processing (data reconstruction and detector simulation) on a PC/Linux platform and an application of the PC/Linux system in D/Ds physics analysis. The authors compared the results obtained from the PC/Linux with those from an HP workstation. The comparison shows that the PC/Linux platform can perform BES offline data analysis as well as the UNIX workstation does, while being more powerful and economical.

  13. Feasibility Study of Parallel Finite Element Analysis on Cluster-of-Clusters

    Science.gov (United States)

    Muraoka, Masae; Okuda, Hiroshi

    With the rapid growth of WAN infrastructure and the development of Grid middleware, it has become a realistic and attractive methodology to connect cluster machines over a wide-area network for the execution of computation-demanding applications. Many existing parallel finite element (FE) applications have been, however, designed and developed with a single computing resource in mind, since such applications require frequent synchronization and communication among processes. There have been few FE applications that can exploit the distributed environment so far. In this study, we explore the feasibility of FE applications on the cluster-of-clusters. First, we classify FE applications into two types, tightly coupled applications (TCA) and loosely coupled applications (LCA), based on their communication pattern. A prototype of each application is implemented on the cluster-of-clusters. We perform numerical experiments executing TCA and LCA on both the cluster-of-clusters and a single cluster. Through these experiments, by comparing the performances and communication costs in each case, we evaluate the feasibility of FEA on the cluster-of-clusters.

  14. Evolutionary space platform concept study. Volume 2, part B: Manned space platform concepts

    Science.gov (United States)

    1982-01-01

    Logical, cost-effective steps in the evolution of manned space platforms are investigated and assessed. Tasks included the analysis of requirements for a manned space platform, identifying alternative concepts, performing system analysis and definition of the concepts, comparing the concepts and performing programmatic analysis for a reference concept.

  15. Full text clustering and relationship network analysis of biomedical publications.

    Directory of Open Access Journals (Sweden)

    Renchu Guan

    Full Text Available Rapid developments in the biomedical sciences have increased the demand for automatic clustering of biomedical publications. In contrast to current approaches to text clustering, which focus exclusively on the contents of abstracts, a novel method is proposed for clustering and analysis of complete biomedical article texts. To reduce dimensionality, Cosine Coefficient is used on a sub-space of only two vectors, instead of computing the Euclidean distance within the space of all vectors. Then a strategy and algorithm is introduced for Semi-supervised Affinity Propagation (SSAP to improve analysis efficiency, using biomedical journal names as an evaluation background. Experimental results show that by avoiding high-dimensional sparse matrix computations, SSAP outperforms conventional k-means methods and improves upon the standard Affinity Propagation algorithm. In constructing a directed relationship network and distribution matrix for the clustering results, it can be noted that overlaps in scope and interests among BioMed publications can be easily identified, providing a valuable analytical tool for editors, authors and readers.

  16. Full text clustering and relationship network analysis of biomedical publications.

    Science.gov (United States)

    Guan, Renchu; Yang, Chen; Marchese, Maurizio; Liang, Yanchun; Shi, Xiaohu

    2014-01-01

    Rapid developments in the biomedical sciences have increased the demand for automatic clustering of biomedical publications. In contrast to current approaches to text clustering, which focus exclusively on the contents of abstracts, a novel method is proposed for clustering and analysis of complete biomedical article texts. To reduce dimensionality, Cosine Coefficient is used on a sub-space of only two vectors, instead of computing the Euclidean distance within the space of all vectors. Then a strategy and algorithm is introduced for Semi-supervised Affinity Propagation (SSAP) to improve analysis efficiency, using biomedical journal names as an evaluation background. Experimental results show that by avoiding high-dimensional sparse matrix computations, SSAP outperforms conventional k-means methods and improves upon the standard Affinity Propagation algorithm. In constructing a directed relationship network and distribution matrix for the clustering results, it can be noted that overlaps in scope and interests among BioMed publications can be easily identified, providing a valuable analytical tool for editors, authors and readers.
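
    A rough scikit-learn analogue of this pipeline, with standard Affinity Propagation standing in for the semi-supervised SSAP variant and a handful of placeholder strings instead of full article texts: cosine similarities computed from TF-IDF vectors are passed to the clusterer as a precomputed affinity matrix.

```python
# Stand-in sketch: cosine similarities between documents fed to standard
# Affinity Propagation. The corpus is a tiny placeholder, so the grouping is
# only illustrative.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.cluster import AffinityPropagation

docs = [
    "gene expression profiling of tumour samples",
    "microarray analysis of gene expression in yeast",
    "expression clustering reveals co-regulated genes",
    "protein structure prediction with deep networks",
    "crystal structure of a membrane protein complex",
    "structural analysis of protein folding pathways",
]

tfidf = TfidfVectorizer(stop_words="english").fit_transform(docs)
S = cosine_similarity(tfidf)                     # document-document similarity matrix

ap = AffinityPropagation(affinity="precomputed", random_state=0)
labels = ap.fit_predict(S)
print(labels)
```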

  17. Steady state subchannel analysis of AHWR fuel cluster

    International Nuclear Information System (INIS)

    Dasgupta, A.; Chandraker, D.K.; Vijayan, P.K.; Saha, D.

    2006-09-01

    Subchannel analysis is a technique used to predict the thermal hydraulic behavior of reactor fuel assemblies. The rod cluster is subdivided into a number of parallel interacting flow subchannels. The conservation equations are solved for each of these subchannels, taking into account subchannel interactions. Subchannel analysis of AHWR D-5 fuel cluster has been carried out to determine the variations in thermal hydraulic conditions of coolant and fuel temperatures along the length of the fuel bundle. The hottest regions within the AHWR fuel bundle have been identified. The effect of creep on the fuel performance has also been studied. MCHFR has been calculated using Jansen-Levy correlation. The calculations have been backed by sensitivity analysis for parameters whose values are not known accurately. The sensitivity analysis showed the calculations to have a very low sensitivity to these parameters. Apart from the analysis, the report also includes a brief introduction of a few subchannel codes. A brief description of the equations and solution methodology used in COBRA-IIIC and COBRA-IV-I is also given. (author)

  18. Mobility in Europe: Recent Trends from a Cluster Analysis

    Directory of Open Access Journals (Sweden)

    Ioana Manafi

    2017-08-01

    Full Text Available During the past decade, Europe was confronted with major changes and events offering large opportunities for mobility. The EU enlargement process, the EU policies regarding youth, the economic crisis affecting national economies on different levels, political instabilities in some European countries, high rates of unemployment or the increasing number of refugees are only a few of the factors influencing net migration in Europe. Based on a set of socio-economic indicators for EU/EFTA countries and cluster analysis, the paper provides an overview of regional differences across European countries, related to migration magnitude in the identified clusters. The obtained clusters are in accordance with previous studies in migration, and appear stable during the period of 2005-2013, with only some exceptions. The analysis revealed three country clusters: EU/EFTA center-receiving countries, EU/EFTA periphery-sending countries and EU/EFTA outlier countries, the names suggesting not only the geographical position within Europe, but also the trends in net migration flows during these years. Overall, the results provide evidence for the persistence of a movement from periphery to center countries, which is correlated with recent flows of mobility in Europe.

  19. Cluster analysis for portfolio optimization

    OpenAIRE

    Vincenzo Tola; Fabrizio Lillo; Mauro Gallegati; Rosario N. Mantegna

    2005-01-01

    We consider the problem of the statistical uncertainty of the correlation matrix in the optimization of a financial portfolio. We show that the use of clustering algorithms can improve the reliability of the portfolio in terms of the ratio between predicted and realized risk. Bootstrap analysis indicates that this improvement is obtained in a wide range of the parameters N (number of assets) and T (investment horizon). The predicted and realized risk level and the relative portfolio compositi...

  20. Hierarchical cluster analysis of progression patterns in open-angle glaucoma patients with medical treatment.

    Science.gov (United States)

    Bae, Hyoung Won; Rho, Seungsoo; Lee, Hye Sun; Lee, Naeun; Hong, Samin; Seong, Gong Je; Sung, Kyung Rim; Kim, Chan Yun

    2014-04-29

    To classify medically treated open-angle glaucoma (OAG) by the pattern of progression using hierarchical cluster analysis, and to determine OAG progression characteristics by comparing clusters. Ninety-five eyes of 95 OAG patients who received medical treatment, and who had undergone visual field (VF) testing at least once per year for 5 or more years. OAG was classified into subgroups using hierarchical cluster analysis based on the following five variables: baseline mean deviation (MD), baseline visual field index (VFI), MD slope, VFI slope, and Glaucoma Progression Analysis (GPA) printout. After that, other parameters were compared between clusters. Two clusters were made after a hierarchical cluster analysis. Cluster 1 showed -4.06 ± 2.43 dB baseline MD, 92.58% ± 6.27% baseline VFI, -0.28 ± 0.38 dB per year MD slope, -0.52% ± 0.81% per year VFI slope, and all "no progression" cases in GPA printout, whereas cluster 2 showed -8.68 ± 3.81 baseline MD, 77.54 ± 12.98 baseline VFI, -0.72 ± 0.55 MD slope, -2.22 ± 1.89 VFI slope, and seven "possible" and four "likely" progression cases in GPA printout. There were no significant differences in age, sex, mean IOP, central corneal thickness, and axial length between clusters. However, cluster 2 included more high-tension glaucoma patients and used a greater number of antiglaucoma eye drops significantly compared with cluster 1. Hierarchical cluster analysis of progression patterns divided OAG into slow and fast progression groups, evidenced by assessing the parameters of glaucomatous progression in VF testing. In the fast progression group, the prevalence of high-tension glaucoma was greater and the number of antiglaucoma medications administered was increased versus the slow progression group. Copyright 2014 The Association for Research in Vision and Ophthalmology, Inc.

  1. Cluster analysis of spontaneous preterm birth phenotypes identifies potential associations among preterm birth mechanisms.

    Science.gov (United States)

    Esplin, M Sean; Manuck, Tracy A; Varner, Michael W; Christensen, Bryce; Biggio, Joseph; Bukowski, Radek; Parry, Samuel; Zhang, Heping; Huang, Hao; Andrews, William; Saade, George; Sadovsky, Yoel; Reddy, Uma M; Ilekis, John

    2015-09-01

    We sought to use an innovative tool that is based on common biologic pathways to identify specific phenotypes among women with spontaneous preterm birth (SPTB) to enhance investigators' ability to identify and to highlight common mechanisms and underlying genetic factors that are responsible for SPTB. We performed a secondary analysis of a prospective case-control multicenter study of SPTB. All cases delivered a preterm singleton at SPTB ≤34.0 weeks' gestation. Each woman was assessed for the presence of underlying SPTB causes. A hierarchic cluster analysis was used to identify groups of women with homogeneous phenotypic profiles. One of the phenotypic clusters was selected for candidate gene association analysis with the use of VEGAS software. One thousand twenty-eight women with SPTB were assigned phenotypes. Hierarchic clustering of the phenotypes revealed 5 major clusters. Cluster 1 (n = 445) was characterized by maternal stress; cluster 2 (n = 294) was characterized by premature membrane rupture; cluster 3 (n = 120) was characterized by familial factors, and cluster 4 (n = 63) was characterized by maternal comorbidities. Cluster 5 (n = 106) was multifactorial and characterized by infection (INF), decidual hemorrhage (DH), and placental dysfunction (PD). These 3 phenotypes were correlated highly with one another by χ2 analysis, and the candidate gene association analysis identified a gene significantly associated with cluster 3 of SPTB. We identified 5 major clusters of SPTB based on a phenotype tool and hierarchic clustering. There was significant correlation between several of the phenotypes. The INS gene was associated with familial factors that were underlying SPTB. Copyright © 2015 Elsevier Inc. All rights reserved.

  2. Analytics Platform for ATLAS Computing Services

    CERN Document Server

    Vukotic, Ilija; The ATLAS collaboration; Bryant, Lincoln

    2016-01-01

    Big Data technologies have proven to be very useful for storage, processing and visualization of derived metrics associated with ATLAS distributed computing (ADC) services. Log file data and database records, and metadata from a diversity of systems have been aggregated and indexed to create an analytics platform for ATLAS ADC operations analysis. Dashboards, wide area data access cost metrics, user analysis patterns, and resource utilization efficiency charts are produced flexibly through queries against a powerful analytics cluster. Here we explore whether these techniques and analytics ecosystem can be applied to add new modes of open, quick, and pervasive access to ATLAS event data so as to simplify access and broaden the reach of ATLAS public data to new communities of users. An ability to efficiently store, filter, search and deliver ATLAS data at the event and/or sub-event level in a widely supported format would enable or significantly simplify usage of machine learning tools like Spark, Jupyter, R, S...

  3. A critical cluster analysis of 44 indicators of author-level performance

    DEFF Research Database (Denmark)

    Wildgaard, Lorna Elizabeth

    2016-01-01

    This paper explores a 7-stage cluster methodology as a process to identify appropriate indicators for evaluation of individual researchers at a disciplinary and seniority level. Publication and citation data for 741 researchers from 4 disciplines was collected in Web of Science. Forty-four indicators of individual researcher performance were computed using the data. The clustering solution was supported by continued reference to the researcher's curriculum vitae, an effect analysis and a risk analysis. Disciplinary appropriate indicators were identified and used to divide the researchers ... of statistics in research evaluation. The strength of the 7-stage cluster methodology is that it makes clear that in the evaluation of individual researchers, statistics cannot stand alone. The methodology is reliant on contextual information to verify the bibliometric values and cluster solution...

  4. Tweets clustering using latent semantic analysis

    Science.gov (United States)

    Rasidi, Norsuhaili Mahamed; Bakar, Sakhinah Abu; Razak, Fatimah Abdul

    2017-04-01

    Social media are becoming overloaded with information due to the increasing number of information feeds. Unlike other social media, Twitter users are allowed to broadcast a short message called as `tweet". In this study, we extract tweets related to MH370 for certain of time. In this paper, we present overview of our approach for tweets clustering to analyze the users' responses toward tragedy of MH370. The tweets were clustered based on the frequency of terms obtained from the classification process. The method we used for the text classification is Latent Semantic Analysis. As a result, there are two types of tweets that response to MH370 tragedy which is emotional and non-emotional. We show some of our initial results to demonstrate the effectiveness of our approach.
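
    A compact sketch of the LSA-plus-clustering idea on a few toy strings (the MH370 tweet collection itself is not reproduced here): TF-IDF vectors are reduced with truncated SVD, a standard way of computing latent semantic analysis, and the reduced vectors are grouped with k-means.

```python
# Sketch of LSA-based grouping of short texts into two groups (toy tweets).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.preprocessing import Normalizer
from sklearn.pipeline import make_pipeline
from sklearn.cluster import KMeans

tweets = ["praying for the families", "so sad, no news yet",
          "new search area announced", "radar data released today"]

X = TfidfVectorizer().fit_transform(tweets)                 # term frequencies
lsa = make_pipeline(TruncatedSVD(n_components=2, random_state=0),
                    Normalizer(copy=False))                 # latent semantic space
X_lsa = lsa.fit_transform(X)

labels = KMeans(n_clusters=2, random_state=0, n_init=10).fit_predict(X_lsa)
print(labels)   # e.g. an emotional and a non-emotional group
```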

  5. Symptom Cluster Research With Biomarkers and Genetics Using Latent Class Analysis.

    Science.gov (United States)

    Conley, Samantha

    2017-12-01

    The purpose of this article is to provide an overview of latent class analysis (LCA) and examples from symptom cluster research that includes biomarkers and genetics. A review of LCA with genetics and biomarkers was conducted using Medline, Embase, PubMed, and Google Scholar. LCA is a robust latent variable model used to cluster categorical data and allows for the determination of empirically determined symptom clusters. Researchers should consider using LCA to link empirically determined symptom clusters to biomarkers and genetics to better understand the underlying etiology of symptom clusters. The full potential of LCA in symptom cluster research has not yet been realized because it has been used in limited populations, and researchers have explored limited biologic pathways.

  6. The composite sequential clustering technique for analysis of multispectral scanner data

    Science.gov (United States)

    Su, M. Y.

    1972-01-01

    The clustering technique consists of two parts: (1) a sequential statistical clustering which is essentially a sequential variance analysis, and (2) a generalized K-means clustering. In this composite clustering technique, the output of (1) is a set of initial clusters which are input to (2) for further improvement by an iterative scheme. This unsupervised composite technique was employed for automatic classification of two sets of remote multispectral earth resource observations. The classification accuracy by the unsupervised technique is found to be comparable to that by traditional supervised maximum likelihood classification techniques. The mathematical algorithms for the composite sequential clustering program and a detailed computer program description with job setup are given.

  7. Cluster-based analysis of multi-model climate ensembles

    Science.gov (United States)

    Hyde, Richard; Hossaini, Ryan; Leeson, Amber A.

    2018-06-01

    Clustering - the automated grouping of similar data - can provide powerful and unique insight into large and complex data sets, in a fast and computationally efficient manner. While clustering has been used in a variety of fields (from medical image processing to economics), its application within atmospheric science has been fairly limited to date, and the potential benefits of the application of advanced clustering techniques to climate data (both model output and observations) has yet to be fully realised. In this paper, we explore the specific application of clustering to a multi-model climate ensemble. We hypothesise that clustering techniques can provide (a) a flexible, data-driven method of testing model-observation agreement and (b) a mechanism with which to identify model development priorities. We focus our analysis on chemistry-climate model (CCM) output of tropospheric ozone - an important greenhouse gas - from the recent Atmospheric Chemistry and Climate Model Intercomparison Project (ACCMIP). Tropospheric column ozone from the ACCMIP ensemble was clustered using the Data Density based Clustering (DDC) algorithm. We find that a multi-model mean (MMM) calculated using members of the most-populous cluster identified at each location offers a reduction of up to ˜ 20 % in the global absolute mean bias between the MMM and an observed satellite-based tropospheric ozone climatology, with respect to a simple, all-model MMM. On a spatial basis, the bias is reduced at ˜ 62 % of all locations, with the largest bias reductions occurring in the Northern Hemisphere - where ozone concentrations are relatively large. However, the bias is unchanged at 9 % of all locations and increases at 29 %, particularly in the Southern Hemisphere. The latter demonstrates that although cluster-based subsampling acts to remove outlier model data, such data may in fact be closer to observed values in some locations. We further demonstrate that clustering can provide a viable and
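
    The subsampling idea, averaging only the most populous cluster of ensemble members at a given location, can be sketched as follows; k-means stands in for the Data Density based Clustering (DDC) algorithm used in the paper, and the ozone values are synthetic rather than ACCMIP output.

```python
# Hedged sketch: cluster ensemble members at one grid point and average only the
# most populous cluster, comparing against the simple all-model mean.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
members = rng.normal(30.0, 3.0, size=(15, 1))        # 15 models' column ozone at one location (DU)
members[12:] += 10.0                                  # a few outlier models

km = KMeans(n_clusters=2, random_state=0, n_init=10).fit(members)
main = np.argmax(np.bincount(km.labels_))             # most populous cluster

mmm_all = members.mean()                              # all-model multi-model mean
mmm_cluster = members[km.labels_ == main].mean()      # cluster-based multi-model mean
print(f"all-model MMM = {mmm_all:.1f}, cluster MMM = {mmm_cluster:.1f}")
```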

  8. Modeling Transfer of Knowledge in an Online Platform of a Cluster

    OpenAIRE

    Schmidt, Danilo Marcello; Böttcher, Lena; Wilberg, Julian; Kammerl, Daniel; Lindemann, Udo

    2016-01-01

    Dealing with knowledge as a relevant resource and factor for production has become increasingly important in the course of globalization. This work focuses on questions about transferring knowledge when many companies work together in a cluster of enterprises. We developed a model of this transfer based on the theory of clusters from the New Institutional Economics’ point of view and based on existing theories about knowledge and knowledge transfer. This theoretical construct is evaluated and...

  9. Porting of serial molecular dynamics code on MIMD platforms

    International Nuclear Information System (INIS)

    Celino, M.

    1995-05-01

    A molecular Dynamics (MD) code, utilized for the study of atomistic models of metallic systems has been parallelized for MIMD (Multiple Instructions Multiple Data) parallel platforms by means of the Parallel Virtual Machine (PVM) message passing library. Since the parallelization implies modifications of the sequential algorithms, these are described from the point of view of the Statistical Mechanics theory. Furthermore, techniques and parallelization strategies utilized and the MD parallel code are described in detail. Benchmarks on several MIMD platforms (IBM SP1 and SP2, Cray T3D, Cluster of workstations) allow performances evaluation of the code versus the different characteristics of the parallel platforms

  10. Micromagnetics on high-performance workstation and mobile computational platforms

    Science.gov (United States)

    Fu, S.; Chang, R.; Couture, S.; Menarini, M.; Escobar, M. A.; Kuteifan, M.; Lubarda, M.; Gabay, D.; Lomakin, V.

    2015-05-01

    The feasibility of using high-performance desktop and embedded mobile computational platforms is presented, including multi-core Intel central processing unit, Nvidia desktop graphics processing units, and Nvidia Jetson TK1 Platform. FastMag finite element method-based micromagnetic simulator is used as a testbed, showing high efficiency on all the platforms. Optimization aspects of improving the performance of the mobile systems are discussed. The high performance, low cost, low power consumption, and rapid performance increase of the embedded mobile systems make them a promising candidate for micromagnetic simulations. Such architectures can be used as standalone systems or can be built as low-power computing clusters.

  11. Clusters of galaxies as tools in observational cosmology : results from x-ray analysis

    International Nuclear Information System (INIS)

    Weratschnig, J.M.

    2009-01-01

    Clusters of galaxies are the largest gravitationally bound structures in the universe. They can be used as ideal tools to study large scale structure formation (e.g. when studying merger clusters) and provide highly interesting environments to analyse several characteristic interaction processes (like ram pressure stripping of galaxies, magnetic fields). In this dissertation thesis, we have studied several clusters of galaxies using X-ray observations. To obtain scientific results, we have applied different data reduction and analysis methods. With a combination of morphological and spectral analysis, the merger cluster Abell 514 was studied in much detail. It has a highly interesting morphology and shows signs for an ongoing merger as well as a shock. using a new method to detect substructure, we have analysed several clusters to determine whether any substructure is present in the X-ray image. This hints towards a real structure in the distribution of the intra-cluster medium (ICM) and is evidence for ongoing mergers. The results from this analysis are extensively used with the cluster of galaxies Abell S1136. Here, we study the ICM distribution and compare its structure with the spatial distribution of star forming galaxies. Cluster magnetic fields are another important topic of my thesis. They can be studied in Radio observations, which can be put into relation with results from X-ray observations. using observational data from several clusters, we could support the theory that cluster magnetic fields are frozen into the ICM. (author)

  12. Characterizing Heterogeneity within Head and Neck Lesions Using Cluster Analysis of Multi-Parametric MRI Data.

    Directory of Open Access Journals (Sweden)

    Marco Borri

    Full Text Available To describe a methodology, based on cluster analysis, to partition multi-parametric functional imaging data into groups (or clusters) of similar functional characteristics, with the aim of characterizing functional heterogeneity within head and neck tumour volumes. To evaluate the performance of the proposed approach on a set of longitudinal MRI data, analysing the evolution of the obtained sub-sets with treatment. The cluster analysis workflow was applied to a combination of dynamic contrast-enhanced and diffusion-weighted imaging MRI data from a cohort of squamous cell carcinoma of the head and neck patients. Cumulative distributions of voxels, containing pre and post-treatment data and including both primary tumours and lymph nodes, were partitioned into k clusters (k = 2, 3 or 4). Principal component analysis and cluster validation were employed to investigate data composition and to independently determine the optimal number of clusters. The evolution of the resulting sub-regions with induction chemotherapy treatment was assessed relative to the number of clusters. The clustering algorithm was able to separate clusters which significantly reduced in voxel number following induction chemotherapy from clusters with a non-significant reduction. Partitioning with the optimal number of clusters (k = 4), determined with cluster validation, produced the best separation between reducing and non-reducing clusters. The proposed methodology was able to identify tumour sub-regions with distinct functional properties, independently separating clusters which were affected differently by treatment. This work demonstrates that unsupervised cluster analysis, with no prior knowledge of the data, can be employed to provide a multi-parametric characterization of functional heterogeneity within tumour volumes.
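
    A small sketch of the analysis pattern on synthetic voxel data: principal component analysis of multi-parametric voxel values, k-means partitions for k = 2, 3 and 4, and a silhouette score used here as one possible cluster-validation criterion (the record does not state which criterion the authors used).

```python
# Sketch only: PCA on multi-parametric voxel values, then k-means for several k
# with the silhouette score as an illustrative validation measure.
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
voxels = rng.normal(size=(2000, 6))                  # voxels x DCE/DWI-derived parameters (toy)

X = PCA(n_components=3).fit_transform(StandardScaler().fit_transform(voxels))
for k in (2, 3, 4):
    labels = KMeans(n_clusters=k, random_state=0, n_init=10).fit_predict(X)
    print(k, round(silhouette_score(X, labels), 3))  # higher score = better separation
```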

  13. A Multi-Science Data Analysis Platform and the GeneROOT Use Case

    CERN Multimedia

    CERN. Geneva; Rademakers, Fons

    2017-01-01

    This talk will cover two areas of current research in the context of knowledge sharing between CERN openlab and the life science communities. The first area covers the development and prototyping of a multi-science data analysis platform build up around CERN developed technologies like, Zenodo, REANA and CVMFS. When finished this platform will support a complete data analysis life-cycle from data discovery, to data access, to data processing to end-user data analysis. The second area covers a specific use case, where HEP specific software like ROOT is used to store and process genomics data sequences. There are a number of handcrafted genomics data formats being used, like FASTQ, SAM, BAM, CRAM, etc. They range from pure ASCII to compressed binary formats. We will compare the features of these formats with the generic capabilities of ROOT’s TTree containers. Also we will show performance numbers of typical analysis scenarios.

  14. Latent cluster analysis of ALS phenotypes identifies prognostically differing groups.

    Directory of Open Access Journals (Sweden)

    Jeban Ganesalingam

    2009-09-01

    Full Text Available Amyotrophic lateral sclerosis (ALS) is a degenerative disease predominantly affecting motor neurons and manifesting as several different phenotypes. Whether these phenotypes correspond to different underlying disease processes is unknown. We used latent cluster analysis to identify groupings of clinical variables in an objective and unbiased way to improve phenotyping for clinical and research purposes. Latent class cluster analysis was applied to a large database consisting of 1467 records of people with ALS, using discrete variables which can be readily determined at the first clinic appointment. The model was tested for clinical relevance by survival analysis of the phenotypic groupings using the Kaplan-Meier method. The best model generated five distinct phenotypic classes that strongly predicted survival (p<0.0001). Eight variables were used for the latent class analysis, but a good estimate of the classification could be obtained using just two variables: site of first symptoms (bulbar or limb) and time from symptom onset to diagnosis (p<0.00001). The five phenotypic classes identified using latent cluster analysis can predict prognosis. They could be used to stratify patients recruited into clinical trials and to generate more homogeneous disease groups for genetic, proteomic and risk factor research.
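
    The validation step, survival analysis of the phenotypic groupings with the Kaplan-Meier method, might look roughly like this; the class labels, follow-up times and column names below are synthetic placeholders, not the ALS database.

```python
# Sketch of Kaplan-Meier survival curves stratified by cluster / latent class,
# with a log-rank test across groups. Data are synthetic.
import numpy as np
import pandas as pd
from lifelines import KaplanMeierFitter
from lifelines.statistics import multivariate_logrank_test

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "cluster": rng.integers(1, 6, size=300),         # 5 phenotypic classes
    "months":  rng.exponential(24, size=300),        # follow-up time
    "event":   rng.integers(0, 2, size=300),         # 1 = death observed
})

kmf = KaplanMeierFitter()
for c, grp in df.groupby("cluster"):
    kmf.fit(grp["months"], grp["event"], label=f"class {c}")
    print(f"class {c}: median survival {kmf.median_survival_time_:.1f} months")

res = multivariate_logrank_test(df["months"], df["cluster"], df["event"])
print("log-rank p =", res.p_value)
```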

  15. 3D FEM Analysis of a Pile-Supported Riverine Platform under Environmental Loads Incorporating Soil-Pile Interaction

    Directory of Open Access Journals (Sweden)

    Denise-Penelope N. Kontoni

    2018-01-01

    Full Text Available An existing riverine platform in Egypt, together with its pile group foundation, is analyzed under environmental loads using 3D FEM structural analysis software incorporating soil-pile interaction. The interaction between the transfer plate and the piles supporting the platform is investigated. Two connection conditions were studied assuming fixed or hinged connection between the piles and the reinforced concrete platform for the purpose of comparison of the structural behavior. The analysis showed that the fixed or hinged connection condition between the piles and the platform altered the values and distribution of displacements, normal force, bending moments, and shear forces along the length of each pile. The distribution of piles in the pile group affects the stress distribution on both the soil and platform. The piles were found to suffer from displacement failure rather than force failure. Moreover, the resulting bending stresses on the reinforced concrete plate in the case of a fixed connection between the piles and the platform were almost doubled and much higher than the allowable reinforced concrete stress and even exceeded the ultimate design strength and thus the environmental loads acting on a pile-supported riverine offshore platform may cause collapse if they are not properly considered in the structural analysis and design.

  16. Global classification of human facial healthy skin using PLS discriminant analysis and clustering analysis.

    Science.gov (United States)

    Guinot, C; Latreille, J; Tenenhaus, M; Malvy, D J

    2001-04-01

    Today's classifications of healthy skin are predominantly based on a very limited number of skin characteristics, such as skin oiliness or susceptibility to sun exposure. The aim of the present analysis was to set up a global classification of healthy facial skin, using mathematical models. This classification is based on clinical, biophysical skin characteristics and self-reported information related to the skin, as well as the results of a theoretical skin classification assessed separately for the frontal and the malar zones of the face. In order to maximize the predictive power of the models with a minimum of variables, the Partial Least Square (PLS) discriminant analysis method was used. The resulting PLS components were subjected to clustering analyses to identify the plausible number of clusters and to group the individuals according to their proximities. Using this approach, four PLS components could be constructed and six clusters were found relevant. So, from the 36 hypothetical combinations of the theoretical skin types classification, we tended to a strengthened six classes proposal. Our data suggest that the association of the PLS discriminant analysis and the clustering methods leads to a valid and simple way to classify healthy human skin and represents a potentially useful tool for cosmetic and dermatological research.
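
    A sketch of the two-step idea on synthetic data: PLS discriminant analysis (here implemented as PLS regression against one-hot class labels) builds a small number of components, and the component scores are then clustered. The number of components and clusters are illustrative choices, not values reported in the record.

```python
# Sketch only: PLS-DA components from skin descriptors and theoretical skin-type
# labels, followed by k-means clustering of the component scores.
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = StandardScaler().fit_transform(rng.normal(size=(300, 20)))   # skin descriptors (toy)
y = rng.integers(0, 6, size=300)                                  # theoretical skin types (toy)

Y = np.eye(6)[y]                                  # one-hot encode the classes for PLS-DA
pls = PLSRegression(n_components=4).fit(X, Y)
scores = pls.transform(X)                         # the 4 PLS components

clusters = KMeans(n_clusters=6, random_state=0, n_init=10).fit_predict(scores)
print(np.bincount(clusters))                      # cluster sizes
```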

  17. Fatigue analysis of assembled marine floating platform for special purposes under complex water environments

    Science.gov (United States)

    Ma, Guang-ying; Yao, Yun-long

    2018-03-01

    In this paper, the fatigue lives of a new type of assembled marine floating platform for special purposes were studied. Firstly, by using ANSYS AQWA software, the hydrodynamic model of the platform was established. Secondly, the structural stresses under alternating change loads were calculated under complex water environments, such as wind, wave, current and ice. The minimum fatigue lives were obtained under different working conditions. The analysis results showed that the fatigue life of the platform structure can meet the requirements

  18. Application of cluster analysis and unsupervised learning to multivariate tissue characterization

    International Nuclear Information System (INIS)

    Momenan, R.; Insana, M.F.; Wagner, R.F.; Garra, B.S.; Loew, M.H.

    1987-01-01

    This paper describes a procedure for classifying tissue types from unlabeled acoustic measurements (data type unknown) using unsupervised cluster analysis. These techniques are being applied to unsupervised ultrasonic image segmentation and tissue characterization. The performance of a new clustering technique is measured and compared with supervised methods, such as a linear Bayes classifier. In these comparisons two objectives are sought: a) How well does the clustering method group the data?; b) Do the clusters correspond to known tissue classes? The first question is investigated by a measure of cluster similarity and dispersion. The second question involves a comparison with a supervised technique using labeled data

  19. Clustering analysis for muon tomography data elaboration in the Muon Portal project

    Science.gov (United States)

    Bandieramonte, M.; Antonuccio-Delogu, V.; Becciani, U.; Costa, A.; La Rocca, P.; Massimino, P.; Petta, C.; Pistagna, C.; Riggi, F.; Riggi, S.; Sciacca, E.; Vitello, F.

    2015-05-01

    Clustering analysis is one of the multivariate data analysis techniques that allows statistical data units to be gathered into groups, in order to minimize the logical distance within each group and to maximize the one between different groups. In these proceedings, the authors present a novel approach to muon tomography data analysis based on clustering algorithms. As a case study we present the Muon Portal project that aims to build and operate a dedicated particle detector for the inspection of harbor containers to hinder the smuggling of nuclear materials. Clustering techniques, working directly on scattering points, help to detect the presence of suspicious items inside the container, acting, as it will be shown, as a filter for a preliminary analysis of the data.
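
    A generic illustration of grouping 3-D scattering points by density, with DBSCAN as a stand-in clustering algorithm (the record does not name the specific algorithms used in the Muon Portal pipeline) and synthetic points in place of reconstructed muon scattering vertices.

```python
# Illustrative sketch only: density-based grouping of 3-D scattering points to
# flag a localized region of dense scattering inside a container volume.
import numpy as np
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(0)
background = rng.uniform(0, 600, size=(400, 3))                    # diffuse scattering (cm)
hotspot = rng.normal(loc=(120, 80, 300), scale=5, size=(80, 3))    # dense suspicious region
points = np.vstack([background, hotspot])

labels = DBSCAN(eps=15, min_samples=10).fit_predict(points)
n_clusters = len(set(labels)) - (1 if -1 in labels else 0)
print(f"{n_clusters} dense cluster(s) found; {np.sum(labels == -1)} points treated as noise")
```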

  20. Subtypes of autism by cluster analysis based on structural MRI data.

    Science.gov (United States)

    Hrdlicka, Michal; Dudova, Iva; Beranova, Irena; Lisy, Jiri; Belsan, Tomas; Neuwirth, Jiri; Komarek, Vladimir; Faladova, Ludvika; Havlovicova, Marketa; Sedlacek, Zdenek; Blatny, Marek; Urbanek, Tomas

    2005-05-01

    The aim of our study was to subcategorize Autistic Spectrum Disorders (ASD) using a multidisciplinary approach. Sixty four autistic patients (mean age 9.4+/-5.6 years) were entered into a cluster analysis. The clustering analysis was based on MRI data. The clusters obtained did not differ significantly in the overall severity of autistic symptomatology as measured by the total score on the Childhood Autism Rating Scale (CARS). The clusters could be characterized as showing significant differences: Cluster 1: showed the largest sizes of the genu and splenium of the corpus callosum (CC), the lowest pregnancy order and the lowest frequency of facial dysmorphic features. Cluster 2: showed the largest sizes of the amygdala and hippocampus (HPC), the least abnormal visual response on the CARS, the lowest frequency of epilepsy and the least frequent abnormal psychomotor development during the first year of life. Cluster 3: showed the largest sizes of the caput of the nucleus caudatus (NC), the smallest sizes of the HPC and facial dysmorphic features were always present. Cluster 4: showed the smallest sizes of the genu and splenium of the CC, as well as the amygdala, and caput of the NC, the most abnormal visual response on the CARS, the highest frequency of epilepsy, the highest pregnancy order, abnormal psychomotor development during the first year of life was always present and facial dysmorphic features were always present. This multidisciplinary approach seems to be a promising method for subtyping autism.

  1. Symptom Clusters in People Living with HIV Attending Five Palliative Care Facilities in Two Sub-Saharan African Countries: A Hierarchical Cluster Analysis.

    Science.gov (United States)

    Moens, Katrien; Siegert, Richard J; Taylor, Steve; Namisango, Eve; Harding, Richard

    2015-01-01

    Symptom research across conditions has historically focused on single symptoms, and the burden of multiple symptoms and their interactions has been relatively neglected, especially in people living with HIV. Symptom cluster studies are required to set priorities in treatment planning, and to lessen the total symptom burden. This study aimed to identify and compare symptom clusters among people living with HIV attending five palliative care facilities in two sub-Saharan African countries. Data from cross-sectional self-report of seven-day symptom prevalence on the 32-item Memorial Symptom Assessment Scale-Short Form were used. A hierarchical cluster analysis was conducted using Ward's method applying squared Euclidean distance as the similarity measure to determine the clusters. Contingency tables, χ2 tests and ANOVA were used to compare the clusters by patient-specific characteristics and distress scores. Among the sample (N=217) the mean age was 36.5 (SD 9.0), 73.2% were female, and 49.1% were on antiretroviral therapy (ART). The cluster analysis produced five symptom clusters identified as: 1) dermatological; 2) generalised anxiety and elimination; 3) social and image; 4) persistently present; and 5) a gastrointestinal-related symptom cluster. The patients in the first three symptom clusters reported the highest physical and psychological distress scores. Patient characteristics varied significantly across the five clusters by functional status, with the worst functional physical status in cluster one. Further research among people living with HIV with longitudinally collected symptom data to test cluster stability and identify common symptom trajectories is recommended.

  2. Performance Analysis and Scaling Behavior of the Terrestrial Systems Modeling Platform TerrSysMP in Large-Scale Supercomputing Environments

    Science.gov (United States)

    Kollet, S. J.; Goergen, K.; Gasper, F.; Shresta, P.; Sulis, M.; Rihani, J.; Simmer, C.; Vereecken, H.

    2013-12-01

    In studies of the terrestrial hydrologic, energy and biogeochemical cycles, integrated multi-physics simulation platforms take a central role in characterizing non-linear interactions, variances and uncertainties of system states and fluxes in reciprocity with observations. Recently developed integrated simulation platforms attempt to honor the complexity of the terrestrial system across multiple time and space scales from the deeper subsurface including groundwater dynamics into the atmosphere. Technically, this requires the coupling of atmospheric, land surface, and subsurface-surface flow models in supercomputing environments, while ensuring a high-degree of efficiency in the utilization of e.g., standard Linux clusters and massively parallel resources. A systematic performance analysis including profiling and tracing in such an application is crucial in the understanding of the runtime behavior, to identify optimum model settings, and is an efficient way to distinguish potential parallel deficiencies. On sophisticated leadership-class supercomputers, such as the 28-rack 5.9 petaFLOP IBM Blue Gene/Q 'JUQUEEN' of the Jülich Supercomputing Centre (JSC), this is a challenging task, but even more so important, when complex coupled component models are to be analysed. Here we want to present our experience from coupling, application tuning (e.g. 5-times speedup through compiler optimizations), parallel scaling and performance monitoring of the parallel Terrestrial Systems Modeling Platform TerrSysMP. The modeling platform consists of the weather prediction system COSMO of the German Weather Service; the Community Land Model, CLM of NCAR; and the variably saturated surface-subsurface flow code ParFlow. The model system relies on the Multiple Program Multiple Data (MPMD) execution model where the external Ocean-Atmosphere-Sea-Ice-Soil coupler (OASIS3) links the component models. TerrSysMP has been instrumented with the performance analysis tool Scalasca and analyzed

  3. Assessment of Random Assignment in Training and Test Sets using Generalized Cluster Analysis Technique

    Directory of Open Access Journals (Sweden)

    Sorana D. BOLBOACĂ

    2011-06-01

    Full Text Available Aim: The properness of random assignment of compounds in training and validation sets was assessed using the generalized cluster technique. Material and Method: A quantitative Structure-Activity Relationship model using Molecular Descriptors Family on Vertices was evaluated in terms of assignment of carboquinone derivatives in training and test sets during the leave-many-out analysis. Assignment of compounds was investigated using five variables: observed anticancer activity and four structure descriptors. Generalized cluster analysis with the K-means algorithm was applied in order to investigate whether or not the assignment of compounds was proper. The Euclidean distance and maximization of the initial distance using a cross-validation with a v-fold of 10 were applied. Results: All five variables included in the analysis proved to have a statistically significant contribution to the identification of clusters. Three clusters were identified, each of them containing carboquinone derivatives belonging to both the training and the test sets. The observed activity of carboquinone derivatives proved to be normally distributed in every cluster. The presence of training and test sets in all clusters identified using generalized cluster analysis with the K-means algorithm, and the distribution of observed activity within clusters, sustain a proper assignment of compounds in training and test sets. Conclusion: Generalized cluster analysis using the K-means algorithm proved to be a valid method in assessment of random assignment of carboquinone derivatives in training and test sets.
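
    A toy sketch of the underlying check: cluster the compounds on activity plus descriptors, then cross-tabulate cluster membership against the training/test split. The chi-square test on the table is an illustrative choice, and the data are random placeholders rather than the carboquinone set.

```python
# Sketch of checking whether a train/test split is spread evenly across clusters.
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans
from scipy.stats import chi2_contingency

rng = np.random.default_rng(0)
data = rng.normal(size=(37, 5))                  # activity + 4 descriptors (toy values)
split = rng.choice(["train", "test"], size=37, p=[0.7, 0.3])

labels = KMeans(n_clusters=3, random_state=0, n_init=10).fit_predict(
    StandardScaler().fit_transform(data))

table = pd.crosstab(labels, split)               # clusters x (train, test)
chi2, p, dof, _ = chi2_contingency(table)
print(table)
print(f"chi-square p = {p:.3f}")                 # large p: assignment looks random
```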

  4. Cluster analysis in severe emphysema subjects using phenotype and genotype data: an exploratory investigation

    Directory of Open Access Journals (Sweden)

    Martinez Fernando J

    2010-03-01

    Full Text Available Abstract Background Numerous studies have demonstrated associations between genetic markers and COPD, but results have been inconsistent. One reason may be heterogeneity in disease definition. Unsupervised learning approaches may assist in understanding disease heterogeneity. Methods We selected 31 phenotypic variables and 12 SNPs from five candidate genes in 308 subjects in the National Emphysema Treatment Trial (NETT) Genetics Ancillary Study cohort. We used factor analysis to select a subset of phenotypic variables, and then used cluster analysis to identify subtypes of severe emphysema. We examined the phenotypic and genotypic characteristics of each cluster. Results We identified six factors accounting for 75% of the shared variability among our initial phenotypic variables. We selected four phenotypic variables from these factors for cluster analysis: (1) post-bronchodilator FEV1 percent predicted, (2) percent bronchodilator responsiveness, and quantitative CT measurements of (3) apical emphysema and (4) airway wall thickness. K-means cluster analysis revealed four clusters, though separation between clusters was modest: (1) emphysema predominant; (2) bronchodilator responsive, with higher FEV1; (3) discordant, with a lower FEV1 despite less severe emphysema and lower airway wall thickness; and (4) airway predominant. Of the genotypes examined, membership in cluster 1 (emphysema-predominant) was associated with TGFB1 SNP rs1800470. Conclusions Cluster analysis may identify meaningful disease subtypes and/or groups of related phenotypic variables even in a highly selected group of severe emphysema subjects, and may be useful for genetic association studies.

  5. Transcriptional analysis of exopolysaccharides biosynthesis gene clusters in Lactobacillus plantarum.

    Science.gov (United States)

    Vastano, Valeria; Perrone, Filomena; Marasco, Rosangela; Sacco, Margherita; Muscariello, Lidia

    2016-04-01

    Exopolysaccharides (EPS) from lactic acid bacteria contribute to specific rheology and texture of fermented milk products and find applications also in non-dairy foods and in therapeutics. Recently, four clusters of genes (cps) associated with surface polysaccharide production have been identified in Lactobacillus plantarum WCFS1, a probiotic and food-associated lactobacillus. These clusters are involved in cell surface architecture and probably in release and/or exposure of immunomodulating bacterial molecules. Here we show a transcriptional analysis of these clusters. Indeed, RT-PCR experiments revealed that the cps loci are organized in five operons. Moreover, by reverse transcription-qPCR analysis performed on L. plantarum WCFS1 (wild type) and WCFS1-2 (ΔccpA), we demonstrated that expression of three cps clusters is under the control of the global regulator CcpA. These results, together with the identification of putative CcpA target sequences (catabolite responsive element CRE) in the regulatory region of four out of five transcriptional units, strongly suggest for the first time a role of the master regulator CcpA in EPS gene transcription among lactobacilli.

  6. Tri-Laboratory Linux Capacity Cluster 2007 SOW

    Energy Technology Data Exchange (ETDEWEB)

    Seager, M

    2007-03-22

    The Advanced Simulation and Computing (ASC) Program (formerly known as the Accelerated Strategic Computing Initiative, ASCI) has led the world in capability computing for the last ten years. Capability computing is defined as a world-class platform (in the Top10 of the Top500.org list) with scientific simulations running at scale on the platform. Example systems are ASCI Red, Blue-Pacific, Blue-Mountain, White, Q, RedStorm, and Purple. ASC applications have scaled to multiple thousands of CPUs and accomplished a long list of mission milestones on these ASC capability platforms. However, the computing demands of the ASC and Stockpile Stewardship programs also include a vast number of smaller scale runs for day-to-day simulations. Indeed, every 'hero' capability run requires many hundreds to thousands of much smaller runs in preparation and post processing activities. In addition, there are many aspects of the Stockpile Stewardship Program (SSP) that can be directly accomplished with these so-called 'capacity' calculations. The need for capacity is now so great within the program that it is increasingly difficult to allocate the computer resources required by the larger capability runs. To rectify the current 'capacity' computing resource shortfall, the ASC program has allocated a large portion of the overall ASC platforms budget to 'capacity' systems. In addition, within the next five to ten years the Life Extension Programs (LEPs) for major nuclear weapons systems must be accomplished. These LEPs and other SSP programmatic elements will further drive the need for capacity calculations and hence 'capacity' systems as well as future ASC capability calculations on 'capability' systems. To respond to this new workload analysis, the ASC program will be making a large sustained strategic investment in these capacity systems over the next ten years, starting with the United States Government Fiscal Year 2007 (GFY07).

  7. Tri-Laboratory Linux Capacity Cluster 2007 SOW

    International Nuclear Information System (INIS)

    Seager, M.

    2007-01-01

    The Advanced Simulation and Computing (ASC) Program (formerly known as the Accelerated Strategic Computing Initiative, ASCI) has led the world in capability computing for the last ten years. Capability computing is defined as a world-class platform (in the Top10 of the Top500.org list) with scientific simulations running at scale on the platform. Example systems are ASCI Red, Blue-Pacific, Blue-Mountain, White, Q, RedStorm, and Purple. ASC applications have scaled to multiple thousands of CPUs and accomplished a long list of mission milestones on these ASC capability platforms. However, the computing demands of the ASC and Stockpile Stewardship programs also include a vast number of smaller scale runs for day-to-day simulations. Indeed, every 'hero' capability run requires many hundreds to thousands of much smaller runs in preparation and post processing activities. In addition, there are many aspects of the Stockpile Stewardship Program (SSP) that can be directly accomplished with these so-called 'capacity' calculations. The need for capacity is now so great within the program that it is increasingly difficult to allocate the computer resources required by the larger capability runs. To rectify the current 'capacity' computing resource shortfall, the ASC program has allocated a large portion of the overall ASC platforms budget to 'capacity' systems. In addition, within the next five to ten years the Life Extension Programs (LEPs) for major nuclear weapons systems must be accomplished. These LEPs and other SSP programmatic elements will further drive the need for capacity calculations and hence 'capacity' systems as well as future ASC capability calculations on 'capability' systems. To respond to this new workload analysis, the ASC program will be making a large sustained strategic investment in these capacity systems over the next ten years, starting with the United States Government Fiscal Year 2007 (GFY07). However, given the growing need for 'capability' systems as

  8. p-tert-Butylcalix[8]arene: an extremely versatile platform for cluster formation.

    Science.gov (United States)

    Taylor, Stephanie M; Sanz, Sergio; McIntosh, Ruaraidh D; Beavers, Christine M; Teat, Simon J; Brechin, Euan K; Dalgarno, Scott J

    2012-12-07

    p-tert-Butylcalix[4]arene is a bowl-shaped molecule capable of forming a range of polynuclear metal clusters under different experimental conditions. p-tert-Butylcalix[8]arene (TBC[8]) is a significantly more flexible analogue that has previously been shown to form mono- and binuclear lanthanide (Ln) metal complexes. The latter (cluster) motif is commonly observed and involves the calixarene adopting a near double-cone conformation, features of which suggested that it may be exploited as a type of assembly node in the formation of larger polynuclear lanthanide clusters. Variation in the experimental conditions employed for this system provides access to Ln(1), Ln(2), Ln(4), Ln(5), Ln(6), Ln(7) and Ln(8) complexes, with all polymetallic clusters containing the common binuclear lanthanide fragment. Closer inspection of the structures of the polymetallic clusters reveals that all but one (Ln(8)) are in fact based on metal octahedra or the building blocks of octahedra, with the identity and size of the final product dependent upon the basicity of the solution and the deprotonation level of the TBC[8] ligand. This demonstrates both the versatility of the ligand towards incorporation of additional metal centres, and the associated implications for tailoring the magnetic properties of the resulting assemblies in which lanthanide centres may be interchanged. Copyright © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  9. Direct chloroplast sequencing: comparison of sequencing platforms and analysis tools for whole chloroplast barcoding.

    Directory of Open Access Journals (Sweden)

    Marta Brozynska

    Full Text Available Direct sequencing of total plant DNA using next generation sequencing technologies generates a whole chloroplast genome sequence that has the potential to provide a barcode for use in plant and food identification. Advances in DNA sequencing platforms may make this an attractive approach for routine plant identification. The HiSeq (Illumina) and Ion Torrent (Life Technology) sequencing platforms were used to sequence total DNA from rice to identify polymorphisms in the whole chloroplast genome sequence of a wild rice plant relative to cultivated rice (cv. Nipponbare). Consensus chloroplast sequences were produced by mapping sequence reads to the reference rice chloroplast genome or by de novo assembly and mapping of the resulting contigs to the reference sequence. A total of 122 polymorphisms (SNPs and indels) between the wild and cultivated rice chloroplasts were predicted by these different sequencing and analysis methods. Of these, a total of 102 polymorphisms including 90 SNPs were predicted by both platforms. Indels were more variable with different sequencing methods, with almost all discrepancies found in homopolymers. The Ion Torrent platform gave no apparent false SNP but was less reliable for indels. The methods should be suitable for routine barcoding using appropriate combinations of sequencing platform and data analysis.

  10. Cluster Analysis of Maize Inbred Lines

    Directory of Open Access Journals (Sweden)

    Jiban Shrestha

    2016-12-01

    Full Text Available The determination of diversity among inbred lines is important for heterosis breeding. Sixty maize inbred lines were evaluated for eight agro-morphological traits during the winter season of 2011 to analyze their genetic diversity. Clustering was done by the average linkage method. The inbred lines were grouped into six clusters. Inbred lines grouped into cluster II had taller plants with the maximum number of leaves. Cluster III was characterized by shorter plants with the minimum number of leaves. The inbred lines categorized into cluster V had early flowering, whereas those in cluster VI had late flowering. The inbred lines grouped into cluster III were characterized by a higher value of the anthesis-silking interval (ASI) and those of cluster VI had a lower value of ASI. These results showed that inbred lines from widely divergent clusters can be utilized in a hybrid breeding programme.
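
    The average-linkage clustering described here can be reproduced in outline as follows; this is a sketch rather than the authors' code, and the trait file and column names are hypothetical.

```python
# Sketch: average-linkage (UPGMA) clustering of inbred lines on standardized traits.
import pandas as pd
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist
from sklearn.preprocessing import StandardScaler

traits = pd.read_csv("maize_traits.csv", index_col="line")   # hypothetical file
X = StandardScaler().fit_transform(traits)

Z = linkage(pdist(X, metric="euclidean"), method="average")  # average linkage
groups = fcluster(Z, t=6, criterion="maxclust")              # cut into six clusters
print(pd.Series(groups, index=traits.index).sort_values())
```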

  11. [Principal component analysis and cluster analysis of inorganic elements in sea cucumber Apostichopus japonicus].

    Science.gov (United States)

    Liu, Xiao-Fang; Xue, Chang-Hu; Wang, Yu-Ming; Li, Zhao-Jie; Xue, Yong; Xu, Jie

    2011-11-01

    The present study investigates the feasibility of multi-element analysis for determining the geographical origin of the sea cucumber Apostichopus japonicus, and selects effective tracers for assessing its geographical origin. The contents of the elements Al, V, Cr, Mn, Fe, Co, Ni, Cu, Zn, As, Se, Mo, Cd, Hg and Pb in sea cucumber Apostichopus japonicus samples from seven places of geographical origin were determined by means of ICP-MS. The results were used for the development of an element database. Cluster analysis (CA) and principal component analysis (PCA) were applied to differentiate the geographical origins. Three principal components, which accounted for over 89% of the total variance, were extracted from the standardized data. The results of Q-type cluster analysis showed that the 26 samples could be clustered reasonably into five groups; the classification results were significantly associated with the marine distribution of the samples. CA and PCA were effective methods for element analysis of sea cucumber Apostichopus japonicus samples. The contents of the mineral elements in the samples were good chemical descriptors for differentiating their geographical origins.
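
    A sketch of the PCA-plus-Q-type-clustering workflow used in this record, assuming a hypothetical table of element concentrations per sample; the component and group counts follow the abstract.

```python
# Sketch: standardize element contents, extract principal components, then cluster samples.
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from scipy.cluster.hierarchy import linkage, fcluster

elements = pd.read_csv("sea_cucumber_elements.csv", index_col="sample")  # hypothetical file
Z = StandardScaler().fit_transform(elements)

pca = PCA(n_components=3).fit(Z)               # three PCs covered ~89% of variance in the study
scores = pca.transform(Z)
print(pca.explained_variance_ratio_.sum())

tree = linkage(scores, method="ward")          # Q-type (sample-wise) clustering on PC scores
groups = fcluster(tree, t=5, criterion="maxclust")
print(pd.Series(groups, index=elements.index))
```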

  12. Statistical Techniques Applied to Aerial Radiometric Surveys (STAARS): cluster analysis. National Uranium Resource Evaluation

    International Nuclear Information System (INIS)

    Pirkle, F.L.; Stablein, N.K.; Howell, J.A.; Wecksung, G.W.; Duran, B.S.

    1982-11-01

    One objective of the aerial radiometric surveys flown as part of the US Department of Energy's National Uranium Resource Evaluation (NURE) program was to ascertain the regional distribution of near-surface radioelement abundances. Some method for identifying groups of observations with similar radioelement values was therefore required. It is shown in this report that cluster analysis can identify such groups even when no a priori knowledge of the geology of an area exists. A method of convergent k-means cluster analysis coupled with a hierarchical cluster analysis is used to classify 6991 observations (three radiometric variables at each observation location) from the Precambrian rocks of the Copper Mountain, Wyoming, area. Another method, one that combines a principal components analysis with a convergent k-means analysis, is applied to the same data. These two methods are compared with a convergent k-means analysis that utilizes available geologic knowledge. All three methods identify four clusters. Three of the clusters represent background values for the Precambrian rocks of the area, and one represents outliers (anomalously high 214Bi). A segmentation of the data corresponding to geologic reality as discovered by other methods has been achieved based solely on analysis of aerial radiometric data. The techniques employed are composites of classical clustering methods designed to handle the special problems presented by large data sets. 20 figures, 7 tables
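
    One way to sketch a composite scheme in this spirit (hierarchical clustering of a subsample to seed a convergent k-means on the full data) is shown below; this is an illustrative reconstruction with synthetic data, not the STAARS implementation.

```python
# Sketch: hierarchical clustering of a subsample seeds k-means on the full dataset.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = rng.normal(size=(6991, 3))                  # placeholder for three radiometric variables

# Hierarchical clustering of a subsample supplies initial centroids.
sample = X[rng.choice(len(X), size=500, replace=False)]
tree = linkage(sample, method="ward")
seed_labels = fcluster(tree, t=4, criterion="maxclust")
seeds = np.vstack([sample[seed_labels == g].mean(axis=0) for g in range(1, 5)])

# Convergent k-means refinement on the full dataset, started from those seeds.
labels = KMeans(n_clusters=4, init=seeds, n_init=1).fit_predict(X)
print(np.bincount(labels))
```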

  13. SVIP-N 1.0: An integrated visualization platform for neutronics analysis

    International Nuclear Information System (INIS)

    Luo Yuetong; Long Pengcheng; Wu Guoyong; Zeng Qin; Hu Liqin; Zou Jun

    2010-01-01

    Post-processing is an important part of neutronics analysis, and SVIP-N 1.0 (scientific visualization integrated platform for neutronics analysis) is designed to ease the post-processing of neutronics analysis through visualization technologies. The main capabilities of SVIP-N 1.0 include: (1) the ability to manage neutronics analysis results; (2) the ability to preprocess neutronics analysis results; (3) the ability to visualize neutronics analysis result data in different ways. The paper describes the system architecture and main features of SVIP-N, some advanced visualization techniques used in SVIP-N 1.0, and some preliminary applications, such as ITER.

  14. Horizontally scaling dCache SRM with the Terracotta platform

    International Nuclear Information System (INIS)

    Perelmutov, T; Crawford, M; Moibenko, A; Oleynik, G

    2011-01-01

    The dCache disk caching file system has been chosen by a majority of LHC experiments' Tier 1 centers for their data storage needs. It is also deployed at many Tier 2 centers. The Storage Resource Manager (SRM) is a standardized grid storage interface and a single point of remote entry into dCache, and hence is a critical component. SRM must scale to increasing transaction rates and remain resilient against changing usage patterns. The initial implementation of the SRM service in dCache suffered from an inability to support clustered deployment, and its performance was limited by the hardware of a single node. Using the Terracotta platform [1], we added the ability to horizontally scale the dCache SRM service to run on multiple nodes in a cluster configuration, coupled with network load balancing. This gives site administrators the ability to increase the performance and reliability of the SRM service to face the ever-increasing requirements of LHC data handling. In this paper we will describe the previous limitations of the SRM server architecture and how the Terracotta platform allowed us to readily convert a single node service into a highly scalable clustered application.

  15. Horizontally scaling dCache SRM with the Terracotta platform

    International Nuclear Information System (INIS)

    Perelmutov, T.; Crawford, M.; Moibenko, A.; Oleynik, G.

    2011-01-01

    The dCache disk caching file system has been chosen by a majority of LHC experiments' Tier 1 centers for their data storage needs. It is also deployed at many Tier 2 centers. The Storage Resource Manager (SRM) is a standardized grid storage interface and a single point of remote entry into dCache, and hence is a critical component. SRM must scale to increasing transaction rates and remain resilient against changing usage patterns. The initial implementation of the SRM service in dCache suffered from an inability to support clustered deployment, and its performance was limited by the hardware of a single node. Using the Terracotta platform, we added the ability to horizontally scale the dCache SRM service to run on multiple nodes in a cluster configuration, coupled with network load balancing. This gives site administrators the ability to increase the performance and reliability of the SRM service to face the ever-increasing requirements of LHC data handling. In this paper we will describe the previous limitations of the SRM server architecture and how the Terracotta platform allowed us to readily convert a single node service into a highly scalable clustered application.

  16. A multi-platform evaluation of the randomized CX low-rank matrix factorization in Spark

    Energy Technology Data Exchange (ETDEWEB)

    Gittens, Alex; Kottalam, Jey; Yang, Jiyan; Ringenburg, Michael, F.; Chhugani, Jatin; Racah, Evan; Singh, Mohitdeep; Yao, Yushu; Fischer, Curt; Ruebel, Oliver; Bowen, Benjamin; Lewis, Norman, G.; Mahoney, Michael, W.; Krishnamurthy, Venkat; Prabhat, Mr

    2017-07-27

    We investigate the performance and scalability of the randomized CX low-rank matrix factorization and demonstrate its applicability through the analysis of a 1TB mass spectrometry imaging (MSI) dataset, using Apache Spark on an Amazon EC2 cluster, a Cray XC40 system, and an experimental Cray cluster. We implemented this factorization both as a parallelized C implementation with hand-tuned optimizations and in Scala using the Apache Spark high-level cluster computing framework. We obtained consistent performance across the three platforms: using Spark we were able to process the 1TB size dataset in under 30 minutes with 960 cores on all systems, with the fastest times obtained on the experimental Cray cluster. In comparison, the C implementation was 21X faster on the Amazon EC2 system, due to careful cache optimizations, bandwidth-friendly access of matrices and vector computation using SIMD units. We report these results and their implications on the hardware and software issues arising in supporting data-centric workloads in parallel and distributed environments.
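
    The randomized CX factorization evaluated in this record can be sketched in a few lines of NumPy/scikit-learn (this is an illustrative single-node version, not the authors' Spark or C code): approximate the top-k right singular vectors, compute column leverage scores, and sample columns with probability proportional to those scores.

```python
# Sketch: randomized CX low-rank factorization via leverage-score column sampling.
import numpy as np
from sklearn.utils.extmath import randomized_svd

def randomized_cx(A, k=10, c=20, seed=0):
    """Return C (sampled columns), X with A ~= C @ X, and the column indices."""
    rng = np.random.default_rng(seed)
    _, _, Vt = randomized_svd(A, n_components=k, random_state=seed)
    lev = (Vt ** 2).sum(axis=0)                 # column leverage scores
    p = lev / lev.sum()
    cols = rng.choice(A.shape[1], size=c, replace=False, p=p)
    C = A[:, cols]
    X, *_ = np.linalg.lstsq(C, A, rcond=None)   # least-squares fit A ~= C X
    return C, X, cols

A = np.random.default_rng(1).normal(size=(500, 200))
C, X, cols = randomized_cx(A, k=10, c=20)
print(np.linalg.norm(A - C @ X) / np.linalg.norm(A))   # relative reconstruction error
```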

  17. Gene ARMADA: an integrated multi-analysis platform for microarray data implemented in MATLAB.

    Science.gov (United States)

    Chatziioannou, Aristotelis; Moulos, Panagiotis; Kolisis, Fragiskos N

    2009-10-27

    The microarray data analysis realm is ever growing through the development of various tools, open source and commercial. However, there is an absence of predefined rational algorithmic analysis workflows or standardized batch processing to incorporate all steps, from raw data import up to the derivation of lists of significantly differentially expressed genes. This absence obfuscates the analytical procedure and obstructs the massive comparative processing of genomic microarray datasets. Moreover, the solutions provided heavily depend on the programming skills of the user, whereas GUI-embedded solutions do not provide direct support for various raw image analysis formats or a versatile yet flexible combination of signal processing methods. We describe here Gene ARMADA (Automated Robust MicroArray Data Analysis), a MATLAB-implemented platform with a Graphical User Interface. This suite integrates all steps of microarray data analysis including automated data import, noise correction and filtering, normalization, statistical selection of differentially expressed genes, clustering, classification and annotation. In its current version, Gene ARMADA fully supports two-colour cDNA and Affymetrix oligonucleotide arrays, plus custom arrays for which experimental details are given in tabular form (Excel spreadsheet, comma separated values, tab-delimited text formats). It also supports the analysis of already processed results through its versatile import editor. Besides being fully automated, Gene ARMADA incorporates numerous functionalities of the Statistics and Bioinformatics Toolboxes of MATLAB. In addition, it provides numerous visualization and exploration tools plus customizable export data formats for seamless integration by other analysis tools or MATLAB, for further processing. Gene ARMADA requires MATLAB 7.4 (R2007a) or higher and is also distributed as a stand-alone application with MATLAB Component Runtime. Gene ARMADA provides a

  18. Performance Evaluation of Hadoop-based Large-scale Network Traffic Analysis Cluster

    Directory of Open Access Journals (Sweden)

    Tao Ran

    2016-01-01

    Full Text Available As Hadoop has gained popularity in the big data era, it is widely used in various fields. The self-designed and self-developed large-scale network traffic analysis cluster works well on Hadoop, with off-line applications running on it to analyze massive network traffic data. For the purpose of scientifically and reasonably evaluating the performance of the analysis cluster, we propose a performance evaluation system. Firstly, we take the execution times of three benchmark applications as the performance benchmark and pick 40 metrics of customized statistical resource data. Then we identify the relationship between the resource data and the execution times by a statistical modeling approach composed of principal component analysis and multiple linear regression. After training the models with historical data, we can predict the execution times from current resource data. Finally, we evaluate the performance of the analysis cluster by the validated prediction of execution times. Experimental results show that the execution times predicted by the trained models are within an acceptable error range, and the performance evaluation results are accurate and reliable.
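
    The modeling step described here (principal component analysis followed by multiple linear regression to predict execution times from resource metrics) can be sketched as follows; the resource matrix and timings below are synthetic stand-ins, not the paper's data.

```python
# Sketch: PCA + multiple linear regression to predict benchmark execution time.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
resources = rng.normal(size=(120, 40))                 # 120 historical runs x 40 metrics
exec_time = resources[:, :3] @ [5.0, 2.0, 1.0] + rng.normal(scale=0.5, size=120)

model = make_pipeline(StandardScaler(), PCA(n_components=5), LinearRegression())
print(cross_val_score(model, resources, exec_time, scoring="r2", cv=5))
model.fit(resources, exec_time)                        # ready to predict from current metrics
```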

  19. Development and optimization of SPECT gated blood pool cluster analysis for the prediction of CRT outcome

    Energy Technology Data Exchange (ETDEWEB)

    Lalonde, Michel, E-mail: mlalonde15@rogers.com; Wassenaar, Richard [Department of Physics, Carleton University, Ottawa, Ontario K1S 5B6 (Canada); Wells, R. Glenn; Birnie, David; Ruddy, Terrence D. [Division of Cardiology, University of Ottawa Heart Institute, Ottawa, Ontario K1Y 4W7 (Canada)

    2014-07-15

    Purpose: Phase analysis of single photon emission computed tomography (SPECT) radionuclide angiography (RNA) has been investigated for its potential to predict the outcome of cardiac resynchronization therapy (CRT). However, phase analysis may be limited in its potential for predicting CRT outcome as valuable information may be lost by assuming that time-activity curves (TAC) follow a simple sinusoidal shape. A new method, cluster analysis, is proposed which directly evaluates the TACs and may lead to a better understanding of dyssynchrony patterns and CRT outcome. Cluster analysis algorithms were developed and optimized to maximize their ability to predict CRT response. Methods: Forty-nine patients (N = 27 ischemic etiology) received a SPECT RNA scan as well as positron emission tomography (PET) perfusion and viability scans prior to undergoing CRT. A semiautomated algorithm sampled the left ventricle wall to produce 568 TACs from SPECT RNA data. The TACs were then subjected to two different cluster analysis techniques, K-means and normal average, where several input metrics were also varied to determine the optimal settings for the prediction of CRT outcome. Each TAC was assigned to a cluster group based on the comparison criteria, and global and segmental cluster size and scores were used as measures of dyssynchrony and used to predict response to CRT. A repeated random twofold cross-validation technique was used to train and validate the cluster algorithm. Receiver operating characteristic (ROC) analysis was used to calculate the area under the curve (AUC) and compare results to those obtained for SPECT RNA phase analysis and PET scar size analysis methods. Results: Using the normal average cluster analysis approach, the septal wall produced statistically significant results for predicting CRT results in the ischemic population (ROC AUC = 0.73; p < 0.05 vs. equal chance ROC AUC = 0.50) with an optimal operating point of 71% sensitivity and 60% specificity. Cluster

  20. Development and optimization of SPECT gated blood pool cluster analysis for the prediction of CRT outcome

    International Nuclear Information System (INIS)

    Lalonde, Michel; Wassenaar, Richard; Wells, R. Glenn; Birnie, David; Ruddy, Terrence D.

    2014-01-01

    Purpose: Phase analysis of single photon emission computed tomography (SPECT) radionuclide angiography (RNA) has been investigated for its potential to predict the outcome of cardiac resynchronization therapy (CRT). However, phase analysis may be limited in its potential for predicting CRT outcome as valuable information may be lost by assuming that time-activity curves (TAC) follow a simple sinusoidal shape. A new method, cluster analysis, is proposed which directly evaluates the TACs and may lead to a better understanding of dyssynchrony patterns and CRT outcome. Cluster analysis algorithms were developed and optimized to maximize their ability to predict CRT response. Methods: Forty-nine patients (N = 27 ischemic etiology) received a SPECT RNA scan as well as positron emission tomography (PET) perfusion and viability scans prior to undergoing CRT. A semiautomated algorithm sampled the left ventricle wall to produce 568 TACs from SPECT RNA data. The TACs were then subjected to two different cluster analysis techniques, K-means and normal average, where several input metrics were also varied to determine the optimal settings for the prediction of CRT outcome. Each TAC was assigned to a cluster group based on the comparison criteria, and global and segmental cluster size and scores were used as measures of dyssynchrony and used to predict response to CRT. A repeated random twofold cross-validation technique was used to train and validate the cluster algorithm. Receiver operating characteristic (ROC) analysis was used to calculate the area under the curve (AUC) and compare results to those obtained for SPECT RNA phase analysis and PET scar size analysis methods. Results: Using the normal average cluster analysis approach, the septal wall produced statistically significant results for predicting CRT results in the ischemic population (ROC AUC = 0.73; p < 0.05 vs. equal chance ROC AUC = 0.50) with an optimal operating point of 71% sensitivity and 60% specificity. Cluster

  1. Application and Mechanics Analysis of Multi-Function Construction Platforms in Prefabricated-Concrete Construction

    Science.gov (United States)

    Wang, Meihua; Li, Rongshuai; Zhang, Wenze

    2017-11-01

    Multi-function construction platforms (MCPs) are an “old construction technology, new application” type of building facade construction equipment; their effects in reducing labour intensity, improving labour productivity, ensuring construction safety and shortening the duration of construction are significant. In this study, a functional analysis of the multi-function construction platforms is carried out for the construction of prefabricated (assembled) buildings. Based on the general finite element software ANSYS, the static calculation and dynamic characteristics analysis of the MCP structure are performed, the simplified finite element model is constructed, and the selection of elements and the treatment and solution of boundary conditions are discussed. The maximum deformation value, the maximum stress value and the structural dynamic characteristic model are obtained. The dangerous parts of the platform structure are also analysed. Multiple types of MCPs under engineering construction conditions are calculated in order to put forward rationalization suggestions for the engineering application of MCPs.

  2. Integrating Data Clustering and Visualization for the Analysis of 3D Gene Expression Data

    Energy Technology Data Exchange (ETDEWEB)

    Rubel, Oliver; Weber, Gunther H.; Huang, Min-Yu; Bethel, E. Wes; Biggin, Mark D.; Fowlkes, Charless C.; Hendriks, Cris L. Luengo; Keranen, Soile V. E.; Eisen, Michael B.; Knowles, David W.; Malik, Jitendra; Hagen, Hans; Hamann, Bernd

    2008-05-12

    The recent development of methods for extracting precise measurements of spatial gene expression patterns from three-dimensional (3D) image data opens the way for new analyses of the complex gene regulatory networks controlling animal development. We present an integrated visualization and analysis framework that supports user-guided data clustering to aid exploration of these new complex datasets. The interplay of data visualization and clustering-based data classification leads to improved visualization and enables a more detailed analysis than previously possible. We discuss (i) integration of data clustering and visualization into one framework; (ii) application of data clustering to 3D gene expression data; (iii) evaluation of the number of clusters k in the context of 3D gene expression clustering; and (iv) improvement of overall analysis quality via dedicated post-processing of clustering results based on visualization. We discuss the use of this framework to objectively define spatial pattern boundaries and temporal profiles of genes and to analyze how mRNA patterns are controlled by their regulatory transcription factors.
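
    Point (iii) above, evaluating the number of clusters k, is often approached with internal validity measures; the sketch below (not the framework's code) compares average silhouette scores for K-means over a range of k on synthetic stand-in data.

```python
# Sketch: choose k by comparing silhouette scores of K-means solutions.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
# Synthetic stand-in for a cells-by-genes expression matrix with three groups.
expression = np.vstack([rng.normal(loc=m, size=(100, 20)) for m in (0.0, 2.0, 4.0)])

for k in range(2, 8):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(expression)
    print(k, round(silhouette_score(expression, labels), 3))
```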

  3. Analysis of RXTE data on Clusters of Galaxies

    Science.gov (United States)

    Petrosian, Vahe

    2004-01-01

    This grant provided support for the reduction, analysis and interpretation of hard X-ray (HXR, for short) observations of the cluster of galaxies RXJ0658-5557 scheduled for the week of August 23, 2002 under the RXTE Cycle 7 program (PI Vahe Petrosian, Obs. ID 70165). The goal of the observation was to search for and characterize the shape of the HXR component beyond the well established thermal soft X-ray (SXR) component. Such hard components have been detected in several nearby clusters. The planned observation of a relatively distant cluster would provide information on the characteristics of this radiation at a different epoch in the evolution of the universe and shed light on its origin. We (Petrosian, 2001) have argued that thermal bremsstrahlung, as proposed earlier, cannot be the mechanism for the production of the HXRs and that the most likely mechanism is Compton upscattering of the cosmic microwave radiation by relativistic electrons, which are known to be present in the clusters and to be responsible for the observed radio emission. Based on this picture we estimated that this cluster, in spite of its relatively large distance, would have an HXR signal comparable to the other nearby ones. The proposed RXTE observations were carried out and the data have been analyzed. We detect a hard X-ray tail in the spectrum of this cluster with a flux very nearly equal to our predicted value. This has strengthened the case for the Compton scattering model. We intend the data obtained via this observation to be a part of a larger data set. We have identified other clusters of galaxies (in archival RXTE and other instrument data sets) with sufficiently high quality data where we can search for and measure (or at least put meaningful limits on) the strength of the hard component. With these studies we expect to clarify the mechanism for acceleration of particles in the intercluster medium and provide guidance for future observations of this intriguing phenomenon by instrument

  4. Profitability and efficiency of Italian utilities: cluster analysis of financial statement ratios

    International Nuclear Information System (INIS)

    Linares, E.

    2008-01-01

    The last ten years have witnessed conspicuous changes in European and Italian regulation of public utility services and in the strategies of the major players in these fields. In response to these changes Italian utilities have made a variety of choices regarding size, presence in more or less capital-intensive stages of different value chains, and diversification. These choices have been implemented both through internal growth and by means of mergers and acquisitions. In this context it is interesting to try to establish whether there is a nexus between these choices and the performance of Italian utilities in terms of profitability and efficiency. Therefore statistical multivariate analysis techniques (cluster analysis and factor analysis) have been applied to several ratios obtained from the 2005 financial statement of 34 utilities. First, a hierarchical cluster analysis method has been applied to financial statement data in order to identify homogeneous groups based on several indicators of the incidence of costs (external costs, personnel costs, depreciation and amortization), profitability (return on sales, return on assets, return on equity) and efficiency (in the utilization of personnel, of total assets, of property, plant and equipment). Five clusters have been found. Then the clusters have been characterized in terms of the aforementioned indicators, the presence in different stages of the energy value chains (electricity and gas) and other descriptive variables (such as turnover, number of employees, assets, percentage of property, plant and equipment on total assets, sales revenues from electricity, gas, water supply and sanitation, waste collection and treatment and other services). In a second round cluster analysis has been preceded by factor analysis, in order to find a smaller set of variables. This procedure has revealed three not directly observable factors that can be interpreted as follows: i) efficiency in ordinary and financial management

  5. The JASMIN Analysis Platform - bridging the gap between traditional climate data practices and data-centric analysis paradigms

    Science.gov (United States)

    Pascoe, Stephen; Iwi, Alan; kershaw, philip; Stephens, Ag; Lawrence, Bryan

    2014-05-01

    The advent of large-scale data and the consequential analysis problems have led to two new challenges for the research community: how to share such data to get the maximum value and how to carry out efficient analysis. Solving both challenges requires a form of parallelisation: the first is social parallelisation (involving trust and information sharing), the second data parallelisation (involving new algorithms and tools). The JASMIN infrastructure supports both kinds of parallelism by providing a multi-tenant environment with petabyte-scale storage, VM provisioning and batch cluster facilities. The JASMIN Analysis Platform (JAP) is an analysis software layer for JASMIN which emphasises ease of transition from a researcher's local environment to JASMIN. JAP brings together tools traditionally used by multiple communities and configures them to work together, enabling users to move analysis from their local environment to JASMIN without rewriting code. JAP also provides facilities to exploit JASMIN's parallel capabilities whilst maintaining the researcher's familiar analysis environment wherever possible. Modern open-source analysis tools typically have multiple dependent packages, increasing the installation burden on system administrators. When you consider a suite of tools, often with both common and conflicting dependencies, analysis pipelines can become locked to a particular installation simply because of the effort required to reconstruct the dependency tree. JAP addresses this problem by providing a consistent suite of RPMs compatible with RedHat Enterprise Linux and CentOS 6.4. Researchers can install JAP locally, either as RPMs or through a pre-built VM image, giving them the confidence that moving analysis to JASMIN will not disrupt their environment. Analysis parallelisation is in its infancy in the climate sciences, with few tools capable of exploiting any parallel environment beyond manual scripting of the use of multiple processors. JAP begins to bridge this

  6. A formal concept analysis approach to consensus clustering of multi-experiment expression data

    Science.gov (United States)

    2014-01-01

    Background Presently, with the increasing number and complexity of available gene expression datasets, the combination of data from multiple microarray studies addressing a similar biological question is gaining importance. The analysis and integration of multiple datasets are expected to yield more reliable and robust results since they are based on a larger number of samples and the effects of the individual study-specific biases are diminished. This is supported by recent studies suggesting that important biological signals are often preserved or enhanced by multiple experiments. An approach to combining data from different experiments is the aggregation of their clusterings into a consensus or representative clustering solution which increases the confidence in the common features of all the datasets and reveals the important differences among them. Results We propose a novel generic consensus clustering technique that applies Formal Concept Analysis (FCA) approach for the consolidation and analysis of clustering solutions derived from several microarray datasets. These datasets are initially divided into groups of related experiments with respect to a predefined criterion. Subsequently, a consensus clustering algorithm is applied to each group resulting in a clustering solution per group. These solutions are pooled together and further analysed by employing FCA which allows extracting valuable insights from the data and generating a gene partition over all the experiments. In order to validate the FCA-enhanced approach two consensus clustering algorithms are adapted to incorporate the FCA analysis. Their performance is evaluated on gene expression data from multi-experiment study examining the global cell-cycle control of fission yeast. The FCA results derived from both methods demonstrate that, although both algorithms optimize different clustering characteristics, FCA is able to overcome and diminish these differences and preserve some relevant biological

  7. Interactive K-Means Clustering Method Based on User Behavior for Different Analysis Target in Medicine.

    Science.gov (United States)

    Lei, Yang; Yu, Dai; Bin, Zhang; Yang, Yang

    2017-01-01

    Clustering algorithms, as a basis of data analysis, are widely used in analysis systems. However, for high-dimensional data, a clustering algorithm may overlook the business relations between dimensions, especially in the medical field. As a result, the clustering result may not meet the business goals of the users. If the clustering process can incorporate the knowledge of the users, that is, the doctor's knowledge or the analysis intent, the clustering result can be more satisfactory. In this paper, we propose an interactive K-means clustering method to improve the user's satisfaction with the result. The core of this method is to use the user's feedback on the clustering result to optimize it. A particle swarm optimization algorithm is used to optimize the parameters, especially the weight settings in the clustering algorithm, so that it reflects the user's business preference as closely as possible. After that, based on the parameter optimization and adjustment, the clustering result can be closer to the user's requirement. Finally, we take an example in breast cancer to test our method. The experiments show the improved performance of our algorithm.
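
    The weighting idea at the core of this method can be sketched as follows: scaling each standardized feature by a weight before K-means amounts to a weighted Euclidean distance, so an external optimizer (such as the particle swarm step mentioned above) can tune the weights from user feedback. The data and weights below are synthetic placeholders.

```python
# Sketch: feature-weighted K-means via pre-scaling of standardized features.
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

def weighted_kmeans(X, weights, k):
    # Multiplying standardized features by weights = weighted Euclidean distance in K-means.
    Xw = StandardScaler().fit_transform(X) * np.asarray(weights)
    return KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(Xw)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))                  # four clinical dimensions (synthetic)
weights = [2.0, 1.0, 0.5, 0.5]                 # supplied or tuned from user feedback
print(np.bincount(weighted_kmeans(X, weights, k=3)))
```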

  8. E-learning platform for automated testing of electronic circuits using signature analysis method

    Science.gov (United States)

    Gherghina, Cǎtǎlina; Bacivarov, Angelica; Bacivarov, Ioan C.; Petricǎ, Gabriel

    2016-12-01

    Dependability of electronic circuits can be ensured only through testing of circuit modules. This is done by generating test vectors and applying them to the circuit. Testability should be viewed as a concerted effort to ensure maximum efficiency throughout the product life cycle, from the conception and design stage, through production, to repairs during product operation. This paper presents the platform developed by the authors for training in testability in electronics in general, and in the use of the signature analysis method in particular. The platform allows highlighting the two approaches in the field, namely analog and digital signatures of circuits. As part of this e-learning platform, a database of signatures of different electronic components has been developed, meant to highlight different fault detection techniques and, building on these, self-repairing techniques for systems containing such components. An approach for realizing self-testing circuits based on the MATLAB environment and using the signature analysis method is proposed. The paper also analyses the benefits of the signature analysis method and simulates signature analyzer performance based on the use of pseudo-random sequences.
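
    The principle behind the signature analysis method discussed here can be sketched with a serial-input linear feedback shift register that compresses a response bit stream into a short signature; the polynomial taps and bit streams below are illustrative only, not those used on the platform.

```python
# Sketch: serial-input signature register (LFSR) compressing a response bit stream.
def lfsr_signature(bits, taps=(16, 14, 13, 11), width=16):
    """Compress a bit stream into a 'width'-bit signature."""
    reg = 0
    for b in bits:
        feedback = b
        for t in taps:                          # XOR the tapped register bits into the input
            feedback ^= (reg >> (t - 1)) & 1
        reg = ((reg << 1) | feedback) & ((1 << width) - 1)
    return reg

good = [1, 0, 1, 1, 0, 0, 1, 0] * 8             # reference (fault-free) response stream
faulty = good.copy()
faulty[5] ^= 1                                   # a single faulty bit

print(hex(lfsr_signature(good)), hex(lfsr_signature(faulty)))  # signatures differ
```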

  9. CAMPAIGN: an open-source library of GPU-accelerated data clustering algorithms.

    Science.gov (United States)

    Kohlhoff, Kai J; Sosnick, Marc H; Hsu, William T; Pande, Vijay S; Altman, Russ B

    2011-08-15

    Data clustering techniques are an essential component of a good data analysis toolbox. Many current bioinformatics applications are inherently compute-intense and work with very large datasets. Sequential algorithms are inadequate for providing the necessary performance. For this reason, we have created Clustering Algorithms for Massively Parallel Architectures, Including GPU Nodes (CAMPAIGN), a central resource for data clustering algorithms and tools that are implemented specifically for execution on massively parallel processing architectures. CAMPAIGN is a library of data clustering algorithms and tools, written in 'C for CUDA' for Nvidia GPUs. The library provides up to two orders of magnitude speed-up over respective CPU-based clustering algorithms and is intended as an open-source resource. New modules from the community will be accepted into the library and the layout of it is such that it can easily be extended to promising future platforms such as OpenCL. Releases of the CAMPAIGN library are freely available for download under the LGPL from https://simtk.org/home/campaign. Source code can also be obtained through anonymous subversion access as described on https://simtk.org/scm/?group_id=453. kjk33@cantab.net.

  10. Improving estimation of kinetic parameters in dynamic force spectroscopy using cluster analysis

    Science.gov (United States)

    Yen, Chi-Fu; Sivasankar, Sanjeevi

    2018-03-01

    Dynamic Force Spectroscopy (DFS) is a widely used technique to characterize the dissociation kinetics and interaction energy landscape of receptor-ligand complexes with single-molecule resolution. In an Atomic Force Microscope (AFM)-based DFS experiment, receptor-ligand complexes, sandwiched between an AFM tip and substrate, are ruptured at different stress rates by varying the speed at which the AFM-tip and substrate are pulled away from each other. The rupture events are grouped according to their pulling speeds, and the mean force and loading rate of each group are calculated. These data are subsequently fit to established models, and energy landscape parameters such as the intrinsic off-rate (koff) and the width of the potential energy barrier (xβ) are extracted. However, due to large uncertainties in determining mean forces and loading rates of the groups, errors in the estimated koff and xβ can be substantial. Here, we demonstrate that the accuracy of fitted parameters in a DFS experiment can be dramatically improved by sorting rupture events into groups using cluster analysis instead of sorting them according to their pulling speeds. We test different clustering algorithms including Gaussian mixture, logistic regression, and K-means clustering, under conditions that closely mimic DFS experiments. Using Monte Carlo simulations, we benchmark the performance of these clustering algorithms over a wide range of koff and xβ, under different levels of thermal noise, and as a function of both the number of unbinding events and the number of pulling speeds. Our results demonstrate that cluster analysis, particularly K-means clustering, is very effective in improving the accuracy of parameter estimation, particularly when the number of unbinding events are limited and not well separated into distinct groups. Cluster analysis is easy to implement, and our performance benchmarks serve as a guide in choosing an appropriate method for DFS data analysis.
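
    As an illustration of the clustering idea in this record, the sketch below groups synthetic rupture events with a Gaussian mixture model rather than by pulling speed and then reports the mean force and loading rate of each group; it is not the authors' benchmark code.

```python
# Sketch: Gaussian mixture clustering of rupture events in (loading rate, force) space.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Columns: log10(loading rate, pN/s) and rupture force (pN); two synthetic regimes.
events = np.vstack([
    rng.normal([2.0, 40.0], [0.2, 8.0], size=(300, 2)),
    rng.normal([3.5, 60.0], [0.2, 8.0], size=(300, 2)),
])

gmm = GaussianMixture(n_components=2, random_state=0).fit(events)
labels = gmm.predict(events)
for g in range(2):
    lr, f = events[labels == g].mean(axis=0)
    print(f"group {g}: mean log10(loading rate) = {lr:.2f}, mean force = {f:.1f} pN")
```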

  11. Fatigue Feature Extraction Analysis based on a K-Means Clustering Approach

    Directory of Open Access Journals (Sweden)

    M.F.M. Yunoh

    2015-06-01

    Full Text Available This paper focuses on clustering analysis using a K-means approach for fatigue feature dataset extraction. The aim of this study is to group the dataset as closely as possible (homogeneity) for the scattered dataset. Kurtosis, the wavelet-based energy coefficient and fatigue damage are calculated for all segments after the extraction process using the wavelet transform, and are then used as input data for the K-means clustering approach. K-means clustering calculates the average distance of each group from its centroid and gives the objective function values. Based on the results, the maximum objective function value, 11.58, is obtained with two centroid clusters, and the minimum value, 8.06, with five centroid clusters. The lowest objective function value is thus obtained for five clusters, which is therefore the best clustering for the dataset.

  12. The dynamics of cyclone clustering in re-analysis and a high-resolution climate model

    Science.gov (United States)

    Priestley, Matthew; Pinto, Joaquim; Dacre, Helen; Shaffrey, Len

    2017-04-01

    Extratropical cyclones have a tendency to occur in groups (clusters) in the exit of the North Atlantic storm track during wintertime, potentially leading to widespread socioeconomic impacts. The Winter of 2013/14 was the stormiest on record for the UK and was characterised by the recurrent clustering of intense extratropical cyclones. This clustering was associated with a strong, straight and persistent North Atlantic 250 hPa jet with Rossby wave-breaking (RWB) on both flanks, pinning the jet in place. Here, we provide for the first time an analysis of all clustered events in 36 years of the ERA-Interim Re-analysis at three latitudes (45˚ N, 55˚ N, 65˚ N) encompassing various regions of Western Europe. The relationship between the occurrence of RWB and cyclone clustering is studied in detail. Clustering at 55˚ N is associated with an extended and anomalously strong jet flanked on both sides by RWB. However, clustering at 65(45)˚ N is associated with RWB to the south (north) of the jet, deflecting the jet northwards (southwards). A positive correlation was found between the intensity of the clustering and RWB occurrence to the north and south of the jet. However, there is considerable spread in these relationships. Finally, analysis has shown that the relationships identified in the re-analysis are also present in a high-resolution coupled global climate model (HiGEM). In particular, clustering is associated with the same dynamical conditions at each of our three latitudes in spite of the identified biases in frequency and intensity of RWB.

  13. Product competitiveness analysis for e-commerce platform of special agricultural products

    Science.gov (United States)

    Wan, Fucheng; Ma, Ning; Yang, Dongwei; Xiong, Zhangyuan

    2017-09-01

    On the basis of analyzing the influence factors of product competitiveness on an e-commerce platform for special agricultural products and the characteristics of the analytical methods for their competitiveness, the price, sales volume, postage-included service, store reputation, popularity, etc. were selected in this paper as the dimensions for analyzing the competitiveness of the agricultural products, and principal component factor analysis was taken as the competitiveness analysis method. Specifically, a web crawler was adopted to capture the information on various special agricultural products from the e-commerce platform chi.taobao.com. The original data captured thereby were preprocessed, and a MySQL database was adopted to establish the information library for the special agricultural products. Then, the principal component factor analysis method was adopted to establish the analysis model for the competitiveness of the special agricultural products, and SPSS was used in the principal component factor analysis process to obtain the competitiveness evaluation factor system (support degree factor, price factor, service factor and evaluation factor) of the special agricultural products. Finally, the linear regression method was adopted to establish the competitiveness index equation of the special agricultural products for estimating their competitiveness.

  14. Cluster analysis of HZE particle tracks as applied to space radiobiology problems

    International Nuclear Information System (INIS)

    Batmunkh, M.; Bayarchimeg, L.; Lkhagva, O.; Belov, O.

    2013-01-01

    A cluster analysis is performed of the ionizations in tracks produced by the most abundant nuclei in the charge and energy spectra of the galactic cosmic rays. The frequency distribution of clusters is estimated for cluster sizes comparable to the DNA molecule at different packaging levels. For this purpose, an improved K-means-based algorithm is suggested. This technique allows processing particle tracks containing a large number of ionization events without setting the number of clusters as an input parameter. Using this method, the ionization distribution pattern is analyzed depending on the cluster size and the particle's linear energy transfer.

  15. Multi-platform ’Omics Analysis of Human Ebola Virus Disease Pathogenesis

    Energy Technology Data Exchange (ETDEWEB)

    Eisfeld, Amie J.; Halfmann, Peter J.; Wendler, Jason P.; Kyle, Jennifer E.; Burnum-Johnson, Kristin E.; Peralta, Zuleyma; Maemura, Tadashi; Walters, Kevin B.; Watanabe, Tokiko; Fukuyama, Satoshi; Yamashita, Makoto; Jacobs, Jon M.; Kim, Young-Mo; Casey, Cameron P.; Stratton, Kelly G.; Webb-Robertson, Bobbie-Jo M.; Gritsenko, Marina A.; Monroe, Matthew E.; Weitz, Karl K.; Shukla, Anil K.; Tian, Mingyuan; Neumann, Gabriele; Reed, Jennifer L.; van Bakel, Harm; Metz, Thomas O.; Smith, Richard D.; Waters, Katrina M.; N' jai, Alhaji; Sahr, Foday; Kawaoka, Yoshihiro

    2017-12-01

    The pathogenesis of human Ebola virus disease (EVD) is complex. EVD is characterized by high levels of virus replication and dissemination, dysregulated immune responses, extensive virus- and host-mediated tissue damage, and disordered coagulation. To clarify how host responses contribute to EVD pathophysiology, we performed multi-platform ’omics analysis of peripheral blood mononuclear cells and plasma from EVD patients. Our results indicate that EVD molecular signatures overlap with those of sepsis, imply that pancreatic enzymes contribute to tissue damage in fatal EVD, and suggest that Ebola virus infection may induce aberrant neutrophils whose activity could explain hallmarks of fatal EVD. Moreover, integrated biomarker prediction identified putative biomarkers from different data platforms that differentiated survivors and fatalities early after infection. This work reveals insight into EVD pathogenesis, suggests an effective approach for biomarker identification, and provides an important community resource for further analysis of human EVD severity.

  16. Resolving Carbonate Platform Geometries on the Island of Bonaire, Caribbean Netherlands through Semi-Automatic GPR Facies Classification

    Science.gov (United States)

    Bowling, R. D.; Laya, J. C.; Everett, M. E.

    2018-05-01

    The study of exposed carbonate platforms provides observational constraints on regional tectonics and sea-level history. In this work Miocene-aged carbonate platform units of the Seroe Domi Formation are investigated, on the island of Bonaire, located in the Southern Caribbean. Ground penetrating radar (GPR) was used to probe near-surface structural geometries associated with these lithologies. The single cross-island transect described herein allowed for continuous mapping of geologic structures on kilometer length scales. Numerical analysis was applied to the data in the form of k-means clustering of structure-parallel vectors derived from image structure tensors. This methodology enables radar facies along the survey transect to be semi-automatically mapped. The results provide subsurface evidence to support previous surficial and outcrop observations, and reveal complex stratigraphy within the platform. From the GPR data analysis, progradational clinoform geometries were observed on the northeast side of the island which supports the tectonics and depositional trends of the region. Furthermore, several leeward-side radar facies are identified which correlate to environments of deposition conducive to dolomitization via reflux mechanisms.

  17. Cluster analysis of autoantibodies in 852 patients with systemic lupus erythematosus from a single center.

    Science.gov (United States)

    Artim-Esen, Bahar; Çene, Erhan; Şahinkaya, Yasemin; Ertan, Semra; Pehlivan, Özlem; Kamali, Sevil; Gül, Ahmet; Öcal, Lale; Aral, Orhan; Inanç, Murat

    2014-07-01

    Associations between autoantibodies and clinical features have been described in systemic lupus erythematosus (SLE). Herein, we aimed to define autoantibody clusters and their clinical correlations in a large cohort of patients with SLE. We analyzed 852 patients with SLE who attended our clinic. Seven autoantibodies were selected for cluster analysis: anti-DNA, anti-Sm, anti-RNP, anticardiolipin (aCL) immunoglobulin (Ig)G or IgM, lupus anticoagulant (LAC), anti-Ro, and anti-La. Two-step clustering and Kaplan-Meier survival analyses were used. Five clusters were identified. A cluster consisted of patients with only anti-dsDNA antibodies, a cluster of anti-Sm and anti-RNP, a cluster of aCL IgG/M and LAC, and a cluster of anti-Ro and anti-La antibodies. Analysis revealed 1 more cluster that consisted of patients who did not belong to any of the clusters formed by antibodies chosen for cluster analysis. Sm/RNP cluster had significantly higher incidence of pulmonary hypertension and Raynaud phenomenon. DsDNA cluster had the highest incidence of renal involvement. In the aCL/LAC cluster, there were significantly more patients with neuropsychiatric involvement, antiphospholipid syndrome, autoimmune hemolytic anemia, and thrombocytopenia. According to the Systemic Lupus International Collaborating Clinics damage index, the highest frequency of damage was in the aCL/LAC cluster. Comparison of 10 and 20 years survival showed reduced survival in the aCL/LAC cluster. This study supports the existence of autoantibody clusters with distinct clinical features in SLE and shows that forming clinical subsets according to autoantibody clusters may be useful in predicting the outcome of the disease. Autoantibody clusters in SLE may exhibit differences according to the clinical setting or population.

  18. Application of cluster analysis to geochemical compositional data for identifying ore-related geochemical anomalies

    Science.gov (United States)

    Zhou, Shuguang; Zhou, Kefa; Wang, Jinlin; Yang, Genfang; Wang, Shanshan

    2017-12-01

    Cluster analysis is a well-known technique that is used to analyze various types of data. In this study, cluster analysis is applied to geochemical data that describe 1444 stream sediment samples collected in northwestern Xinjiang with a sample spacing of approximately 2 km. Three algorithms (the hierarchical, k-means, and fuzzy c-means algorithms) and six data transformation methods (the z-score standardization, ZST; the logarithmic transformation, LT; the additive log-ratio transformation, ALT; the centered log-ratio transformation, CLT; the isometric log-ratio transformation, ILT; and no transformation, NT) are compared in terms of their effects on the cluster analysis of the geochemical compositional data. The study shows that, on the one hand, the ZST does not affect the results of column- or variable-based (R-type) cluster analysis, whereas the other methods, including the LT, the ALT, and the CLT, have substantial effects on the results. On the other hand, the results of the row- or observation-based (Q-type) cluster analysis obtained from the geochemical data after applying NT and the ZST are relatively poor. However, we derive some improved results from the geochemical data after applying the CLT, the ILT, the LT, and the ALT. Moreover, the k-means and fuzzy c-means clustering algorithms are more reliable than the hierarchical algorithm when they are used to cluster the geochemical data. We apply cluster analysis to the geochemical data to explore for Au deposits within the study area, and we obtain a good correlation between the results retrieved by combining the CLT or the ILT with the k-means or fuzzy c-means algorithms and the potential zones of Au mineralization. Therefore, we suggest that the combination of the CLT or the ILT with the k-means or fuzzy c-means algorithms is an effective tool to identify potential zones of mineralization from geochemical data.
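
    A minimal, hedged sketch of one combination the study found effective: the centered log-ratio (CLR) transformation of compositional data followed by k-means. The element counts and values below are synthetic placeholders; fuzzy c-means would require an additional package (e.g. scikit-fuzzy) and is omitted here.

```python
# Hedged sketch: CLR transformation of compositional geochemical data, then k-means.
import numpy as np
from sklearn.cluster import KMeans

def clr(x):
    """Centered log-ratio transform; rows are samples, columns are elements."""
    logx = np.log(x)
    return logx - logx.mean(axis=1, keepdims=True)

rng = np.random.default_rng(2)
raw = rng.lognormal(mean=1.0, sigma=0.5, size=(1444, 12))    # placeholder, 12 elements
composition = raw / raw.sum(axis=1, keepdims=True)           # closed (constant-sum) data

z = clr(composition)
labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(z)
print(np.bincount(labels))
```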

  19. Influence of birth cohort on age of onset cluster analysis in bipolar I disorder

    DEFF Research Database (Denmark)

    Bauer, M; Glenn, T; Alda, M

    2015-01-01

    Purpose: Two common approaches to identify subgroups of patients with bipolar disorder are clustering methodology (mixture analysis) based on the age of onset, and a birth cohort analysis. This study investigates if a birth cohort effect will influence the results of clustering on the age of onset...... cohort. Model-based clustering (mixture analysis) was then performed on the age of onset data using the residuals. Clinical variables in subgroups were compared. Results: There was a strong birth cohort effect. Without adjusting for the birth cohort, three subgroups were found by clustering. After...... on the age of onset, and that there is a birth cohort effect. Including the birth cohort adjustment altered the number and characteristics of subgroups detected when clustering by age of onset. Further investigation is needed to determine if combining both approaches will identify subgroups that are more...

  20. MMPI-2: Cluster Analysis of Personality Profiles in Perinatal Depression—Preliminary Evidence

    Directory of Open Access Journals (Sweden)

    Valentina Meuti

    2014-01-01

    Full Text Available Background. To assess personality characteristics of women who develop perinatal depression. Methods. The study started with a screening of a sample of 453 women in their third trimester of pregnancy, to which was administered a survey data form, the Edinburgh Postnatal Depression Scale (EPDS) and the Minnesota Multiphasic Personality Inventory 2 (MMPI-2). A clinical group of subjects with perinatal depression (PND, 55 subjects) was selected; clinical and validity scales of MMPI-2 were used as predictors in hierarchical cluster analysis carried out. Results. The analysis identified three clusters of personality profile: two “clinical” clusters (1 and 3) and an “apparently common” one (cluster 2). The first cluster (39.5%) collects structures of personality with prevalent obsessive or dependent functioning tending to develop a “psychasthenic” depression; the third cluster (13.95%) includes women with prevalent borderline functioning tending to develop “dysphoric” depression; the second cluster (46.5%) shows a normal profile with a “defensive” attitude, probably due to the presence of defense mechanisms or to the fear of stigma. Conclusion. Characteristics of personality have a key role in clinical manifestations of perinatal depression; it is important to detect them to identify mothers at risk and to plan targeted therapeutic interventions.

  1. MMPI-2: Cluster Analysis of Personality Profiles in Perinatal Depression—Preliminary Evidence

    Science.gov (United States)

    Grillo, Alessandra; Lauriola, Marco; Giacchetti, Nicoletta

    2014-01-01

    Background. To assess personality characteristics of women who develop perinatal depression. Methods. The study started with a screening of a sample of 453 women in their third trimester of pregnancy, to which was administered a survey data form, the Edinburgh Postnatal Depression Scale (EPDS) and the Minnesota Multiphasic Personality Inventory 2 (MMPI-2). A clinical group of subjects with perinatal depression (PND, 55 subjects) was selected; clinical and validity scales of MMPI-2 were used as predictors in hierarchical cluster analysis carried out. Results. The analysis identified three clusters of personality profile: two “clinical” clusters (1 and 3) and an “apparently common” one (cluster 2). The first cluster (39.5%) collects structures of personality with prevalent obsessive or dependent functioning tending to develop a “psychasthenic” depression; the third cluster (13.95%) includes women with prevalent borderline functioning tending to develop “dysphoric” depression; the second cluster (46.5%) shows a normal profile with a “defensive” attitude, probably due to the presence of defense mechanisms or to the fear of stigma. Conclusion. Characteristics of personality have a key role in clinical manifestations of perinatal depression; it is important to detect them to identify mothers at risk and to plan targeted therapeutic interventions. PMID:25574499

  2. A hierarchical cluster analysis of normal-tension glaucoma using spectral-domain optical coherence tomography parameters.

    Science.gov (United States)

    Bae, Hyoung Won; Ji, Yongwoo; Lee, Hye Sun; Lee, Naeun; Hong, Samin; Seong, Gong Je; Sung, Kyung Rim; Kim, Chan Yun

    2015-01-01

    Normal-tension glaucoma (NTG) is a heterogeneous disease, and there is still controversy about subclassifications of this disorder. On the basis of spectral-domain optical coherence tomography (SD-OCT), we subdivided NTG with hierarchical cluster analysis using optic nerve head (ONH) parameters and retinal nerve fiber layer (RNFL) thicknesses. A total of 200 eyes of 200 NTG patients between March 2011 and June 2012 underwent SD-OCT scans to measure ONH parameters and RNFL thicknesses. We classified NTG into homogeneous subgroups based on these variables using a hierarchical cluster analysis, and compared clusters to evaluate diverse NTG characteristics. Three clusters were found after hierarchical cluster analysis. Cluster 1 (62 eyes) had the thickest RNFL and widest rim area, and showed early glaucoma features. Cluster 2 (60 eyes) was characterized by the largest cup/disc ratio and cup volume, and showed advanced glaucomatous damage. Cluster 3 (78 eyes) had small disc areas in SD-OCT and comprised patients of significantly younger age, longer axial length, and greater myopia than the other 2 groups. A hierarchical cluster analysis of SD-OCT scans divided NTG patients into 3 groups based upon ONH parameters and RNFL thicknesses. It is anticipated that the small disc area group comprised of younger and more myopic patients may show unique features unlike the other 2 groups.
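
    A minimal, hedged sketch of the approach described: Ward hierarchical clustering of standardized SD-OCT parameters cut at three clusters. The column names and values are assumed placeholders, not the study's variables.

```python
# Hedged sketch: hierarchical (Ward) clustering of standardized SD-OCT parameters.
import numpy as np
import pandas as pd
from scipy.cluster.hierarchy import linkage, fcluster
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(3)
eyes = pd.DataFrame(
    rng.normal(size=(200, 5)),
    columns=["rim_area", "cup_disc_ratio", "cup_volume", "disc_area", "mean_rnfl"],
)  # placeholder measurements for 200 NTG eyes

z = StandardScaler().fit_transform(eyes)
tree = linkage(z, method="ward")
eyes["cluster"] = fcluster(tree, t=3, criterion="maxclust")
print(eyes.groupby("cluster").mean())            # compare cluster profiles
```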

  3. A nonparametric Bayesian approach for clustering bisulfate-based DNA methylation profiles.

    Science.gov (United States)

    Zhang, Lin; Meng, Jia; Liu, Hui; Huang, Yufei

    2012-01-01

    DNA methylation occurs in the context of a CpG dinucleotide. It is an important epigenetic modification, which can be inherited through cell division. The two major types of methylation include hypomethylation and hypermethylation. Unique methylation patterns have been shown to exist in diseases including various types of cancer. DNA methylation analysis promises to become a powerful tool in cancer diagnosis, treatment and prognostication. Large-scale methylation arrays are now available for studying methylation genome-wide. The Illumina methylation platform simultaneously measures cytosine methylation at more than 1500 CpG sites associated with over 800 cancer-related genes. Cluster analysis is often used to identify DNA methylation subgroups for prognosis and diagnosis. However, due to the unique non-Gaussian characteristics, traditional clustering methods may not be appropriate for DNA methylation data, and the determination of optimal cluster number is still problematic. A Dirichlet process beta mixture model (DPBMM) is proposed that models the DNA methylation levels as an infinite mixture of beta distributions. The model allows automatic learning of the relevant parameters such as the cluster mixing proportion, the parameters of the beta distribution for each cluster, and especially the number of potential clusters. Since the model is high dimensional and analytically intractable, a Gibbs sampling "no-gaps" solution is proposed for computing the posterior distributions and hence the parameter estimates. The proposed algorithm was tested on simulated data as well as methylation data from 55 Glioblastoma multiforme (GBM) brain tissue samples. To reduce the computational burden due to the high data dimensionality, a dimension reduction method is adopted. The two GBM clusters yielded by DPBMM are based on data with different numbers of loci (P-value < 0.1), while hierarchical clustering cannot yield statistically significant clusters.
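
    The paper's DPBMM Gibbs sampler is not available in standard libraries; as a hedged stand-in only, scikit-learn's BayesianGaussianMixture with a Dirichlet-process prior, applied to logit-transformed methylation fractions (M-values), likewise lets the effective number of clusters be learned rather than fixed. Data and dimensions below are synthetic placeholders.

```python
# Hedged stand-in (not the paper's beta-mixture model): Dirichlet-process Gaussian
# mixture on logit-transformed methylation fractions, with the number of occupied
# clusters inferred from the data.
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.default_rng(4)
beta_values = np.clip(rng.beta(2, 5, size=(55, 20)), 1e-3, 1 - 1e-3)  # placeholder
m_values = np.log2(beta_values / (1 - beta_values))                    # logit-like transform

dpgmm = BayesianGaussianMixture(
    n_components=10,                                    # upper bound on clusters
    weight_concentration_prior_type="dirichlet_process",
    covariance_type="diag",
    max_iter=500,
    random_state=0,
)
labels = dpgmm.fit_predict(m_values)
print("occupied clusters:", np.unique(labels).size)
```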

  4. Principal Component Clustering Approach to Teaching Quality Discriminant Analysis

    Science.gov (United States)

    Xian, Sidong; Xia, Haibo; Yin, Yubo; Zhai, Zhansheng; Shang, Yan

    2016-01-01

    Teaching quality is the lifeline of higher education. Many universities have made effective progress in evaluating teaching quality. In this paper, we establish the Students' evaluation of teaching (SET) discriminant analysis model and algorithm based on principal component clustering analysis. Additionally, we classify the SET…

  5. Evaluation of Portland cement from X-ray diffraction associated with cluster analysis

    International Nuclear Information System (INIS)

    Gobbo, Luciano de Andrade; Montanheiro, Tarcisio Jose; Montanheiro, Filipe; Sant'Agostino, Lilia Mascarenhas

    2013-01-01

    The Brazilian cement industry produced 64 million tons of cement in 2012, with noteworthy contribution of CP-II (slag), CP-III (blast furnace) and CP-IV (pozzolanic) cements. The industrial pole comprises about 80 factories that utilize raw materials of different origins and chemical compositions that require enhanced analytical technologies to optimize production in order to gain space in the growing consumer market in Brazil. This paper assesses the sensitivity of mineralogical analysis by X-ray diffraction associated with cluster analysis to distinguish different kinds of cements with different additions. This technique can be applied, for example, in the prospection of different types of limestone (calcitic, dolomitic and siliceous) as well as in the qualification of different clinkers. The cluster analysis does not require any specific knowledge of the mineralogical composition of the diffractograms to be clustered; rather, it is based on their similarity. The materials tested for addition have different origins: fly ashes from different power stations from South Brazil and slag from different steel plants in the Southeast. Cement with different additions of limestone and white Portland cement were also used. The Rietveld method of qualitative and quantitative analysis was used for measuring the results generated by the cluster analysis technique. (author)

  6. Using Cluster Analysis to Group Countries for Cost-effectiveness Analysis: An Application to Sub-Saharan Africa.

    Science.gov (United States)

    Russell, Louise B; Bhanot, Gyan; Kim, Sun-Young; Sinha, Anushua

    2018-02-01

    To explore the use of cluster analysis to define groups of similar countries for the purpose of evaluating the cost-effectiveness of a public health intervention-maternal immunization-within the constraints of a project budget originally meant for an overall regional analysis. We used the most common cluster analysis algorithm, K-means, and the most common measure of distance, Euclidean distance, to group 37 low-income, sub-Saharan African countries on the basis of 24 measures of economic development, general health resources, and past success in public health programs. The groups were tested for robustness and reviewed by regional disease experts. We explored 2-, 3- and 4-group clustering. Public health performance was consistently important in determining the groups. For the 2-group clustering, for example, infant mortality in Group 1 was 81 per 1,000 live births compared with 51 per 1,000 in Group 2, and 67% of children in Group 1 received DPT immunization compared with 87% in Group 2. The experts preferred four groups to fewer, on the ground that national decision makers would more readily recognize their country among four groups. Clusters defined by K-means clustering made sense to subject experts and allowed a more detailed evaluation of the cost-effectiveness of maternal immunization within the constraint of the project budget. The method may be useful for other evaluations that, without having the resources to conduct separate analyses for each unit, seek to inform decision makers in numerous countries or subdivisions within countries, such as states or counties.
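
    A minimal, hedged sketch of the grouping step described above: K-means with Euclidean distance on standardized country-level indicators, exploring 2 to 4 groups. The indicator names and values are assumed placeholders, not the study's 24 measures.

```python
# Hedged sketch: K-means (Euclidean distance) on standardized country indicators.
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(5)
countries = pd.DataFrame(
    rng.normal(size=(37, 4)),
    columns=["gdp_per_capita", "infant_mortality", "dpt_coverage", "physicians_per_1000"],
)  # placeholder values for 37 countries

z = StandardScaler().fit_transform(countries)
for k in (2, 3, 4):
    countries[f"group_k{k}"] = KMeans(n_clusters=k, n_init=25, random_state=0).fit_predict(z)

# Size of each group under the 2-, 3- and 4-cluster solutions
print(countries.filter(like="group_").apply(pd.Series.value_counts))
```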

  7. Massively Parallel, Molecular Analysis Platform Developed Using a CMOS Integrated Circuit With Biological Nanopores

    Science.gov (United States)

    Roever, Stefan

    2012-01-01

    A massively parallel, low cost molecular analysis platform will dramatically change the nature of protein, molecular and genomics research, DNA sequencing, and ultimately, molecular diagnostics. An integrated circuit (IC) with 264 sensors was fabricated using standard CMOS semiconductor processing technology. Each of these sensors is individually controlled with precision analog circuitry and is capable of single molecule measurements. Under electronic and software control, the IC was used to demonstrate the feasibility of creating and detecting lipid bilayers and biological nanopores using wild type α-hemolysin. The ability to dynamically create bilayers over each of the sensors will greatly accelerate pore development and pore mutation analysis. In addition, the noise performance of the IC was measured to be 30fA(rms). With this noise performance, single base detection of DNA was demonstrated using α-hemolysin. The data shows that a single molecule, electrical detection platform using biological nanopores can be operationalized and can ultimately scale to millions of sensors. Such a massively parallel platform will revolutionize molecular analysis and will completely change the field of molecular diagnostics in the future.

  8. A cluster analysis investigation of workaholism as a syndrome.

    Science.gov (United States)

    Aziz, Shahnaz; Zickar, Michael J

    2006-01-01

    Workaholism has been conceptualized as a syndrome although there have been few tests that explicitly consider its syndrome status. The authors analyzed a three-dimensional scale of workaholism developed by Spence and Robbins (1992) using cluster analysis. The authors identified three clusters of individuals, one of which corresponded to Spence and Robbins's profile of the workaholic (high work involvement, high drive to work, low work enjoyment). Consistent with previously conjectured relations with workaholism, individuals in the workaholic cluster were more likely to label themselves as workaholics, more likely to have acquaintances label them as workaholics, and more likely to have lower life satisfaction and higher work-life imbalance. The importance of considering workaholism as a syndrome and the implications for effective interventions are discussed. Copyright 2006 APA.

  9. Sejong Open Cluster Survey (SOS). 0. Target Selection and Data Analysis

    Science.gov (United States)

    Sung, Hwankyung; Lim, Beomdu; Bessell, Michael S.; Kim, Jinyoung S.; Hur, Hyeonoh; Chun, Moo-Young; Park, Byeong-Gon

    2013-06-01

    Star clusters are superb astrophysical laboratories containing cospatial and coeval samples of stars with similar chemical composition. We initiate the Sejong Open cluster Survey (SOS) - a project dedicated to providing homogeneous photometry of a large number of open clusters in the SAAO Johnson-Cousins' UBVI system. To achieve our main goal, we pay much attention to the observation of standard stars in order to reproduce the SAAO standard system. Many of our targets are relatively small sparse clusters that escaped previous observations. As clusters are considered building blocks of the Galactic disk, their physical properties such as the initial mass function, the pattern of mass segregation, etc. give valuable information on the formation and evolution of the Galactic disk. The spatial distribution of young open clusters will be used to revise the local spiral arm structure of the Galaxy. In addition, the homogeneous data can also be used to test stellar evolutionary theory, especially concerning rare massive stars. In this paper we present the target selection criteria, the observational strategy for accurate photometry, and the adopted calibrations for data analysis such as color-color relations, zero-age main sequence relations, Sp - M_V relations, Sp - T_{eff} relations, Sp - color relations, and T_{eff} - BC relations. Finally we provide some data analysis such as the determination of the reddening law, the membership selection criteria, and distance determination.

  10. A redox-switchable Au8-cluster sensor.

    Science.gov (United States)

    Wu, Te-Haw; Hsu, Yu-Yen; Lin, Shu-Yi

    2012-07-09

    The proof of concept of a simple sensing platform based on the fluorescence of a gold cluster consisting of eight atoms, which is easily manipulated by reduction and oxidation of a specific molecule in the absence of chemical linkers, is demonstrated. Without using any coupling reagents to arrange the distance of the donor-acceptor pair, the fluorescence of the Au(8) -cluster is immediately switched off in the presence of 2-pyridinethiol (2-PyT) quencher. Through an upward-curving Stern-Volmer plot, the system shows complex fluorescence quenching with a combination of static and dynamic quenching processes. To analyze the static quenching constant (V) by a "sphere of action" model, the collisional encounter between the Au(8) -cluster and 2-PyT presents a quenching radius (r) ≈5.8 nm, which is larger than the sum of the radii of the Au(8) -cluster and 2-PyT. This implies that fluorescence quenching can occur even though the Au(8) -cluster and 2-PyT are not very close to each other. The quenching pathway may be derived from a photoinduced electron-transfer process of the encounter pair between the Au(8) -cluster (as an electron donor) and 2-PyT (as an electron acceptor) to allow efficient fluorescence quenching in the absence of coupling reagents. Interestingly, the fluorescence is restored by oxidation of 2-PyT to form the corresponding disulfide compound and then quenched again after the reduction of the disulfide. This redox-switchable fluorescent Au(8) -cluster platform is a novel discovery, and its utility as a promising sensor for detecting H(2) O(2) -generating enzymatic transformations is demonstrated. Copyright © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  11. Importance of Viral Sequence Length and Number of Variable and Informative Sites in Analysis of HIV Clustering.

    Science.gov (United States)

    Novitsky, Vlad; Moyo, Sikhulile; Lei, Quanhong; DeGruttola, Victor; Essex, M

    2015-05-01

    To improve the methodology of HIV cluster analysis, we addressed how analysis of HIV clustering is associated with parameters that can affect the outcome of viral clustering. The extent of HIV clustering and tree certainty was compared between 401 HIV-1C near full-length genome sequences and subgenomic regions retrieved from the LANL HIV Database. Sliding window analysis was based on 99 windows of 1,000 bp and 45 windows of 2,000 bp. Potential associations between the extent of HIV clustering and sequence length and the number of variable and informative sites were evaluated. The near full-length genome HIV sequences showed the highest extent of HIV clustering and the highest tree certainty. At the bootstrap threshold of 0.80 in maximum likelihood (ML) analysis, 58.9% of near full-length HIV-1C sequences but only 15.5% of partial pol sequences (ViroSeq) were found in clusters. Among HIV-1 structural genes, pol showed the highest extent of clustering (38.9% at a bootstrap threshold of 0.80), although it was significantly lower than in the near full-length genome sequences. The extent of HIV clustering was significantly higher for sliding windows of 2,000 bp than 1,000 bp. We found a strong association between the sequence length and proportion of HIV sequences in clusters, and a moderate association between the number of variable and informative sites and the proportion of HIV sequences in clusters. In HIV cluster analysis, the extent of detectable HIV clustering is directly associated with the length of viral sequences used, as well as the number of variable and informative sites. Near full-length genome sequences could provide the most informative HIV cluster analysis. Selected subgenomic regions with a high extent of HIV clustering and high tree certainty could also be considered as a second choice.
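
    A hedged sketch of the bookkeeping implied by the study: counting variable and parsimony-informative sites in sliding 1,000-bp windows of an alignment (the phylogenetic clustering itself would be run with external tools). The alignment below is synthetic and the window sizes are taken from the abstract.

```python
# Hedged sketch: variable and parsimony-informative site counts per sliding window.
from collections import Counter
import random

def site_counts(columns):
    variable = informative = 0
    for col in columns:
        counts = Counter(b for b in col if b in "ACGT")
        if len(counts) > 1:
            variable += 1
            # parsimony-informative: at least two bases each seen at least twice
            if sum(1 for c in counts.values() if c >= 2) >= 2:
                informative += 1
    return variable, informative

random.seed(6)
n_seq, aln_len, window, step = 50, 9000, 1000, 500        # placeholder sizes
alignment = ["".join(random.choice("ACGT") for _ in range(aln_len)) for _ in range(n_seq)]

for start in range(0, aln_len - window + 1, step):
    cols = [[seq[i] for seq in alignment] for i in range(start, start + window)]
    v, i = site_counts(cols)
    print(f"window {start}-{start + window}: variable={v} informative={i}")
```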

  12. Fault detection of flywheel system based on clustering and principal component analysis

    Directory of Open Access Journals (Sweden)

    Wang Rixin

    2015-12-01

    Full Text Available Considering the nonlinear, multifunctional properties of a double flywheel with closed-loop control, a two-step method including clustering and principal component analysis is proposed to detect the two faults in the multifunctional flywheels. In the first step of the proposed algorithm, clustering is used as feature recognition to check the instructions of the “integrated power and attitude control” system, such as attitude control, energy storage or energy discharge. These commands ask the flywheel system to work in different operation modes. Therefore, the relationship of parameters in different operations can define the cluster structure of the training data. Ordering points to identify the clustering structure (OPTICS) can automatically identify these clusters through the reachability plot. The K-means algorithm can then divide the training data into the corresponding operations according to the reachability plot. Finally, the last step of the proposed model defines the relationship of parameters in each operation through the principal component analysis (PCA) method. Compared with the PCA model, the proposed approach is capable of identifying new clusters and learning the new behavior of incoming data. The simulation results show that it can effectively detect the faults in the multifunctional flywheel system.
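
    A hedged sketch of the two-step idea with synthetic data: OPTICS reachability and k-means separate operation modes, then a per-mode PCA model flags faults through reconstruction error. The mode count, component count and control limit below are illustrative assumptions, not the paper's settings.

```python
# Hedged sketch: OPTICS/k-means separate operation modes; per-mode PCA reconstruction
# error is used as a simple fault indicator. All data are synthetic.
import numpy as np
from sklearn.cluster import OPTICS, KMeans
from sklearn.decomposition import PCA

rng = np.random.default_rng(7)
train = np.vstack([rng.normal(loc=m, scale=0.3, size=(300, 6)) for m in (0.0, 3.0, 6.0)])

reach = OPTICS(min_samples=20).fit(train).reachability_   # inspect to read off the modes
n_modes = 3                                                # assumed from the reachability plot
modes = KMeans(n_clusters=n_modes, n_init=10, random_state=0).fit(train)

pca_models, thresholds = [], []
for k in range(n_modes):
    block = train[modes.labels_ == k]
    pca = PCA(n_components=2).fit(block)
    err = np.sum((block - pca.inverse_transform(pca.transform(block))) ** 2, axis=1)
    pca_models.append(pca)
    thresholds.append(np.percentile(err, 99))              # simple control limit

def is_faulty(x):
    k = modes.predict(x.reshape(1, -1))[0]
    pca = pca_models[k]
    e = np.sum((x - pca.inverse_transform(pca.transform(x.reshape(1, -1)))) ** 2)
    return e > thresholds[k]

print(is_faulty(train[0]), is_faulty(train[0] + 5.0))       # expected: False, True
```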

  13. Technology Clusters Exploration for Patent Portfolio through Patent Abstract Analysis

    Directory of Open Access Journals (Sweden)

    Gabjo Kim

    2016-12-01

    Full Text Available This study explores technology clusters through patent analysis. The aim of exploring technology clusters is to grasp competitors’ levels of sustainable research and development (R&D) and establish a sustainable strategy for entering an industry. To achieve this, we first grouped the patent documents with similar technologies by applying affinity propagation (AP) clustering, which is effective while grouping large amounts of data. Next, in order to define the technology clusters, we adopted the term frequency-inverse document frequency (TF-IDF) weight, which lists the terms in order of importance. We collected the patent data of Korean electric car companies from the United States Patent and Trademark Office (USPTO) to verify our proposed methodology. As a result, our proposed methodology presents more detailed information on the Korean electric car industry than previous studies.
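
    A minimal, hedged sketch of the pipeline described: affinity propagation over TF-IDF representations of patent abstracts, with the top-weighted terms used to name each cluster. The abstracts are invented placeholders; preference and damping would need tuning on real data.

```python
# Hedged sketch: TF-IDF vectors, cosine similarity, affinity propagation clustering,
# and top TF-IDF terms as cluster labels. Abstracts below are placeholders.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.cluster import AffinityPropagation

abstracts = [
    "battery pack thermal management for electric vehicle",
    "regenerative braking control of electric motor",
    "lithium ion battery cell cooling structure",
    "charging connector and charging station interface",
    "inverter control for traction motor efficiency",
    "fast charging protocol for vehicle battery",
]  # placeholder patent abstracts

vec = TfidfVectorizer(stop_words="english")
X = vec.fit_transform(abstracts)
sim = cosine_similarity(X)                                  # similarity matrix for AP
ap = AffinityPropagation(affinity="precomputed", damping=0.9, random_state=0).fit(sim)

terms = np.array(vec.get_feature_names_out())
for k in np.unique(ap.labels_):
    centroid = np.asarray(X[ap.labels_ == k].mean(axis=0)).ravel()
    print(k, terms[centroid.argsort()[::-1][:3]])           # top terms per cluster
```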

  14. Application of Cluster Analysis in Assessment of Dietary Habits of Secondary School Students

    Directory of Open Access Journals (Sweden)

    Zalewska Magdalena

    2014-12-01

    Full Text Available Maintenance of proper health and prevention of diseases of civilization are now significant public health problems. Nutrition is an important factor in the development of youth, as well as the current and future state of health. The aim of the study was to show the benefits of the application of cluster analysis to assess the dietary habits of high school students. The survey was carried out on 1,631 eighteen-year-old students in seven randomly selected secondary schools in Bialystok using a self-prepared anonymous questionnaire. An evaluation of the time of day meals were eaten and the number of meals consumed was made for the surveyed students. The cluster analysis allowed distinguishing characteristic structures of dietary habits in the observed population. Four clusters were identified, which were characterized by relative internal homogeneity and substantial variation in terms of the number of meals during the day and the time of their consumption. The most important characteristics of cluster 1 were cumulated food ration in 2 or 3 meals and long intervals between meals. Cluster 2 was characterized by eating the recommended number of 4 or 5 meals a day. In the 3rd cluster, students ate 3 meals a day with large intervals between them, and in the 4th they had four meals a day while maintaining proper intervals between them. In all clusters dietary mistakes occurred, but most of them were related to clusters 1 and 3. Cluster analysis allowed for the identification of major flaws in nutrition, which may include irregular eating and skipping meals, and indicated possible connections between eating patterns and disturbances of body weight in the examined population.

  15. Profiling physical activity motivation based on self-determination theory: a cluster analysis approach.

    Science.gov (United States)

    Friederichs, Stijn Ah; Bolman, Catherine; Oenema, Anke; Lechner, Lilian

    2015-01-01

    In order to promote physical activity uptake and maintenance in individuals who do not comply with physical activity guidelines, it is important to increase our understanding of physical activity motivation among this group. The present study aimed to examine motivational profiles in a large sample of adults who do not comply with physical activity guidelines. The sample for this study consisted of 2473 individuals (31.4% male; age 44.6 ± 12.9). In order to generate motivational profiles based on motivational regulation, a cluster analysis was conducted. One-way analyses of variance were then used to compare the clusters in terms of demographics, physical activity level, motivation to be active and subjective experience while being active. Three motivational clusters were derived based on motivational regulation scores: a low motivation cluster, a controlled motivation cluster and an autonomous motivation cluster. These clusters differed significantly from each other with respect to physical activity behavior, motivation to be active and subjective experience while being active. Overall, the autonomous motivation cluster displayed more favorable characteristics compared to the other two clusters. The results of this study provide additional support for the importance of autonomous motivation in the context of physical activity behavior. The three derived clusters may be relevant in the context of physical activity interventions as individuals within the different clusters might benefit most from different intervention approaches. In addition, this study shows that cluster analysis is a useful method for differentiating between motivational profiles in large groups of individuals who do not comply with physical activity guidelines.

  16. Deconstructing Bipolar Disorder and Schizophrenia: A cross-diagnostic cluster analysis of cognitive phenotypes.

    Science.gov (United States)

    Lee, Junghee; Rizzo, Shemra; Altshuler, Lori; Glahn, David C; Miklowitz, David J; Sugar, Catherine A; Wynn, Jonathan K; Green, Michael F

    2017-02-01

    Bipolar disorder (BD) and schizophrenia (SZ) show substantial overlap. It has been suggested that a subgroup of patients might contribute to these overlapping features. This study employed a cross-diagnostic cluster analysis to identify subgroups of individuals with shared cognitive phenotypes. 143 participants (68 BD patients, 39 SZ patients and 36 healthy controls) completed a battery of EEG and performance assessments on perception, nonsocial cognition and social cognition. A K-means cluster analysis was conducted with all participants across diagnostic groups. Clinical symptoms, functional capacity, and functional outcome were assessed in patients. A two-cluster solution across 3 groups was the most stable. One cluster including 44 BD patients, 31 controls and 5 SZ patients showed better cognition (High cluster) than the other cluster with 24 BD patients, 35 SZ patients and 5 controls (Low cluster). BD patients in the High cluster performed better than BD patients in the Low cluster across cognitive domains. Within each cluster, participants with different clinical diagnoses showed different profiles across cognitive domains. All patients were in the chronic phase and out of a mood episode at the time of assessment, and most of the assessments were behavioral measures. This study identified two clusters with shared cognitive phenotype profiles that were not proxies for clinical diagnoses. The finding of better social cognitive performance of BD patients than SZ patients in the Low cluster suggests that relatively preserved social cognition may be important for identifying disease processes distinct to each disorder. Copyright © 2016 Elsevier B.V. All rights reserved.

  17. Schedulability Analysis and Optimization for the Synthesis of Multi-Cluster Distributed Embedded Systems

    DEFF Research Database (Denmark)

    Pop, Paul; Eles, Petru; Peng, Zebo

    2003-01-01

    We present an approach to schedulability analysis for the synthesis of multi-cluster distributed embedded systems consisting of time-triggered and event-triggered clusters, interconnected via gateways. We have also proposed a buffer size and worst case queuing delay analysis for the gateways......, responsible for routing inter-cluster traffic. Optimization heuristics for the priority assignment and synthesis of bus access parameters aimed at producing a schedulable system with minimal buffer needs have been proposed. Extensive experiments and a real-life example show the efficiency of our approaches....

  18. Schedulability Analysis and Optimization for the Synthesis of Multi-Cluster Distributed Embedded Systems

    DEFF Research Database (Denmark)

    Pop, Paul; Eles, Petru; Peng, Zebo

    2003-01-01

    An approach to schedulability analysis for the synthesis of multi-cluster distributed embedded systems consisting of time-triggered and event-triggered clusters, interconnected via gateways, is presented. A buffer size and worst case queuing delay analysis for the gateways, responsible for routing...... inter-cluster traffic, is also proposed. Optimisation heuristics for the priority assignment and synthesis of bus access parameters aimed at producing a schedulable system with minimal buffer needs have been proposed. Extensive experiments and a real-life example show the efficiency of the approaches....

  19. MiSTIC, an integrated platform for the analysis of heterogeneity in large tumour transcriptome datasets.

    Science.gov (United States)

    Lemieux, Sebastien; Sargeant, Tobias; Laperrière, David; Ismail, Houssam; Boucher, Geneviève; Rozendaal, Marieke; Lavallée, Vincent-Philippe; Ashton-Beaucage, Dariel; Wilhelm, Brian; Hébert, Josée; Hilton, Douglas J; Mader, Sylvie; Sauvageau, Guy

    2017-07-27

    Genome-wide transcriptome profiling has enabled non-supervised classification of tumours, revealing different sub-groups characterized by specific gene expression features. However, the biological significance of these subtypes remains for the most part unclear. We describe herein an interactive platform, Minimum Spanning Trees Inferred Clustering (MiSTIC), that integrates the direct visualization and comparison of the gene correlation structure between datasets, the analysis of the molecular causes underlying co-variations in gene expression in cancer samples, and the clinical annotation of tumour sets defined by the combined expression of selected biomarkers. We have used MiSTIC to highlight the roles of specific transcription factors in breast cancer subtype specification, to compare the aspects of tumour heterogeneity targeted by different prognostic signatures, and to highlight biomarker interactions in AML. A version of MiSTIC preloaded with datasets described herein can be accessed through a public web server (http://mistic.iric.ca); in addition, the MiSTIC software package can be obtained (github.com/iric-soft/MiSTIC) for local use with personalized datasets. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  20. Cross-platform analysis of cancer microarray data improves gene expression based classification of phenotypes

    Directory of Open Access Journals (Sweden)

    Eils Roland

    2005-11-01

    Full Text Available Abstract Background The extensive use of DNA microarray technology in the characterization of the cell transcriptome is leading to an ever increasing amount of microarray data from cancer studies. Although similar questions for the same type of cancer are addressed in these different studies, a comparative analysis of their results is hampered by the use of heterogeneous microarray platforms and analysis methods. Results In contrast to a meta-analysis approach where results of different studies are combined on an interpretative level, we investigate here how to directly integrate raw microarray data from different studies for the purpose of supervised classification analysis. We use median rank scores and quantile discretization to derive numerically comparable measures of gene expression from different platforms. These transformed data are then used for training of classifiers based on support vector machines. We apply this approach to six publicly available cancer microarray gene expression data sets, which consist of three pairs of studies, each examining the same type of cancer, i.e. breast cancer, prostate cancer or acute myeloid leukemia. For each pair, one study was performed by means of cDNA microarrays and the other by means of oligonucleotide microarrays. In each pair, high classification accuracies (> 85%) were achieved with training and testing on data instances randomly chosen from both data sets in a cross-validation analysis. To exemplify the potential of this cross-platform classification analysis, we use two leukemia microarray data sets to show that important genes with regard to the biology of leukemia are selected in an integrated analysis, which are missed in either single-set analysis. Conclusion Cross-platform classification of multiple cancer microarray data sets yields discriminative gene expression signatures that are found and validated on a large number of microarray samples, generated by different laboratories and
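
    A minimal, hedged sketch of the integration idea with synthetic data: per-sample rank transformation and quantile discretization make expression values from two platforms numerically comparable before a linear SVM is trained on the pooled data. Bin count, sample sizes and labels are assumed placeholders.

```python
# Hedged sketch: per-sample quantile discretization across platforms, then SVM.
import numpy as np
from scipy.stats import rankdata
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

def quantile_discretize(expr, n_bins=8):
    """Rank genes within each sample, then map ranks to n_bins quantile bins."""
    ranks = np.apply_along_axis(rankdata, 1, expr)
    return np.floor((ranks - 1) / expr.shape[1] * n_bins).astype(int)

rng = np.random.default_rng(8)
cdna = rng.normal(size=(60, 500))                          # platform A (placeholder)
oligo = rng.normal(loc=2.0, scale=3.0, size=(40, 500))     # platform B, different scale
labels = np.concatenate([rng.integers(0, 2, 60), rng.integers(0, 2, 40)])

X = np.vstack([quantile_discretize(cdna), quantile_discretize(oligo)])
scores = cross_val_score(SVC(kernel="linear"), X, labels, cv=5)
print("cross-validated accuracy:", scores.mean())
```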

  1. DGA Clustering and Analysis: Mastering Modern, Evolving Threats, DGALab

    Directory of Open Access Journals (Sweden)

    Alexander Chailytko

    2016-05-01

    Full Text Available Domain Generation Algorithms (DGAs) are a basic building block used in almost all modern malware. Malware researchers have attempted to tackle the DGA problem with various tools and techniques, with varying degrees of success. We present a complex solution to populate a DGA feed using reversed DGAs, third-party feeds, and smart DGA extraction and clustering based on emulation of a large number of samples. Smart DGA extraction requires no reverse engineering and works regardless of the DGA type or initialization vector, while enabling a cluster-based analysis. Our method also automatically allows analysis of the whole malware family, specific campaign, etc. We present our system and demonstrate its abilities on more than 20 malware families. This includes showing connections between different campaigns, as well as comparing results. Most importantly, we discuss how to utilize the outcome of the analysis to create smarter protections against similar malware.

  2. Genetic Diversity and Relationships of Neolamarckia cadamba (Roxb.) Bosser progenies through cluster analysis

    Directory of Open Access Journals (Sweden)

    M. Preethi Shree

    2018-04-01

    Full Text Available Genetic diversity analysis was conducted for biometric attributes in 20 progenies of Neolamarckia cadamba. The application of the D2 clustering technique to Neolamarckia cadamba genetic resources resolved the 20 progenies into five clusters. The maximum intra-cluster distance was shown by cluster II. The maximum inter-cluster distance was recorded between clusters III and V, which indicated the presence of a wider genetic distance between Neolamarckia cadamba progenies. Among the growth attributes, volume (36.84%) contributed the maximum towards genetic divergence, followed by bole height, basal diameter, tree height and number of branches in Neolamarckia cadamba progenies.

  3. Predicting healthcare outcomes in prematurely born infants using cluster analysis.

    Science.gov (United States)

    MacBean, Victoria; Lunt, Alan; Drysdale, Simon B; Yarzi, Muska N; Rafferty, Gerrard F; Greenough, Anne

    2018-05-23

    Prematurely born infants are at high risk of respiratory morbidity following neonatal unit discharge, though prediction of outcomes is challenging. We have tested the hypothesis that cluster analysis would identify discrete groups of prematurely born infants with differing respiratory outcomes during infancy. A total of 168 infants (median (IQR) gestational age 33 (31-34) weeks) were recruited in the neonatal period from consecutive births in a tertiary neonatal unit. The baseline characteristics of the infants were used to classify them into hierarchical agglomerative clusters. Rates of viral lower respiratory tract infections (LRTIs) were recorded for 151 infants in the first year after birth. Infants could be classified according to birth weight and duration of neonatal invasive mechanical ventilation (MV) into three clusters. Cluster one (MV ≤5 days) had few LRTIs. Clusters two and three (both MV ≥6 days, but BW ≥882 g or <882 g, respectively) had significantly higher LRTI rates. Cluster two had a higher proportion of infants experiencing respiratory syncytial virus LRTIs (P = 0.01) and cluster three a higher proportion of rhinovirus LRTIs (P < 0.001). CONCLUSIONS: Readily available clinical data allowed classification of prematurely born infants into one of three distinct groups with differing subsequent respiratory morbidity in infancy. © 2018 Wiley Periodicals, Inc.

  4. Analysis and comparison of very large metagenomes with fast clustering and functional annotation

    Directory of Open Access Journals (Sweden)

    Li Weizhong

    2009-10-01

    Full Text Available Abstract Background The remarkable advance of metagenomics presents significant new challenges in data analysis. Metagenomic datasets (metagenomes) are large collections of sequencing reads from anonymous species within particular environments. Computational analyses for very large metagenomes are extremely time-consuming, and there are often many novel sequences in these metagenomes that are not fully utilized. The number of available metagenomes is rapidly increasing, so fast and efficient metagenome comparison methods are in great demand. Results The new metagenomic data analysis method Rapid Analysis of Multiple Metagenomes with a Clustering and Annotation Pipeline (RAMMCAP) was developed using an ultra-fast sequence clustering algorithm, fast protein family annotation tools, and a novel statistical metagenome comparison method that employs a unique graphic interface. RAMMCAP processes extremely large datasets with only moderate computational effort. It identifies raw read clusters and protein clusters that may include novel gene families, and compares metagenomes using clusters or functional annotations calculated by RAMMCAP. In this study, RAMMCAP was applied to the two largest available metagenomic collections, the "Global Ocean Sampling" and the "Metagenomic Profiling of Nine Biomes". Conclusion RAMMCAP is a very fast method that can cluster and annotate one million metagenomic reads in only hundreds of CPU hours. It is available from http://tools.camera.calit2.net/camera/rammcap/.

  5. Study on Adaptive Parameter Determination of Cluster Analysis in Urban Management Cases

    Science.gov (United States)

    Fu, J. Y.; Jing, C. F.; Du, M. Y.; Fu, Y. L.; Dai, P. P.

    2017-09-01

    The fine management of cities is an important way to realize the smart city. Data mining that applies spatial clustering analysis to urban management cases can be used in the evaluation of urban public facilities deployment, can support policy decisions, and also provides technical support for the fine management of the city. Aiming at the problem that the density-based DBSCAN algorithm cannot realize adaptive parameter determination, this paper proposes an optimizing method of adaptive parameter determination based on spatial analysis. First, Ripley's K function is analyzed for the data set to realize adaptive determination of the global parameter Eps, which means setting the maximum aggregation scale as the range of data clustering. Then, every point object's highest-frequency neighbour count within the range of Eps is calculated using a K-D tree and set as the value of clustering density to realize the adaptive determination of the global parameter MinPts. The R language was then used to optimize the above process to accomplish the precise clustering of typical urban management cases. The experimental results based on a typical case of urban management in XiCheng district of Beijing show that the new DBSCAN clustering algorithm presented in this paper takes full account of the data's spatial and statistical characteristics, which show an obvious clustering feature, and has better applicability and high quality. The results of the study are not only helpful for the formulation of urban management policies and the allocation of urban management supervisors in XiCheng District of Beijing, but also for other cities and related fields.

  6. STUDY ON ADAPTIVE PARAMETER DETERMINATION OF CLUSTER ANALYSIS IN URBAN MANAGEMENT CASES

    Directory of Open Access Journals (Sweden)

    J. Y. Fu

    2017-09-01

    Full Text Available The fine management of cities is an important way to realize the smart city. Data mining that applies spatial clustering analysis to urban management cases can be used in the evaluation of urban public facilities deployment, can support policy decisions, and also provides technical support for the fine management of the city. Aiming at the problem that the density-based DBSCAN algorithm cannot realize adaptive parameter determination, this paper proposes an optimizing method of adaptive parameter determination based on spatial analysis. First, Ripley's K function is analyzed for the data set to realize adaptive determination of the global parameter Eps, which means setting the maximum aggregation scale as the range of data clustering. Then, every point object's highest-frequency neighbour count within the range of Eps is calculated using a K-D tree and set as the value of clustering density to realize the adaptive determination of the global parameter MinPts. The R language was then used to optimize the above process to accomplish the precise clustering of typical urban management cases. The experimental results based on a typical case of urban management in XiCheng district of Beijing show that the new DBSCAN clustering algorithm presented in this paper takes full account of the data's spatial and statistical characteristics, which show an obvious clustering feature, and has better applicability and high quality. The results of the study are not only helpful for the formulation of urban management policies and the allocation of urban management supervisors in XiCheng District of Beijing, but also for other cities and related fields.
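
    A hedged, simplified sketch of adaptive DBSCAN parameter selection on point data: the study derives the clustering scale from Ripley's K and the density threshold from neighbour counts; here Eps is taken from the k-distance curve (a common heuristic) and MinPts from the typical neighbour count within Eps. All coordinates and constants below are placeholders.

```python
# Hedged, simplified sketch: estimate Eps from the k-distance curve and MinPts from
# neighbour counts within Eps, then run DBSCAN. Synthetic point data.
import numpy as np
from sklearn.neighbors import NearestNeighbors
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(9)
cases = np.vstack([
    rng.normal(loc=(0, 0), scale=0.2, size=(300, 2)),
    rng.normal(loc=(3, 3), scale=0.3, size=(300, 2)),
    rng.uniform(-2, 5, size=(100, 2)),                     # background noise
])  # placeholder case coordinates

k = 4
knn = NearestNeighbors(n_neighbors=k + 1).fit(cases)
kdist = np.sort(knn.kneighbors(cases)[0][:, -1])           # distance to k-th neighbour
eps = np.quantile(kdist, 0.9)                              # crude knee estimate

neighbour_sets = NearestNeighbors(radius=eps).fit(cases).radius_neighbors(
    cases, return_distance=False)
min_pts = int(np.median([len(s) for s in neighbour_sets])) # typical density within Eps

labels = DBSCAN(eps=eps, min_samples=min_pts).fit_predict(cases)
n_clusters = len(set(labels)) - (1 if -1 in labels else 0)
print("eps:", round(float(eps), 3), "min_pts:", min_pts, "clusters:", n_clusters)
```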

  7. HICOSMO - X-ray analysis of a complete sample of galaxy clusters

    Science.gov (United States)

    Schellenberger, G.; Reiprich, T.

    2017-10-01

    Galaxy clusters are known to be the largest virialized objects in the Universe. Based on the theory of structure formation one can use them as cosmological probes, since they originate from collapsed overdensities in the early Universe and witness its history. The X-ray regime provides the unique possibility to measure in detail the most massive visible component, the intra cluster medium. Using Chandra observations of a local sample of 64 bright clusters (HIFLUGCS) we provide total (hydrostatic) and gas mass estimates of each cluster individually. Making use of the completeness of the sample we quantify two interesting cosmological parameters by a Bayesian cosmological likelihood analysis. We find Ω_{M}=0.3±0.01 and σ_{8}=0.79±0.03 (statistical uncertainties) using our default analysis strategy combining both, a mass function analysis and the gas mass fraction results. The main sources of biases that we discuss and correct here are (1) the influence of galaxy groups (higher incompleteness in parent samples and a differing behavior of the L_{x} - M relation), (2) the hydrostatic mass bias (as determined by recent hydrodynamical simulations), (3) the extrapolation of the total mass (comparing various methods), (4) the theoretical halo mass function and (5) other cosmological (non-negligible neutrino mass), and instrumental (calibration) effects.

  8. Comparison of the Agilent, ROMA/NimbleGen and Illumina platforms for classification of copy number alterations in human breast tumors

    Directory of Open Access Journals (Sweden)

    Naume B

    2008-08-01

    Full Text Available Abstract Background Microarray Comparative Genomic Hybridization (array CGH) provides a means to examine DNA copy number aberrations. Various platforms, brands and underlying technologies are available, facing the user with many choices regarding platform sensitivity and number, localization, and density distribution of probes. Results We evaluate three different platforms presenting different nature and arrangement of the probes: The Agilent Human Genome CGH Microarray 44 k, the ROMA/NimbleGen Representational Oligonucleotide Microarray 82 k, and the Illumina Human-1 Genotyping 109 k BeadChip, with Agilent being gene oriented, ROMA/NimbleGen being genome oriented, and Illumina being genotyping oriented. We investigated copy number changes in 20 human breast tumor samples representing different gene expression subclasses, using a suite of graphical and statistical methods designed to work across platforms. Despite substantial differences in the composition and spatial distribution of probes, the comparison revealed high overall concordance. Notably however, some short amplifications and deletions of potential biological importance were not detected by all platforms. Both correlation and cluster analysis indicate a somewhat higher similarity between ROMA/NimbleGen and Illumina than between Agilent and the other two platforms. The programs developed for the analysis are available from http://www.ifi.uio.no/bioinf/Projects/. Conclusion We conclude that platforms based on different technology principles reveal similar aberration patterns, although we observed some unique amplification or deletion peaks at various locations, only detected by one of the platforms. The correct platform choice for a particular study is dependent on whether the appointed research intention is gene, genome, or genotype oriented.

  9. Cloud computing for comparative genomics with windows azure platform.

    Science.gov (United States)

    Kim, Insik; Jung, Jae-Yoon; Deluca, Todd F; Nelson, Tristan H; Wall, Dennis P

    2012-01-01

    Cloud computing services have emerged as a cost-effective alternative for cluster systems as the number of genomes and required computation power to analyze them increased in recent years. Here we introduce the Microsoft Azure platform with detailed execution steps and a cost comparison with Amazon Web Services.

  10. CHOOSING A HEALTH INSTITUTION WITH MULTIPLE CORRESPONDENCE ANALYSIS AND CLUSTER ANALYSIS IN A POPULATION BASED STUDY

    Directory of Open Access Journals (Sweden)

    ASLI SUNER

    2013-06-01

    Full Text Available Multiple correspondence analysis is a method that makes it easier to interpret the categorical variables given in contingency tables, showing the similarities, associations and divergences among these variables via graphics in a lower dimensional space. Clustering methods help to classify the grouped data according to their similarities and to obtain useful summaries from them. In this study, interpretations of multiple correspondence analysis are supported by cluster analysis; factors affecting the health institution referred to, such as age, disease group and health insurance, are examined, and the aim is to compare the results of the two methods.
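
    A hedged approximation of the workflow with invented survey responses: instead of a dedicated MCA package, the categorical variables are one-hot encoded and reduced with truncated SVD (closely related to MCA), and the resulting coordinates are grouped with k-means. Column names and categories are assumptions for illustration only.

```python
# Hedged approximation: one-hot encoding + truncated SVD as an MCA-like map,
# followed by k-means grouping of the respondents. Data are placeholders.
import pandas as pd
from sklearn.decomposition import TruncatedSVD
from sklearn.cluster import KMeans

survey = pd.DataFrame({
    "age_group":     ["<30", "30-50", ">50", "<30", ">50", "30-50"],
    "disease_group": ["acute", "chronic", "chronic", "acute", "acute", "chronic"],
    "insurance":     ["public", "private", "public", "none", "public", "private"],
    "institution":   ["clinic", "hospital", "hospital", "clinic", "clinic", "hospital"],
})  # placeholder responses

indicator = pd.get_dummies(survey)                          # complete disjunctive table
coords = TruncatedSVD(n_components=2, random_state=0).fit_transform(indicator)
survey["cluster"] = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(coords)
print(survey)
```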

  11. Identifying influential individuals on intensive care units: using cluster analysis to explore culture.

    Science.gov (United States)

    Fong, Allan; Clark, Lindsey; Cheng, Tianyi; Franklin, Ella; Fernandez, Nicole; Ratwani, Raj; Parker, Sarah Henrickson

    2017-07-01

    The objective of this paper is to identify attribute patterns of influential individuals in intensive care units using unsupervised cluster analysis. Despite the acknowledgement that culture of an organisation is critical to improving patient safety, specific methods to shift culture have not been explicitly identified. A social network analysis survey was conducted and an unsupervised cluster analysis was used. A total of 100 surveys were gathered. Unsupervised cluster analysis was used to group individuals with similar dimensions highlighting three general genres of influencers: well-rounded, knowledge and relational. Culture is created locally by individual influencers. Cluster analysis is an effective way to identify common characteristics among members of an intensive care unit team that are noted as highly influential by their peers. To change culture, identifying and then integrating the influencers in intervention development and dissemination may create more sustainable and effective culture change. Additional studies are ongoing to test the effectiveness of utilising these influencers to disseminate patient safety interventions. This study offers an approach that can be helpful in both identifying and understanding influential team members and may be an important aspect of developing methods to change organisational culture. © 2017 John Wiley & Sons Ltd.

  12. Diagnostics of subtropical plants functional state by cluster analysis

    Directory of Open Access Journals (Sweden)

    Oksana Belous

    2016-05-01

    Full Text Available The article presents an example of applying statistical methods of data analysis to the diagnosis of the adaptive capacity of subtropical plant varieties. We describe the selection indicators and basic physiological parameters that were defined as diagnostic. Evaluation was based on a set of water-regime parameters: determination of the water deficit of the leaves, determination of the fractional composition of water, and measurement of the concentration of cell sap (CCS) (for tea culture flushes). These parameters are characterized by high lability and high responsiveness to the effects of many abiotic factors, which required particular care in the selection of plant material for analysis and consideration of their impact on sustainability. On the basis of the experimental data, pairwise correlation coefficients between climatic factors and the physiological indicators used were calculated. The result was a selection of physiological and biochemical indicators proposed to assess adaptability and included in methodical recommendations on diagnostics of the functional state of the studied cultures. Analysis of complex studies involving a large number of indicators is quite difficult and, in particular, does not allow the similarity of new varieties in their adaptive responses to adverse factors to be identified quickly, and therefore general requirements for cultivation conditions to be set. Use of cluster analysis presupposes that only quantitative data are analyzed; a set of variables used to assess the varieties must be defined (and the larger the sample, the more accurate the clustering will be), and the measure of similarity (or difference) between objects must be specified. It is shown that the identification of the diagnostic features subjected to statistical processing affects the accuracy of the variety classification. Selection resulting from the mono-cluster analysis (variety tea Kolhida; hazelnut Lombardsky red; variety kiwi Monty

  13. The reflection of hierarchical cluster analysis of co-occurrence matrices in SPSS

    NARCIS (Netherlands)

    Zhou, Q.; Leng, F.; Leydesdorff, L.

    2015-01-01

    Purpose: To discuss the problems arising from hierarchical cluster analysis of co-occurrence matrices in SPSS, and the corresponding solutions. Design/methodology/approach: We design different methods of using the SPSS hierarchical clustering module for co-occurrence matrices in order to compare

  14. Poisson cluster analysis of cardiac arrest incidence in Columbus, Ohio.

    Science.gov (United States)

    Warden, Craig; Cudnik, Michael T; Sasson, Comilla; Schwartz, Greg; Semple, Hugh

    2012-01-01

    Scarce resources in disease prevention and emergency medical services (EMS) need to be focused on high-risk areas of out-of-hospital cardiac arrest (OHCA). Cluster analysis using geographic information systems (GISs) was used to find these high-risk areas and test potential predictive variables. This was a retrospective cohort analysis of EMS-treated adults with OHCAs occurring in Columbus, Ohio, from April 1, 2004, through March 31, 2009. The OHCAs were aggregated to census tracts and incidence rates were calculated based on their adult populations. Poisson cluster analysis determined significant clusters of high-risk census tracts. Both census tract-level and case-level characteristics were tested for association with high-risk areas by multivariate logistic regression. A total of 2,037 eligible OHCAs occurred within the city limits during the study period. The mean incidence rate was 0.85 OHCAs/1,000 population/year. There were five significant geographic clusters with 76 high-risk census tracts out of the total of 245 census tracts. In the case-level analysis, being in a high-risk cluster was associated with a slightly younger age (-3 years, adjusted odds ratio [OR] 0.99, 95% confidence interval [CI] 0.99-1.00), not being white, non-Hispanic (OR 0.54, 95% CI 0.45-0.64), cardiac arrest occurring at home (OR 1.53, 95% CI 1.23-1.71), and not receiving bystander cardiopulmonary resuscitation (CPR) (OR 0.77, 95% CI 0.62-0.96), but with higher survival to hospital discharge (OR 1.78, 95% CI 1.30-2.46). In the census tract-level analysis, high-risk census tracts were also associated with a slightly lower average age (-0.1 years, OR 1.14, 95% CI 1.06-1.22) and a lower proportion of white, non-Hispanic patients (-0.298, OR 0.04, 95% CI 0.01-0.19), but also a lower proportion of high-school graduates (-0.184, OR 0.00, 95% CI 0.00-0.00). This analysis identified high-risk census tracts and associated census tract-level and case-level characteristics that can be used to

  15. Microneedle Platforms for Cell Analysis

    KAUST Repository

    Kavaldzhiev, Mincho

    2017-01-01

    to the development of micro-needle platforms that offer customized fabrication and new capabilities for enhanced cell analyses. The highest degree of geometrical flexibility is achieved with 3D printed micro-needles, which enable optimizing the topographical stress

  16. Extending the input–output energy balance methodology in agriculture through cluster analysis

    International Nuclear Information System (INIS)

    Bojacá, Carlos Ricardo; Casilimas, Héctor Albeiro; Gil, Rodrigo; Schrevens, Eddie

    2012-01-01

    The input–output balance methodology has been applied to characterize the energy balance of agricultural systems. This study proposes to extend this methodology with the inclusion of multivariate analysis to reveal particular patterns in the energy use of a system. The objective was to demonstrate the usefulness of multivariate exploratory techniques to analyze the variability found in a farming system and to establish efficiency categories that can be used to improve the energy balance of the system. For this purpose an input–output analysis was applied to the major greenhouse tomato production area in Colombia. Individual energy profiles were built and the k-means clustering method was applied to the production factors. On average, the production system in the study zone consumes 141.8 GJ ha⁻¹ to produce 96.4 GJ ha⁻¹, resulting in an energy efficiency of 0.68. With the k-means clustering analysis, three clusters of farmers were identified with energy efficiencies of 0.54, 0.67 and 0.78. The most energy efficient cluster grouped 56.3% of the farmers. It is possible to optimize the production system by improving the management practices of those with the lowest energy use efficiencies. Multivariate analysis techniques proved to be a complementary pathway to improve the energy efficiency of a system. -- Highlights: ► An input–output energy balance was estimated for greenhouse tomatoes in Colombia. ► We used the k-means clustering method to classify growers based on their energy use. ► Three clusters of growers were found with energy efficiencies of 0.54, 0.67 and 0.78. ► Overall system optimization is possible by improving the energy use of the less efficient.
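
A minimal sketch of the clustering step described above, with invented numbers: k-means is applied to per-farm energy-input profiles and the energy efficiency (energy output divided by energy input) of each resulting cluster is reported.

```python
# Minimal sketch with invented values: k-means on farm energy-input profiles,
# then per-cluster energy efficiency (output energy / input energy).
import numpy as np
from sklearn.cluster import KMeans

# Columns: hypothetical energy inputs per farm in GJ/ha (e.g. fertilizers, fuel, electricity).
inputs = np.array([
    [60, 45, 30], [70, 50, 35], [55, 40, 28], [90, 60, 40],
    [85, 58, 42], [50, 35, 25], [95, 65, 45], [65, 48, 32],
], dtype=float)
output_energy = np.array([92, 95, 90, 100, 98, 88, 102, 94], dtype=float)  # GJ/ha, invented

km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(inputs)
for c in range(3):
    mask = km.labels_ == c
    efficiency = output_energy[mask].sum() / inputs[mask].sum()
    print(f"cluster {c}: {mask.sum()} farms, energy efficiency {efficiency:.2f}")
```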

  17. A Historical Approach to Clustering in Emerging Economies

    DEFF Research Database (Denmark)

    Giacomin, Valeria

    of external factors. Indeed, researchers have explained clusters as self-contained entities and reduced their success to local exceptionality. In contrast, emerging literature has shown that clusters are integrated in broader structures beyond their location and are rather building blocks of today’s global...... economy. The working paper goes on to present two historical cases from the global south to explain how clusters work as major tools for international business. Particularly in the developing world, multinationals have used clusters as platforms for channeling foreign investment, knowledge, and imported...... inputs. The study concludes by stressing the importance of using historical evidence and data to look at clusters as agglomerations of actors and companies operating not just at the local level but across broader global networks. In doing so the historical perspective provides explanations lacking...

  18. Multi-ASIP Platform Synthesis for Real-Time Applications

    DEFF Research Database (Denmark)

    Micconi, Laura; Gangadharan, Deepak; Pop, Paul

    2013-01-01

    In this paper we are interested in deriving a distributed platform, composed of heterogeneous processing elements, targeted to applications that have strict timing constraints. We consider that the platform may use multiple Application Specific Instruction Set Processors (ASIPs). An ASIP...... is synthesized and tuned for a specific set of tasks (i.e., a task cluster). During design space exploration (DSE), we evaluate each platform solution visited in terms of its cost and performance, i.e., its ability to execute the applications such that they meet their timing constraints. To determine...... if the applications are schedulable, we have to know the worst-case execution time (WCET) of each task. However, we can determine the WCETs only after the ASIPs are synthesized, which is time consuming and therefore cannot be done during DSE. To address this circular dependency (the ASIPs depend on the task...

  19. TECHNOLOGICAL PLATFORMS IN RUSSIAN NANOINDUSTRY: PROBLEMS AND PROSPECTS OF DEVELOPMENT

    Directory of Open Access Journals (Sweden)

    Oleg Inshakov

    2017-12-01

    Full Text Available The article discloses the economic content of the technological platform (TP) as a system of transactional relations among its participants, which provides the most beneficial interactions on the basis of a common goal and unity of interests. A TP can generate clusters corresponding to its profile, based on common basic technologies. Reducing the meaning of TPs and clusters to the notion of an “instrument” of economic policy hinders the disclosure of their economic content, which becomes a relevant theoretical problem and a barrier to the effective use of TPs in the nanoindustry. The creation of nanotechnology platforms did not become an independent direction; however, RUSNANO became the coordinator of and a participant in 6 TPs. With RUSNANO's participation, 6 innovative clusters have been created, and 5 more within 10 years. Further development of the Russian nanoindustry cannot be related only to clustering at the macrolevel; it involves the horizontal integration of nanotechnologies into the country's various specific TPs and vertical integration into TPs and clusters at the scale of the EAEU (Eurasian Economic Union). The financing of nanotechnology projects in Eurasian TPs is among the most complicated problems of their functioning, given the insufficient pace of integration in the EAEU. The article argues for the feasibility of financing Eurasian nanoindustry TPs on the basis of a public-private partnership mechanism and the need to shift from its budgetary part to the extra-budgetary part as the “life cycle” of the TP develops and its institutional and organizational maturity increases. This will allow Russian and Eurasian TPs to realize their significant potential for resourcing the development of the domestic nanoindustry.

  20. Industry Platforms and New Industrial Policy in Russia

    Directory of Open Access Journals (Sweden)

    Svetlana V. Orekhova

    2017-12-01

    Full Text Available The article aims to clarify Russian industrial policy in light of changing business models and market reconstruction. The research is based on the hypothesis that the choice of administrative measures of industrial policy depends on the object of management. It is important to link government regulation of markets with corporate strategies. We show that modern economic systems are based on the use of electronic technologies, big data, and innovative activity. Technological platforms, as unified organizational and economic mechanisms, strongly affect economic systems. We study the content and main characteristics of the business model called the “technological platform”. We also identify the main changes required in industrial policy. We analyze the match between the scientific and technological scenarios of Russian economic development and the technological platform. There are two areas of the new industrial policy: a multi-sectoral approach to regulation and the improvement of the quality of the national institutional environment. The industrial and cluster management approaches are inefficient under modern conditions. Each platform needs its own technological development scenario. We also clarify the role of the state in the functioning of technological platforms.

  1. ACCURACY ANALYSIS OF A LOW-COST PLATFORM FOR POSITIONING AND NAVIGATION

    Directory of Open Access Journals (Sweden)

    S. Hofmann

    2012-07-01

    Full Text Available This paper presents an accuracy analysis of a platform based on low-cost components for landmark-based navigation intended for research and teaching purposes. The proposed platform includes a LEGO MINDSTORMS NXT 2.0 kit, an Android-based smartphone, and a compact Hokuyo URG-04LX laser scanner. The robot is used in a small indoor environment, where GNSS is not available. Therefore, a landmark map was produced in advance, with the landmark positions provided to the robot. All steps of the procedure to set up the platform are shown. The main focus of this paper is the achievable positioning accuracy, which was analyzed for this type of scenario as a function of the accuracy of the reference landmarks and the directional and distance measurement accuracy of the laser scanner. Several experiments were carried out, demonstrating the practically achievable positioning accuracy. To evaluate the accuracy, ground truth was acquired using a total station. These results are compared to the theoretically achievable accuracies and the laser scanner’s characteristics.

  2. The quantitative analysis of silicon carbide surface smoothing by Ar and Xe cluster ions

    Science.gov (United States)

    Ieshkin, A. E.; Kireev, D. S.; Ermakov, Yu. A.; Trifonov, A. S.; Presnov, D. E.; Garshev, A. V.; Anufriev, Yu. V.; Prokhorova, I. G.; Krupenin, V. A.; Chernysh, V. S.

    2018-04-01

    The gas cluster ion beam technique was used for smoothing the surface of a silicon carbide crystal. The effects of processing with two inert cluster ion species, argon and xenon, were quantitatively compared. While argon is a standard element for GCIB, results for xenon clusters had not been reported before. Scanning probe microscopy and high resolution transmission electron microscopy techniques were used for the analysis of the surface roughness and the quality of the surface crystal layer. Gas cluster ion beam processing smooths the surface relief down to an average roughness of about 1 nm for both elements. It was shown that xenon as the working gas is more effective: the sputtering rate for xenon clusters is 2.5 times higher than for argon at the same beam energy. High resolution transmission electron microscopy analysis of the surface defect layer gives values of 7 ± 2 nm and 8 ± 2 nm for treatment with argon and xenon clusters, respectively.

  3. Infrared dust emission from globular clusters

    International Nuclear Information System (INIS)

    Angeletti, L.; Capuzzo-Dolcetta, R.; Giannone, P.; Blanco, A.; Bussoletti, E.

    1982-01-01

    The implications of the presence of a central cloud in the cores of globular clusters were investigated recently. A possible mechanism of confinement of dust in the central region of our cluster models was also explored. The grain temperature and infrared emission have now been computed for rather realistic grain compositions. The grain components were assumed to be graphite and/or silicates. The central clouds turned out to be roughly isothermal. The wavelengths of maximum emission came out to be larger than 20 μm in all studied cases. An application of the theoretical results to five globular clusters showed that the predictable infrared emission for 47 Tuc, M4 and M22 should be detectable by means of present instrumentation aboard flying platforms. (author)

  4. Infrared dust emission from globular clusters

    Energy Technology Data Exchange (ETDEWEB)

    Angeletti, L; Capuzzo-Dolcetta, R; Giannone, P. (Rome Univ. (Italy). Osservatorio Astronomico); Blanco, A; Bussoletti, E [Lecce Univ. (Italy). Ist. di Fisica]

    1982-05-01

    The implications of the presence of a central cloud in the cores of globular clusters were investigated recently. A possible mechanism of confinement of dust in the central region of our cluster models was also explored. The grain temperature and infrared emission have now been computed for rather realistic grain compositions. The grain components were assumed to be graphite and/or silicates. The central clouds turned out to be roughly isothermal. The wavelengths of maximum emission came out to be larger than 20 μm in all studied cases. An application of the theoretical results to five globular clusters showed that the predictable infrared emission for 47 Tuc, M4 and M22 should be detectable by means of present instrumentation aboard flying platforms.

  5. An ESL Approach for Energy Consumption Analysis of Cache Memories in SoC Platforms

    Directory of Open Access Journals (Sweden)

    Abel G. Silva-Filho

    2011-01-01

    Full Text Available The design of complex circuits such as SoCs presents two great challenges to designers. One is speeding up the modeling of system functionality; the second is implementing the system in an architecture that meets performance and power consumption requirements. Thus, developing new high-level specification mechanisms that reduce the design effort and allow automatic architecture exploration is a necessity. This paper proposes an Electronic-System-Level (ESL) approach for system modeling and cache energy consumption analysis of SoCs called PCacheEnergyAnalyzer. It takes as input a high-level UML-2.0 profile model of the system and generates a simulation model of a multicore platform that can be analyzed for cache tuning. PCacheEnergyAnalyzer performs static/dynamic energy consumption analysis of caches on platforms that may have different processors. Architecture exploration is achieved by letting designers choose different processors for platform generation and different mechanisms for cache optimization. PCacheEnergyAnalyzer has been validated with several applications of the Mibench, Mediabench, and PowerStone benchmarks, and results show that it provides analysis with reduced simulation effort.

  6. An integrated platform for biomolecule interaction analysis

    Science.gov (United States)

    Jan, Chia-Ming; Tsai, Pei-I.; Chou, Shin-Ting; Lee, Shu-Sheng; Lee, Chih-Kung

    2013-02-01

    We developed a new metrology platform which can detect real-time changes in both a phase-interrogation mode and an intensity mode of SPR (surface plasmon resonance). We integrated an SPR sensor and an ellipsometer into a biosensor chip platform to create a new biomolecular interaction measurement mechanism. We added a conductive ITO (indium tin oxide) film to the biosensor platform chip to expand the dynamic range and improve measurement accuracy. The thickness of the conductive film and the suitable voltage constants were found to enhance performance. A circularly polarized ellipsometry configuration was incorporated into the newly developed platform to measure the label-free interactions of recombinant human C-reactive protein (CRP) with an immobilized biomolecular target, monoclonal human CRP antibody, at various concentrations. CRP was chosen as it is a cardiovascular risk biomarker and an acute phase reactant as well as a specific prognostic indicator for inflammation. We found that the sensitivity of phase-interrogation SPR is predominantly dependent on the optimization of the sample incidence angle. The effect of DC and AC driving on the effective index of the ITO layer, as well as the optimal modulation, was experimentally investigated and discussed. Our experimental results showed that the modulated dynamic range for phase detection was 10E-2 RIU based on a current effect and 10E-4 RIU based on a potential effect, while a value of 0.55 °/RIU was obtained by angular interrogation. The performance of our newly developed metrology platform was characterized as having a higher sensitivity and a smaller dynamic range when compared to a traditional full-field measurement system.

  7. Multisource Images Analysis Using Collaborative Clustering

    Directory of Open Access Journals (Sweden)

    Pierre Gançarski

    2008-04-01

    Full Text Available The development of very high-resolution (VHR) satellite imagery has produced a huge amount of data. The multiplication of satellites carrying different types of sensors provides many heterogeneous images. Consequently, the image analyst often has many different images available, representing the same area of the Earth surface. These images can be from different dates, produced by different sensors, or even at different resolutions. The lack of machine learning tools using all these representations in an overall process constrains the analyst to a sequential analysis of these various images. In order to use all the information available simultaneously, we propose a framework where different algorithms can use different views of the scene. Each one works on a different remotely sensed image and, thus, produces different and useful information. These algorithms work together in a collaborative way through an automatic and mutual refinement of their results, so that all the results have almost the same number of clusters, which are statistically similar. Finally, a unique result is produced, representing a consensus among the information obtained by each clustering method on its own image. The unified result and the complementarity of the single results (i.e., the agreement between the clustering methods as well as the disagreement) lead to a better understanding of the scene. The experiments carried out on multispectral remote sensing images have shown that this method is efficient to extract relevant information and to improve the scene understanding.

  8. Sensory over responsivity and obsessive compulsive symptoms: A cluster analysis.

    Science.gov (United States)

    Ben-Sasson, Ayelet; Podoly, Tamar Yonit

    2017-02-01

    Several studies have examined the sensory component in Obsessive Compulsive Disorder (OCD) and described an OCD subtype with a unique profile, in which Sensory Phenomena (SP) are a significant component. SP has some commonalities with Sensory Over Responsivity (SOR) and might in part be a characteristic of this subtype. Although some studies have examined SOR and its relation to Obsessive Compulsive Symptoms (OCS), the literature lacks sufficient data on this interplay. Our first aim was to further examine the correlations between OCS and SOR and to explore the correlations between SOR modalities (i.e. smell, touch, etc.) and OCS subscales (i.e. washing, ordering, etc.). The second was to investigate the cluster structure of SOR and OCS dimensions in adults, that is, to classify the sample using the sensory scores to find whether a sensory OCD subtype can be specified. Our third goal was to explore the psychometric features of a new sensory questionnaire: the Sensory Perception Quotient (SPQ). A sample of non-clinical adults (n=350) was recruited via e-mail, social media and social networks. Participants completed questionnaires measuring SOR, OCS, and anxiety. SOR and OCI-F scores were moderately and significantly correlated (n=274); significant correlations between all SOR modalities and OCS subscales were found, with no specific modality showing a higher correlation with any particular OCS subscale. Cluster analysis revealed four distinct clusters: (1) No OC and SOR symptoms (NONE; n=100), (2) High OC and SOR symptoms (BOTH; n=28), (3) Moderate OC symptoms (OCS; n=63), (4) Moderate SOR symptoms (SOR; n=83). The BOTH cluster had significantly higher anxiety levels than the other clusters and shared OC subscale scores with the OCS cluster. The BOTH cluster also reported higher SOR scores across the tactile, vision, taste and olfactory modalities. The SPQ was found to be reliable and suitable for detecting SOR, and the sample SPQ scores were normally distributed (n=350). SOR is a

  9. Mental State Talk Structure in Children’s Narratives: A Cluster Analysis

    Directory of Open Access Journals (Sweden)

    Giuliana Pinto

    2017-01-01

    Full Text Available This study analysed children’s Theory of Mind (ToM) as assessed by mental state talk in oral narratives. We hypothesized that the children’s mental state talk in narratives has an underlying structure, with specific terms organized in clusters. Ninety-eight children attending the last year of kindergarten were asked to tell a story twice, at the beginning and at the end of the school year. Mental state talk was analysed by identifying terms and expressions referring to perceptual, physiological, emotional, willingness, cognitive, moral, and sociorelational states. The cluster analysis showed that children’s mental state talk is organized in two main clusters: perceptual states and affective states. Results from the study confirm the feasibility of narratives as a means of investigating mental state talk and offer a more fine-grained analysis of mental state talk structure.

  10. The Assessment of Hydrogen Energy Systems for Fuel Cell Vehicles Using Principal Component Analysis and Cluster Analysis

    DEFF Research Database (Denmark)

    Ren, Jingzheng; Tan, Shiyu; Dong, Lichun

    2012-01-01

    and analysis of the hydrogen systems is meaningful for decision makers to select the best scenario. principal component analysis (PCA) has been used to evaluate the integrated performance of different hydrogen energy systems and select the best scenario, and hierarchical cluster analysis (CA) has been used...... for transportation of hydrogen, hydrogen gas tank for the storage of hydrogen at refueling stations, and gaseous hydrogen as power energy for fuel cell vehicles has been recognized as the best scenario. Also, the clustering results calculated by CA are consistent with those determined by PCA, denoting...

  11. Common Factor Analysis Versus Principal Component Analysis: Choice for Symptom Cluster Research

    Directory of Open Access Journals (Sweden)

    Hee-Ju Kim, PhD, RN

    2008-03-01

    Conclusion: If the study purpose is to explain correlations among variables and to examine the structure of the data (this is usual for most cases in symptom cluster research), CFA provides a more accurate result. If the purpose of a study is to summarize data with a smaller number of variables, PCA is the choice. PCA can also be used as an initial step in CFA because it provides information regarding the maximum number and nature of factors. In using factor analysis for symptom cluster research, several issues need to be considered, including subjectivity of solution, sample size, symptom selection, and level of measurement.

  12. Outcome-Driven Cluster Analysis with Application to Microarray Data.

    Directory of Open Access Journals (Sweden)

    Jessie J Hsu

    Full Text Available One goal of cluster analysis is to sort characteristics into groups (clusters) so that those in the same group are more highly correlated to each other than they are to those in other groups. An example is the search for groups of genes whose expression of RNA is correlated in a population of patients. These genes would be of greater interest if their common level of RNA expression were additionally predictive of the clinical outcome. This issue arose in the context of a study of trauma patients on whom RNA samples were available. The question of interest was whether there were groups of genes that were behaving similarly, and whether each gene in the cluster would have a similar effect on who would recover. For this, we develop an algorithm to simultaneously assign characteristics (genes) into groups of highly correlated genes that have the same effect on the outcome (recovery). We propose a random effects model where the genes within each group (cluster) equal the sum of a random effect, specific to the observation and cluster, and an independent error term. The outcome variable is a linear combination of the random effects of each cluster. To fit the model, we implement a Markov chain Monte Carlo algorithm based on the likelihood of the observed data. We evaluate the effect of including outcome in the model through simulation studies and describe a strategy for prediction. These methods are applied to trauma data from the Inflammation and Host Response to Injury research program, revealing a clustering of the genes that are informed by the recovery outcome.
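
The data-generating model described above can be simulated in a few lines; the sketch below uses invented cluster sizes, effect sizes and noise levels, and does not reproduce the paper's Markov chain Monte Carlo fitting procedure.

```python
# Simulation sketch of the random-effects model described above (invented parameters;
# the MCMC fitting step is not reproduced).
import numpy as np

rng = np.random.default_rng(0)
n_patients, genes_per_cluster, n_clusters = 100, 5, 3
beta = np.array([1.5, -2.0, 0.0])          # outcome weights of each cluster's random effect

# b[i, c]: random effect specific to patient i and cluster c
b = rng.normal(size=(n_patients, n_clusters))

# Each gene in cluster c equals that patient's cluster effect plus independent noise.
expression = np.concatenate(
    [b[:, [c]] + 0.5 * rng.normal(size=(n_patients, genes_per_cluster))
     for c in range(n_clusters)], axis=1)

# The outcome is a linear combination of the cluster random effects plus noise.
outcome = b @ beta + 0.3 * rng.normal(size=n_patients)

# Sanity check: genes within a cluster correlate with the outcome as beta predicts.
print(np.round([np.corrcoef(expression[:, c * genes_per_cluster], outcome)[0, 1]
                for c in range(n_clusters)], 2))
```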

  13. Use of the NetBeans Platform for NASA Robotic Conjunction Assessment Risk Analysis

    Science.gov (United States)

    Sabey, Nickolas J.

    2014-01-01

    The latest Java and JavaFX technologies are very attractive software platforms for customers involved in space mission operations such as those of NASA and the US Air Force. For NASA Robotic Conjunction Assessment Risk Analysis (CARA), the NetBeans platform provided an environment in which scalable software solutions could be developed quickly and efficiently. Both Java 8 and the NetBeans platform are in the process of simplifying CARA development in secure environments by providing a significant amount of capability in a single accredited package, where accreditation alone can account for 6-8 months for each library or software application. Capabilities either in use or being investigated by CARA include: 2D and 3D displays with JavaFX, parallelization with the new Streams API, and scalability through the NetBeans plugin architecture.

  14. Detection of secondary structure elements in proteins by hydrophobic cluster analysis.

    Science.gov (United States)

    Woodcock, S; Mornon, J P; Henrissat, B

    1992-10-01

    Hydrophobic cluster analysis (HCA) is a protein sequence comparison method based on alpha-helical representations of the sequences where the size, shape and orientation of the clusters of hydrophobic residues are primarily compared. The effectiveness of HCA has been suggested to originate from its potential ability to focus on the residues forming the hydrophobic core of globular proteins. We have addressed the robustness of the bidimensional representation used for HCA in its ability to detect the regular secondary structure elements of proteins. Various parameters have been studied such as those governing cluster size and limits, the hydrophobic residues constituting the clusters as well as the potential shift of the cluster positions with respect to the position of the regular secondary structure elements. The following results have been found to support the alpha-helical bidimensional representation used in HCA: (i) there is a positive correlation (clearly above background noise) between the hydrophobic clusters and the regular secondary structure elements in proteins; (ii) the hydrophobic clusters are centred on the regular secondary structure elements; (iii) the pitch of the helical representation which gives the best correspondence is that of an alpha-helix. The correspondence between hydrophobic clusters and regular secondary structure elements suggests a way to implement variable gap penalties during the automatic alignment of protein sequences.
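
The sketch below is a deliberately simplified illustration of the idea, not the full HCA method: hydrophobic residues (V, I, L, F, M, Y, W) are grouped into clusters by treating positions that differ by 1, 3 or 4 as neighbours, which roughly mimics adjacency on an alpha-helical net with about 3.6 residues per turn. The example sequence is invented.

```python
# Simplified sketch inspired by HCA: group hydrophobic residues into clusters using
# a 1/3/4-position neighbourhood rule that approximates alpha-helical adjacency.
HYDROPHOBIC = set("VILFMYW")

def hydrophobic_clusters(seq, connect=(1, 3, 4)):
    positions = [i for i, aa in enumerate(seq) if aa in HYDROPHOBIC]
    clusters, current = [], []
    for pos in positions:
        if current and (pos - current[-1]) in connect:
            current.append(pos)          # still within the same cluster
        else:
            if current:
                clusters.append(current) # close the previous cluster
            current = [pos]
    if current:
        clusters.append(current)
    return clusters

seq = "MKVLAAGIVQWLETFAKDMSPQR"   # hypothetical sequence
for cl in hydrophobic_clusters(seq):
    print([seq[i] for i in cl], cl)
```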

  15. A Dimensionality Reduction-Based Multi-Step Clustering Method for Robust Vessel Trajectory Analysis

    Directory of Open Access Journals (Sweden)

    Huanhuan Li

    2017-08-01

    Full Text Available The Shipboard Automatic Identification System (AIS) is crucial for navigation safety and maritime surveillance; data mining and pattern analysis of AIS information have attracted considerable attention in terms of both basic research and practical applications. Clustering of spatio-temporal AIS trajectories can be used to identify abnormal patterns and mine customary route data for transportation safety. Thus, the capacities of navigation safety and maritime traffic monitoring could be enhanced correspondingly. However, trajectory clustering is often sensitive to undesirable outliers and is essentially more complex compared with traditional point clustering. To overcome this limitation, a multi-step trajectory clustering method is proposed in this paper for robust AIS trajectory clustering. In particular, Dynamic Time Warping (DTW), a similarity measurement method, is introduced in the first step to measure the distances between different trajectories. The calculated distances, inversely proportional to the similarities, constitute a distance matrix in the second step. Furthermore, as a widely-used dimensionality reduction method, Principal Component Analysis (PCA) is exploited to decompose the obtained distance matrix. In particular, the top k principal components with a cumulative contribution rate above 95% are extracted by PCA, and the number of centers k is chosen. The k centers are found by the improved automatic center selection algorithm. In the last step, the improved center clustering algorithm with k clusters is implemented on the distance matrix to achieve the final AIS trajectory clustering results. In order to improve the accuracy of the proposed multi-step clustering algorithm, an automatic algorithm for choosing the k clusters is developed according to the similarity distance. Numerous experiments on realistic AIS trajectory datasets in the bridge area waterway and Mississippi River have been implemented to compare our

  16. A Dimensionality Reduction-Based Multi-Step Clustering Method for Robust Vessel Trajectory Analysis.

    Science.gov (United States)

    Li, Huanhuan; Liu, Jingxian; Liu, Ryan Wen; Xiong, Naixue; Wu, Kefeng; Kim, Tai-Hoon

    2017-08-04

    The Shipboard Automatic Identification System (AIS) is crucial for navigation safety and maritime surveillance; data mining and pattern analysis of AIS information have attracted considerable attention in terms of both basic research and practical applications. Clustering of spatio-temporal AIS trajectories can be used to identify abnormal patterns and mine customary route data for transportation safety. Thus, the capacities of navigation safety and maritime traffic monitoring could be enhanced correspondingly. However, trajectory clustering is often sensitive to undesirable outliers and is essentially more complex compared with traditional point clustering. To overcome this limitation, a multi-step trajectory clustering method is proposed in this paper for robust AIS trajectory clustering. In particular, Dynamic Time Warping (DTW), a similarity measurement method, is introduced in the first step to measure the distances between different trajectories. The calculated distances, inversely proportional to the similarities, constitute a distance matrix in the second step. Furthermore, as a widely-used dimensionality reduction method, Principal Component Analysis (PCA) is exploited to decompose the obtained distance matrix. In particular, the top k principal components with a cumulative contribution rate above 95% are extracted by PCA, and the number of centers k is chosen. The k centers are found by the improved automatic center selection algorithm. In the last step, the improved center clustering algorithm with k clusters is implemented on the distance matrix to achieve the final AIS trajectory clustering results. In order to improve the accuracy of the proposed multi-step clustering algorithm, an automatic algorithm for choosing the k clusters is developed according to the similarity distance. Numerous experiments on realistic AIS trajectory datasets in the bridge area waterway and Mississippi River have been implemented to compare our proposed method with
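
A minimal sketch of the multi-step pipeline described in the two records above, using invented one-dimensional "trajectories": pairwise DTW distances form a distance matrix, PCA on that matrix suggests k as the number of components reaching a 95% cumulative contribution rate, and plain k-means stands in for the paper's improved centre-selection and clustering algorithm.

```python
# Minimal sketch of the DTW -> distance matrix -> PCA -> clustering pipeline.
# Toy data; k-means replaces the paper's improved clustering algorithm.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

def dtw(a, b):
    """Classic O(len(a)*len(b)) dynamic time warping distance."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

rng = np.random.default_rng(1)
trajectories = [np.sin(np.linspace(0, 3, 30)) + 0.05 * rng.normal(size=30) for _ in range(5)] + \
               [np.linspace(0, 1, 30) + 0.05 * rng.normal(size=30) for _ in range(5)]

# Steps 1-2: pairwise DTW distances form the distance matrix.
n = len(trajectories)
dist = np.zeros((n, n))
for i in range(n):
    for j in range(i + 1, n):
        dist[i, j] = dist[j, i] = dtw(trajectories[i], trajectories[j])

# Step 3: PCA on the distance matrix; k = number of components reaching 95% cumulative variance.
pca = PCA().fit(dist)
k = int(np.searchsorted(np.cumsum(pca.explained_variance_ratio_), 0.95) + 1)

# Step 4: cluster the distance-matrix rows into k groups (k-means as a stand-in).
labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(dist)
print(k, labels)
```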

  17. Control of entanglement transitions in quantum spin clusters

    Science.gov (United States)

    Irons, Hannah R.; Quintanilla, Jorge; Perring, Toby G.; Amico, Luigi; Aeppli, Gabriel

    2017-12-01

    Quantum spin clusters provide a platform for the experimental study of many-body entanglement. Here we address a simple model of a single-molecule nanomagnet featuring N interacting spins in a transverse field. The field can control an entanglement transition (ET). We calculate the magnetization, low-energy gap, and neutron-scattering cross section and find that the ET has distinct signatures, detectable at temperatures as high as 5% of the interaction strength. The signatures are stronger for smaller clusters.
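
For illustration only, the sketch below exactly diagonalizes a small spin cluster assuming a transverse-field Ising form (the specific single-molecule Hamiltonian studied in the record may differ) and prints the low-energy gap and the ground-state transverse magnetization as the field is varied.

```python
# Illustrative exact diagonalization of a small spin cluster in a transverse field.
# A transverse-field Ising chain is assumed for illustration only.
import numpy as np

sx = 0.5 * np.array([[0, 1], [1, 0]], dtype=float)
sz = 0.5 * np.array([[1, 0], [0, -1]], dtype=float)
I2 = np.eye(2)

def site_op(op, i, N):
    """Embed a single-site operator at site i of an N-spin cluster."""
    mats = [I2] * N
    mats[i] = op
    out = mats[0]
    for m in mats[1:]:
        out = np.kron(out, m)
    return out

def hamiltonian(N, J, h):
    H = np.zeros((2 ** N, 2 ** N))
    for i in range(N - 1):                       # open chain of N spins
        H -= J * site_op(sz, i, N) @ site_op(sz, i + 1, N)
    for i in range(N):                           # transverse field
        H -= h * site_op(sx, i, N)
    return H

N, J = 4, 1.0
Mx = sum(site_op(sx, i, N) for i in range(N)) / N
for h in (0.1, 0.5, 1.0, 2.0):
    evals, evecs = np.linalg.eigh(hamiltonian(N, J, h))
    gap = evals[1] - evals[0]
    mx = evecs[:, 0] @ Mx @ evecs[:, 0]          # ground-state <Sx> per spin
    print(f"h={h:.1f}  gap={gap:.3f}  <Sx>={mx:.3f}")
```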

  18. Person mobility in the design and analysis of cluster-randomized cohort prevention trials.

    Science.gov (United States)

    Vuchinich, Sam; Flay, Brian R; Aber, Lawrence; Bickman, Leonard

    2012-06-01

    Person mobility is an inescapable fact of life for most cluster-randomized (e.g., schools, hospitals, clinic, cities, state) cohort prevention trials. Mobility rates are an important substantive consideration in estimating the effects of an intervention. In cluster-randomized trials, mobility rates are often correlated with ethnicity, poverty and other variables associated with disparity. This raises the possibility that estimated intervention effects may generalize to only the least mobile segments of a population and, thus, create a threat to external validity. Such mobility can also create threats to the internal validity of conclusions from randomized trials. Researchers must decide how to deal with persons who leave study clusters during a trial (dropouts), persons and clusters that do not comply with an assigned intervention, and persons who enter clusters during a trial (late entrants), in addition to the persons who remain for the duration of a trial (stayers). Statistical techniques alone cannot solve the key issues of internal and external validity raised by the phenomenon of person mobility. This commentary presents a systematic, Campbellian-type analysis of person mobility in cluster-randomized cohort prevention trials. It describes four approaches for dealing with dropouts, late entrants and stayers with respect to data collection, analysis and generalizability. The questions at issue are: 1) From whom should data be collected at each wave of data collection? 2) Which cases should be included in the analyses of an intervention effect? and 3) To what populations can trial results be generalized? The conclusions lead to recommendations for the design and analysis of future cluster-randomized cohort prevention trials.

  19. Characterizing Suicide in Toronto: An Observational Study and Cluster Analysis

    Science.gov (United States)

    Sinyor, Mark; Schaffer, Ayal; Streiner, David L

    2014-01-01

    Objective: To determine whether people who have died from suicide in a large epidemiologic sample form clusters based on demographic, clinical, and psychosocial factors. Method: We conducted a coroner’s chart review for 2886 people who died in Toronto, Ontario, from 1998 to 2010, and whose death was ruled as suicide by the Office of the Chief Coroner of Ontario. A cluster analysis using known suicide risk factors was performed to determine whether suicide deaths separate into distinct groups. Clusters were compared according to person- and suicide-specific factors. Results: Five clusters emerged. Cluster 1 had the highest proportion of females and nonviolent methods, and all had depression and a past suicide attempt. Cluster 2 had the highest proportion of people with a recent stressor and violent suicide methods, and all were married. Cluster 3 had mostly males between the ages of 20 and 64, and all had either experienced recent stressors, suffered from mental illness, or had a history of substance abuse. Cluster 4 had the youngest people and the highest proportion of deaths by jumping from height, few were married, and nearly one-half had bipolar disorder or schizophrenia. Cluster 5 had all unmarried people with no prior suicide attempts, and were the least likely to have an identified mental illness and most likely to leave a suicide note. Conclusions: People who die from suicide assort into different patterns of demographic, clinical, and death-specific characteristics. Identifying and studying subgroups of suicides may advance our understanding of the heterogeneous nature of suicide and help to inform development of more targeted suicide prevention strategies. PMID:24444321

  20. FLOCK cluster analysis of mast cell event clustering by high-sensitivity flow cytometry predicts systemic mastocytosis.

    Science.gov (United States)

    Dorfman, David M; LaPlante, Charlotte D; Pozdnyakova, Olga; Li, Betty

    2015-11-01

    In our high-sensitivity flow cytometric approach for systemic mastocytosis (SM), we identified mast cell event clustering as a new diagnostic criterion for the disease. To objectively characterize mast cell gated event distributions, we performed cluster analysis using FLOCK, a computational approach to identify cell subsets in multidimensional flow cytometry data in an unbiased, automated fashion. FLOCK identified discrete mast cell populations in most cases of SM (56/75 [75%]) but only a minority of non-SM cases (17/124 [14%]). FLOCK-identified mast cell populations accounted for 2.46% of total cells on average in SM cases and 0.09% of total cells on average in non-SM cases (P < .0001) and were predictive of SM, with a sensitivity of 75%, a specificity of 86%, a positive predictive value of 76%, and a negative predictive value of 85%. FLOCK analysis provides useful diagnostic information for evaluating patients with suspected SM, and may be useful for the analysis of other hematopoietic neoplasms. Copyright© by the American Society for Clinical Pathology.
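
The diagnostic metrics quoted above follow directly from the reported counts (discrete mast-cell populations in 56 of 75 SM cases and 17 of 124 non-SM cases); the short check below reproduces them up to rounding.

```python
# Reproduce the reported diagnostic metrics from the stated counts.
tp, fn = 56, 75 - 56       # SM cases with / without a FLOCK-identified population
fp, tn = 17, 124 - 17      # non-SM cases with / without one

sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
ppv = tp / (tp + fp)
npv = tn / (tn + fn)
# Matches the abstract's 75%, 86%, 76% and 85% up to rounding.
print(f"sensitivity={sensitivity:.1%} specificity={specificity:.1%} "
      f"PPV={ppv:.1%} NPV={npv:.1%}")
```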

  1. Cluster Analysis of International Information and Social Development.

    Science.gov (United States)

    Lau, Jesus

    1990-01-01

    Analyzes information activities in relation to socioeconomic characteristics in low, middle, and highly developed economies for the years 1960 and 1977 through the use of cluster analysis. Results of data from 31 countries suggest that information development is achieved mainly by countries that have also achieved social development. (26…

  2. Transcriptional analysis of ESAT-6 cluster 3 in Mycobacterium smegmatis

    Directory of Open Access Journals (Sweden)

    Riccardi Giovanna

    2009-03-01

    Full Text Available Abstract Background The ESAT-6 (early secreted antigenic target, 6 kDa) family collects small mycobacterial proteins secreted by Mycobacterium tuberculosis, particularly in the early phase of growth. There are 23 ESAT-6 family members in M. tuberculosis H37Rv. In a previous work, we identified the Zur-dependent regulation of five proteins of the ESAT-6/CFP-10 family (esxG, esxH, esxQ, esxR, and esxS). esxG and esxH are part of ESAT-6 cluster 3, whose expression was already known to be induced by iron starvation. Results In this research, we performed EMSA experiments and transcriptional analysis of ESAT-6 cluster 3 in Mycobacterium smegmatis (msmeg0615-msmeg0625) and M. tuberculosis. In contrast to what we had observed in M. tuberculosis, we found that in M. smegmatis ESAT-6 cluster 3 responds only to iron and not to zinc. In both organisms we identified an internal promoter, a finding which suggests the presence of two transcriptional units and, consequently, a differential expression of cluster 3 genes. We compared the expression of msmeg0615 and msmeg0620 in different growth and stress conditions by means of relative quantitative PCR. The expression of the msmeg0615 and msmeg0620 genes was essentially similar; they appeared to be repressed in most of the tested conditions, with the exception of acid stress (pH 4.2), where msmeg0615 was induced about 4-fold, while msmeg0620 was repressed. The analysis revealed that under acid stress conditions the M. tuberculosis rv0282 gene was also induced about 3-fold, while rv0287 induction was almost insignificant. Conclusion In contrast with what has been reported for M. tuberculosis, our results suggest that in M. smegmatis only IdeR-dependent regulation is retained, while zinc has no effect on gene expression. The role of cluster 3 in M. tuberculosis virulence is still to be defined; however, iron- and zinc-dependent expression strongly suggests that cluster 3 is highly expressed in the infective process, and that the cluster
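
Relative quantitative PCR fold changes such as the ~4-fold induction quoted above are commonly computed with the 2^-ΔΔCt (Livak) method; the sketch below assumes that convention and uses hypothetical Ct values, so it illustrates the arithmetic rather than the study's exact procedure.

```python
# Livak 2^-ddCt fold-change arithmetic with hypothetical Ct values.
def fold_change(ct_target_treated, ct_ref_treated, ct_target_control, ct_ref_control):
    d_ct_treated = ct_target_treated - ct_ref_treated
    d_ct_control = ct_target_control - ct_ref_control
    return 2.0 ** -(d_ct_treated - d_ct_control)

# Hypothetical example: target gene vs a reference gene, acid stress vs control.
print(round(fold_change(22.0, 18.0, 24.0, 18.0), 2))  # -> 4.0, i.e. a 4-fold induction
```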

  3. Platforms for Single-Cell Collection and Analysis

    Directory of Open Access Journals (Sweden)

    Lukas Valihrach

    2018-03-01

    Full Text Available Single-cell analysis has become an established method to study cell heterogeneity and for rare cell characterization. Despite the high cost and technical constraints, applications are increasing every year in all fields of biology. Following the trend, there is a tremendous development of tools for single-cell analysis, especially in the RNA sequencing field. Every improvement increases sensitivity and throughput. Collecting a large amount of data also stimulates the development of new approaches for bioinformatic analysis and interpretation. However, the essential requirement for any analysis is the collection of single cells of high quality. The single-cell isolation must be fast, effective, and gentle to maintain the native expression profiles. Classical methods for single-cell isolation are micromanipulation, microdissection, and fluorescence-activated cell sorting (FACS). In the last decade several new and highly efficient approaches have been developed, which not just supplement but may fully replace the traditional ones. These new techniques are based on microfluidic chips, droplets, micro-well plates, and automatic collection of cells using capillaries, magnets, an electric field, or a punching probe. In this review we summarize the current methods and developments in this field. We discuss the advantages of the different commercially available platforms and their applicability, and also provide remarks on future developments.

  4. Platforms for Single-Cell Collection and Analysis.

    Science.gov (United States)

    Valihrach, Lukas; Androvic, Peter; Kubista, Mikael

    2018-03-11

    Single-cell analysis has become an established method to study cell heterogeneity and for rare cell characterization. Despite the high cost and technical constraints, applications are increasing every year in all fields of biology. Following the trend, there is a tremendous development of tools for single-cell analysis, especially in the RNA sequencing field. Every improvement increases sensitivity and throughput. Collecting a large amount of data also stimulates the development of new approaches for bioinformatic analysis and interpretation. However, the essential requirement for any analysis is the collection of single cells of high quality. The single-cell isolation must be fast, effective, and gentle to maintain the native expression profiles. Classical methods for single-cell isolation are micromanipulation, microdissection, and fluorescence-activated cell sorting (FACS). In the last decade several new and highly efficient approaches have been developed, which not just supplement but may fully replace the traditional ones. These new techniques are based on microfluidic chips, droplets, micro-well plates, and automatic collection of cells using capillaries, magnets, an electric field, or a punching probe. In this review we summarize the current methods and developments in this field. We discuss the advantages of the different commercially available platforms and their applicability, and also provide remarks on future developments.

  5. The Flemish frozen-vegetable industry as an example of cluster analysis : Flanders Vegetable Valley

    NARCIS (Netherlands)

    Vanhaverbeke, W.P.M.; Larosse, J.; Winnen, W.; Hulsink, W.; Dons, J.J.M.

    2008-01-01

    In this contribution we present a strategic analysis of the cluster dynamics in the frozen-vegetable industry in Flanders (Belgium). The main purpose of this case is twofold. First, we determine the added value of using data about customer and supplier relationships in cluster analysis. Second, we

  6. Fiji: an open-source platform for biological-image analysis.

    Science.gov (United States)

    Schindelin, Johannes; Arganda-Carreras, Ignacio; Frise, Erwin; Kaynig, Verena; Longair, Mark; Pietzsch, Tobias; Preibisch, Stephan; Rueden, Curtis; Saalfeld, Stephan; Schmid, Benjamin; Tinevez, Jean-Yves; White, Daniel James; Hartenstein, Volker; Eliceiri, Kevin; Tomancak, Pavel; Cardona, Albert

    2012-06-28

    Fiji is a distribution of the popular open-source software ImageJ focused on biological-image analysis. Fiji uses modern software engineering practices to combine powerful software libraries with a broad range of scripting languages to enable rapid prototyping of image-processing algorithms. Fiji facilitates the transformation of new algorithms into ImageJ plugins that can be shared with end users through an integrated update system. We propose Fiji as a platform for productive collaboration between computer science and biology research communities.

  7. Resource Planning for SPARQL Query Execution on Data Sharing Platforms

    DEFF Research Database (Denmark)

    Hagedorn, Stefan; Hose, Katja; Sattler, Kai-Uwe

    2014-01-01

    To increase performance, data sharing platforms often make use of clusters of nodes where certain tasks can be executed in parallel. Resource planning and especially deciding how many processors should be chosen to exploit parallel processing is complex in such a setup as increasing the number...

  8. Factor-cluster analysis and enrichment study of Mangrove sediments - An example from Mengkabong, Sabah

    International Nuclear Information System (INIS)

    Praveena, S.M.; Ahmed, A.; Radojevic, M.; Mohd Harun Abdullah; Aris, A.Z.

    2007-01-01

    This paper examines the tidal effects in the sediment of the Mengkabong mangrove forest, Sabah. Generally, all the studied parameters showed higher values at high tide than at low tide. Factor-cluster analyses were adopted to allow the identification of controlling factors at high and low tides. Factor analysis extracted six controlling factors at high tide and seven controlling factors at low tide. Cluster analysis extracted two distinct clusters at high and low tides. The study showed that the factor-cluster analysis application is a useful tool to single out the controlling factors at high and low tides. This will provide a basis for describing the tidal effects in the mangrove sediment. The salinity and electrical conductivity clusters, as well as the component loadings at high and low tide, explain the tidal process, in which a high contribution of seawater to the mangrove sediments controls the sediment chemistry. The geoaccumulation index (Igeo) values suggest that the mangrove sediments have background concentrations of Al, Cu, Fe and Zn and are unpolluted with respect to Pb. (author)
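
The geoaccumulation index mentioned above is conventionally computed as Igeo = log2(Cn / (1.5 · Bn)), where Cn is the measured concentration and Bn the geochemical background; values at or below zero correspond to background/unpolluted sediments. The sketch below assumes that formula and uses invented concentrations, not the study's data.

```python
# Geoaccumulation index sketch (Müller's Igeo) with invented concentrations.
import math

def igeo(measured, background):
    return math.log2(measured / (1.5 * background))

# (measured, background) in mg/kg, invented values
samples = {"Al": (52000, 80000), "Cu": (18, 45), "Fe": (30000, 47000),
           "Zn": (60, 95), "Pb": (12, 20)}
for metal, (c, b) in samples.items():
    val = igeo(c, b)
    status = "background/unpolluted" if val <= 0 else "polluted"
    print(f"{metal}: Igeo = {val:.2f} ({status})")
```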

  9. A Short Survey on the State of the Art in Architectures and Platforms for Large Scale Data Analysis and Knowledge Discovery from Data

    Energy Technology Data Exchange (ETDEWEB)

    Begoli, Edmon [ORNL

    2012-01-01

    Intended as a survey for practicing architects and researchers seeking an overview of the state-of-the-art architectures for data analysis, this paper provides an overview of the emerging data management and analytic platforms including parallel databases, Hadoop-based systems, High Performance Computing (HPC) platforms and platforms popularly referred to as NoSQL platforms. Platforms are presented based on their relevance, analysis they support and the data organization model they support.

  10. A novel multiplex bead-based platform highlights the diversity of extracellular vesicles.

    Science.gov (United States)

    Koliha, Nina; Wiencek, Yvonne; Heider, Ute; Jüngst, Christian; Kladt, Nikolay; Krauthäuser, Susanne; Johnston, Ian C D; Bosio, Andreas; Schauss, Astrid; Wild, Stefan

    2016-01-01

    The surface protein composition of extracellular vesicles (EVs) is related to the originating cell and may play a role in vesicle function. Knowledge of the protein content of individual EVs is still limited because of the technical challenges to analyse small vesicles. Here, we introduce a novel multiplex bead-based platform to investigate up to 39 different surface markers in one sample. The combination of capture antibody beads with fluorescently labelled detection antibodies allows the analysis of EVs that carry surface markers recognized by both antibodies. This new method enables an easy screening of surface markers on populations of EVs. By combining different capture and detection antibodies, additional information on relative expression levels and potential vesicle subpopulations is gained. We also established a protocol to visualize individual EVs by stimulated emission depletion (STED) microscopy. Thereby, markers on single EVs can be detected by fluorophore-conjugated antibodies. We used the multiplex platform and STED microscopy to show for the first time that NK cell-derived EVs and platelet-derived EVs are devoid of CD9 or CD81, respectively, and that EVs isolated from activated B cells comprise different EV subpopulations. We speculate that, according to our STED data, tetraspanins might not be homogenously distributed but may mostly appear as clusters on EV subpopulations. Finally, we demonstrate that EV mixtures can be separated by magnetic beads and analysed subsequently with the multiplex platform. Both the multiplex bead-based platform and STED microscopy revealed subpopulations of EVs that have been indistinguishable by most analysis tools used so far. We expect that an in-depth view on EV heterogeneity will contribute to our understanding of different EVs and functions.

  11. Convex Clustering: An Attractive Alternative to Hierarchical Clustering

    Science.gov (United States)

    Chen, Gary K.; Chi, Eric C.; Ranola, John Michael O.; Lange, Kenneth

    2015-01-01

    The primary goal in cluster analysis is to discover natural groupings of objects. The field of cluster analysis is crowded with diverse methods that make special assumptions about data and address different scientific aims. Despite its shortcomings in accuracy, hierarchical clustering is the dominant clustering method in bioinformatics. Biologists find the trees constructed by hierarchical clustering visually appealing and in tune with their evolutionary perspective. Hierarchical clustering operates on multiple scales simultaneously. This is essential, for instance, in transcriptome data, where one may be interested in making qualitative inferences about how lower-order relationships like gene modules lead to higher-order relationships like pathways or biological processes. The recently developed method of convex clustering preserves the visual appeal of hierarchical clustering while ameliorating its propensity to make false inferences in the presence of outliers and noise. The solution paths generated by convex clustering reveal relationships between clusters that are hidden by static methods such as k-means clustering. The current paper derives and tests a novel proximal distance algorithm for minimizing the objective function of convex clustering. The algorithm separates parameters, accommodates missing data, and supports prior information on relationships. Our program CONVEXCLUSTER incorporating the algorithm is implemented on ATI and nVidia graphics processing units (GPUs) for maximal speed. Several biological examples illustrate the strengths of convex clustering and the ability of the proximal distance algorithm to handle high-dimensional problems. CONVEXCLUSTER can be freely downloaded from the UCLA Human Genetics web site at http://www.genetics.ucla.edu/software/ PMID:25965340
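
A minimal sketch of the convex clustering objective described above, solved here with the cvxpy modelling toolbox (assumed available) rather than the paper's proximal distance algorithm or the CONVEXCLUSTER GPU code. Data and weights are invented.

```python
# Convex clustering sketch: 0.5*||X - U||^2 + gamma * sum_ij w_ij ||U_i - U_j||,
# solved with a generic convex-optimization toolbox (cvxpy), not the paper's algorithm.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, size=(5, 2)),    # two well-separated groups
               rng.normal(3.0, 0.3, size=(5, 2))])
n, d = X.shape

# Gaussian-kernel weights encourage fusion mainly between nearby points.
sqdist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
w = np.exp(-0.5 * sqdist)

U = cp.Variable((n, d))                               # one centroid per data point
gamma = 1.0
fidelity = 0.5 * cp.sum_squares(X - U)
fusion = sum(w[i, j] * cp.norm(U[i] - U[j])
             for i in range(n) for j in range(i + 1, n))
cp.Problem(cp.Minimize(fidelity + gamma * fusion)).solve()

# Rows of U that (nearly) coincide mark points sharing a cluster; larger gamma fuses more.
print(np.round(U.value, 2))
```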

  12. Applying Clustering to Statistical Analysis of Student Reasoning about Two-Dimensional Kinematics

    Science.gov (United States)

    Springuel, R. Padraic; Wittman, Michael C.; Thompson, John R.

    2007-01-01

    We use clustering, an analysis method not presently common to the physics education research community, to group and characterize student responses to written questions about two-dimensional kinematics. Previously, clustering has been used to analyze multiple-choice data; we analyze free-response data that includes both sketches of vectors and…

  13. Advances in the development of the Mexican platform for analysis and design of nuclear reactors: AZTLAN Platform; Avances en el desarrollo de la plataforma mexicana para analisis y diseno de reactores nucleares: AZTLAN Platform

    Energy Technology Data Exchange (ETDEWEB)

    Gomez T, A. M.; Puente E, F. [ININ, Carretera Mexico-Toluca s/n, 52750 Ocoyoacac, Estado de Mexico (Mexico); Del Valle G, E. [IPN, Escuela Superior de Fisica y Matematicas, Av. IPN s/n, 07738 Ciudad de Mexico (Mexico); Francois L, J. L. [UNAM, Facultad de Ingenieria, Departamento de Sistemas Energeticos, Paseo Cuauhnahuac 8532, Col. Progreso, 62550 Jiutepec, Morelos (Mexico); Espinosa P, G., E-mail: armando.gomez@inin.gob.mx [Universidad Autonoma Metropolitana, Unidad Iztapalapa, Av. San Rafael Atlixco 186, Col. Vicentina, 09340 Ciudad de Mexico (Mexico)

    2017-09-15

    The AZTLAN platform project: development of a Mexican platform for the analysis and design of nuclear reactors, financed by the SENER-CONACYT Energy Sustainability Fund, was approved in early 2014 and formally began at the end of that year. It is a national project led by the Instituto Nacional de Investigaciones Nucleares (ININ), with the collaboration of the Instituto Politecnico Nacional (IPN), the Universidad Autonoma Metropolitana (UAM) and the Universidad Nacional Autonoma de Mexico (UNAM) as part of the development team, and with the participation of the Laguna Verde Nuclear Power Plant, the National Commission of Nuclear Safety and Safeguards, the Ministry of Energy and the Karlsruhe Institute of Technology (KIT, Germany) as part of the user group. The general objective of the project is to modernize, improve and integrate the neutronic, thermo-hydraulic and thermo-mechanical codes, developed in Mexican institutions, in an integrated platform, developed and maintained by Mexican experts for the benefit of Mexican institutions. Two years into the process, important steps have been taken that have consolidated the platform. The main results of these first two years have been presented in different national and international forums. In this congress, some of the most recent results that have been implemented in the platform codes are shown in more detail. The current status of the platform from a more executive viewpoint is summarized in this paper. (Author)

  14. The CERN analysis facility-a PROOF cluster for day-one physics analysis

    International Nuclear Information System (INIS)

    Grosse-Oetringhaus, J F

    2008-01-01

    ALICE (A Large Ion Collider Experiment) at the LHC plans to use a PROOF cluster at CERN (CAF - CERN Analysis Facility) for analysis. The system is especially aimed at the prototyping phase of analyses that need a high number of development iterations and thus require a short response time. Typical examples are the tuning of cuts during the development of an analysis as well as calibration and alignment. Furthermore, the use of an interactive system with very fast response will allow ALICE to extract physics observables out of first data quickly. An additional use case is fast event simulation and reconstruction. A test setup consisting of 40 machines is used for evaluation since May 2006. The PROOF system enables the parallel processing and xrootd the access to files distributed on the test cluster. An automatic staging system for files either catalogued in the ALICE file catalog or stored in the CASTOR mass storage system has been developed. The current setup and ongoing development towards disk quotas and CPU fairshare are described. Furthermore, the integration of PROOF into ALICE's software framework (AliRoot) is discussed

  15. Cluster Analysis of Acute Care Use Yields Insights for Tailored Pediatric Asthma Interventions.

    Science.gov (United States)

    Abir, Mahshid; Truchil, Aaron; Wiest, Dawn; Nelson, Daniel B; Goldstick, Jason E; Koegel, Paul; Lozon, Marie M; Choi, Hwajung; Brenner, Jeffrey

    2017-09-01

    We undertake this study to understand patterns of pediatric asthma-related acute care use to inform interventions aimed at reducing potentially avoidable hospitalizations. Hospital claims data from 3 Camden city facilities for 2010 to 2014 were used to perform cluster analysis classifying patients aged 0 to 17 years according to their asthma-related hospital use. Clusters were based on 2 variables: asthma-related ED visits and hospitalizations. Demographics and a number of sociobehavioral and use characteristics were compared across clusters. Children who met the criteria (3,170) were included in the analysis. An examination of a scree plot showing the decline in within-cluster heterogeneity as the number of clusters increased confirmed that clusters of pediatric asthma patients according to hospital use exist in the data. Five clusters of patients with distinct asthma-related acute care use patterns were observed. Cluster 1 (62% of patients) showed the lowest rates of acute care use. These patients were least likely to have a mental health-related diagnosis, were less likely to have visited multiple facilities, and had no hospitalizations for asthma. Cluster 2 (19% of patients) had a low number of asthma ED visits and onetime hospitalization. Cluster 3 (11% of patients) had a high number of ED visits and low hospitalization rates, and the highest rates of multiple facility use. Cluster 4 (7% of patients) had moderate ED use for both asthma and other illnesses, and high rates of asthma hospitalizations; nearly one quarter received care at all facilities, and 1 in 10 had a mental health diagnosis. Cluster 5 (1% of patients) had extreme rates of acute care use. Differences observed between groups across multiple sociobehavioral factors suggest these clusters may represent children who differ along multiple dimensions, in addition to patterns of service use, with implications for tailored interventions. Copyright © 2017 American College of Emergency Physicians
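
A small sketch, with invented per-child counts, of the scree-style check described above: k-means inertia (within-cluster heterogeneity) is tracked as the number of clusters grows for the two clustering variables, asthma-related ED visits and hospitalizations.

```python
# Scree-style check with invented counts: k-means inertia vs number of clusters
# for the two clustering variables used in the record.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Hypothetical per-child counts: most children are low users, with a small extreme-use tail.
ed_visits = np.concatenate([rng.poisson(0.5, 200), rng.poisson(4, 40), rng.poisson(12, 10)])
hospitalizations = np.concatenate([rng.poisson(0.1, 200), rng.poisson(1, 40), rng.poisson(4, 10)])
X = np.column_stack([ed_visits, hospitalizations]).astype(float)

for k in range(1, 8):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    print(k, round(km.inertia_, 1))   # look for the "elbow" where the decline flattens
```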

  16. GibbsCluster: unsupervised clustering and alignment of peptide sequences

    DEFF Research Database (Denmark)

    Andreatta, Massimo; Alvarez, Bruno; Nielsen, Morten

    2017-01-01

    motif characterizing each cluster. Several parameters are available to customize cluster analysis, including adjustable penalties for small clusters and overlapping groups and a trash cluster to remove outliers. As an example application, we used the server to deconvolute multiple specificities in large......-scale peptidome data generated by mass spectrometry. The server is available at http://www.cbs.dtu.dk/services/GibbsCluster-2.0....

  17. High-dimensional cluster analysis with the Masked EM Algorithm

    Science.gov (United States)

    Kadir, Shabnam N.; Goodman, Dan F. M.; Harris, Kenneth D.

    2014-01-01

    Cluster analysis faces two problems in high dimensions: first, the “curse of dimensionality” that can lead to overfitting and poor generalization performance; and second, the sheer time taken for conventional algorithms to process large amounts of high-dimensional data. We describe a solution to these problems, designed for the application of “spike sorting” for next-generation high channel-count neural probes. In this problem, only a small subset of features provide information about the cluster membership of any one data vector, but this informative feature subset is not the same for all data points, rendering classical feature selection ineffective. We introduce a “Masked EM” algorithm that allows accurate and time-efficient clustering of up to millions of points in thousands of dimensions. We demonstrate its applicability to synthetic data, and to real-world high-channel-count spike sorting data. PMID:25149694
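
    The masking idea — preventing weak, uninformative features of each data point from dominating the fit — can be illustrated with a much-simplified toy sketch. It replaces sub-threshold features with an estimated noise mean before fitting an ordinary Gaussian mixture; this is only an approximation of the concept and not the authors' Masked EM algorithm, which integrates the masks into the EM updates themselves.

        # Toy illustration of feature masking prior to mixture fitting; NOT the Masked EM
        # algorithm itself, which incorporates the masks into the EM updates.
        import numpy as np
        from sklearn.mixture import GaussianMixture

        rng = np.random.default_rng(0)
        n, d = 2000, 30
        X = rng.normal(0.0, 1.0, size=(n, d))        # background noise on all channels
        X[:1000, 0:3] += 6.0                          # cluster 1 is informative on features 0-2
        X[1000:, 3:6] += 6.0                          # cluster 2 is informative on features 3-5

        threshold = 3.0
        noise_mean = X[np.abs(X) < threshold].mean()               # crude noise-level estimate
        X_masked = np.where(np.abs(X) < threshold, noise_mean, X)  # suppress uninformative features

        labels = GaussianMixture(n_components=2, covariance_type="diag",
                                 random_state=0).fit_predict(X_masked)
        print(np.bincount(labels))                    # roughly 1000 / 1000 expected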

  18. Phenotypes of asthma in low-income children and adolescents: cluster analysis.

    Science.gov (United States)

    Cabral, Anna Lucia Barros; Sousa, Andrey Wirgues; Mendes, Felipe Augusto Rodrigues; Carvalho, Celso Ricardo Fernandes de

    2017-01-01

    Studies characterizing asthma phenotypes have predominantly included adults or have involved children and adolescents in developed countries. Therefore, their applicability in other populations, such as those of developing countries, remains indeterminate. Our objective was to determine how low-income children and adolescents with asthma in Brazil are distributed across a cluster analysis. We included 306 children and adolescents (6-18 years of age) with a clinical diagnosis of asthma and under medical treatment for at least one year of follow-up. At enrollment, all the patients were clinically stable. For the cluster analysis, we selected 20 variables commonly measured in clinical practice and considered important in defining asthma phenotypes. Variables with high multicollinearity were excluded. A cluster analysis was applied using a two-step agglomerative test and log-likelihood distance measure. Three clusters were defined for our population. Cluster 1 (n = 94) included subjects with normal pulmonary function, mild eosinophil inflammation, few exacerbations, later age at asthma onset, and mild atopy. Cluster 2 (n = 87) included those with normal pulmonary function, a moderate number of exacerbations, early age at asthma onset, more severe eosinophil inflammation, and moderate atopy. Cluster 3 (n = 108) included those with poor pulmonary function, frequent exacerbations, severe eosinophil inflammation, and severe atopy. Asthma was characterized by the presence of atopy, number of exacerbations, and lung function in low-income children and adolescents in Brazil. The many similarities with previous cluster analyses of phenotypes indicate that this approach shows good generalizability.
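
    A rough analogue of the clustering step can be sketched as below, using Ward agglomerative clustering on standardized variables. The original study used a two-step agglomerative procedure with a log-likelihood distance (which also accommodates categorical variables); the variables and data here are hypothetical stand-ins.

        # Rough analogue (not the study's two-step/log-likelihood procedure):
        # agglomerative clustering of standardized clinical variables into three groups.
        import numpy as np
        from sklearn.preprocessing import StandardScaler
        from sklearn.cluster import AgglomerativeClustering

        rng = np.random.default_rng(1)
        n = 306  # number of children/adolescents, as in the study
        # Hypothetical stand-ins for four of the clinical variables described above.
        data = np.column_stack([
            rng.normal(85, 15, n),      # lung function (FEV1, % predicted)
            rng.poisson(2, n),          # exacerbations in the previous year
            rng.gamma(2.0, 150.0, n),   # blood eosinophils (cells/uL)
            rng.integers(0, 6, n),      # positive skin-prick tests (atopy)
        ]).astype(float)

        X = StandardScaler().fit_transform(data)
        labels = AgglomerativeClustering(n_clusters=3, linkage="ward").fit_predict(X)
        print(np.bincount(labels))      # sizes of the three phenotype clusters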

  19. Identification and comparative analysis of the protocadherin cluster in a reptile, the green anole lizard.

    Directory of Open Access Journals (Sweden)

    Xiao-Juan Jiang

    BACKGROUND: The vertebrate protocadherins are a subfamily of cell adhesion molecules that are predominantly expressed in the nervous system and are believed to play an important role in establishing the complex neural network during animal development. Genes encoding these molecules are organized into a cluster in the genome. Comparative analysis of the protocadherin subcluster organization and gene arrangements in different vertebrates has provided interesting insights into the history of vertebrate genome evolution. Among tetrapods, protocadherin clusters have been fully characterized only in mammals. In this study, we report the identification and comparative analysis of the protocadherin cluster in a reptile, the green anole lizard (Anolis carolinensis). METHODOLOGY/PRINCIPAL FINDINGS: We show that the anole protocadherin cluster spans over a megabase and encodes a total of 71 genes. The number of genes in the anole protocadherin cluster is significantly higher than that in the coelacanth (49 genes) and mammalian (54-59 genes) clusters. The anole protocadherin genes are organized into four subclusters: the delta, alpha, beta and gamma. This subcluster organization is identical to that of the coelacanth protocadherin cluster, but differs from the mammalian clusters which lack the delta subcluster. The gene number expansion in the anole protocadherin cluster is largely due to the extensive gene duplication in the gammab subgroup. Similar to coelacanth and elephant shark protocadherin genes, the anole protocadherin genes have experienced a low frequency of gene conversion. CONCLUSIONS/SIGNIFICANCE: Our results suggest that similar to the protocadherin clusters in other vertebrates, the evolution of the anole protocadherin cluster is driven mainly by lineage-specific gene duplications and degeneration. Our analysis also shows that loss of the protocadherin delta subcluster in the mammalian lineage occurred after the divergence of mammals and reptiles.

  20. Planetary-Scale Geospatial Data Analysis Techniques in Google's Earth Engine Platform (Invited)

    Science.gov (United States)

    Hancher, M.

    2013-12-01

    Geoscientists have ever-increasing access to new tools for large-scale computing. With any tool, some tasks are easy and other tasks hard. It is natural to look to new computing platforms to increase the scale and efficiency of existing techniques, but there is a more exciting opportunity to discover and develop a new vocabulary of fundamental analysis idioms that are made easy and effective by these new tools. Google's Earth Engine platform is a cloud computing environment for earth data analysis that combines a public data catalog with a large-scale computational facility optimized for parallel processing of geospatial data. The data catalog includes a nearly complete archive of scenes from Landsat 4, 5, 7, and 8 that have been processed by the USGS, as well as a wide variety of other remotely-sensed and ancillary data products. Earth Engine supports a just-in-time computation model that enables real-time preview during algorithm development and debugging as well as during experimental data analysis and open-ended data exploration. Data processing operations are performed in parallel across many computers in Google's datacenters. The platform automatically handles many traditionally-onerous data management tasks, such as data format conversion, reprojection, resampling, and associating image metadata with pixel data. Early applications of Earth Engine have included the development of Google's global cloud-free fifteen-meter base map and global multi-decadal time-lapse animations, as well as numerous large and small experimental analyses by scientists from a range of academic, government, and non-governmental institutions, working in a wide variety of application areas including forestry, agriculture, urban mapping, and species habitat modeling. Patterns in the successes and failures of these early efforts have begun to emerge, sketching the outlines of a new set of simple and effective approaches to geospatial data analysis.
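
    As a concrete illustration of the just-in-time, server-side computation model described above, a minimal sketch using the Earth Engine Python client might look like the following. The collection ID, region and dates are illustrative assumptions, and running it requires an Earth Engine account (ee.Authenticate() on first use).

        # Minimal sketch of Earth Engine's server-side, just-in-time computation model
        # using the Python client (illustrative; requires an Earth Engine account).
        import ee

        ee.Initialize()  # run ee.Authenticate() once beforehand on a new machine

        region = ee.Geometry.Point([-122.09, 37.42]).buffer(10000)  # hypothetical area of interest

        # Median Landsat 8 surface-reflectance composite over one year; the computation
        # is only described here and is executed in Google's datacenters on demand.
        composite = (ee.ImageCollection("LANDSAT/LC08/C02/T1_L2")
                     .filterBounds(region)
                     .filterDate("2013-06-01", "2014-06-01")
                     .median())

        ndvi = composite.normalizedDifference(["SR_B5", "SR_B4"]).rename("NDVI")
        mean_ndvi = ndvi.reduceRegion(reducer=ee.Reducer.mean(), geometry=region, scale=30)
        print(mean_ndvi.getInfo())  # only this call pulls a (small) result back to the client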

  1. expVIP: a Customizable RNA-seq Data Analysis and Visualization Platform.

    Science.gov (United States)

    Borrill, Philippa; Ramirez-Gonzalez, Ricardo; Uauy, Cristobal

    2016-04-01

    The majority of transcriptome sequencing (RNA-seq) expression studies in plants remain underutilized and inaccessible due to the use of disparate transcriptome references and the lack of skills and resources to analyze and visualize these data. We have developed expVIP, an expression visualization and integration platform, which allows easy analysis of RNA-seq data combined with an intuitive and interactive interface. Users can analyze public and user-specified data sets with minimal bioinformatics knowledge using the expVIP virtual machine. This generates a custom Web browser to visualize, sort, and filter the RNA-seq data and provides outputs for differential gene expression analysis. We demonstrate expVIP's suitability for polyploid crops and evaluate its performance across a range of biologically relevant scenarios. To exemplify its use in crop research, we developed a flexible wheat (Triticum aestivum) expression browser (www.wheat-expression.com) that can be expanded with user-generated data in a local virtual machine environment. The open-access expVIP platform will facilitate the analysis of gene expression data from a wide variety of species by enabling the easy integration, visualization, and comparison of RNA-seq data across experiments. © 2016 American Society of Plant Biologists. All Rights Reserved.

  2. Cluster Cooperation in Wireless-Powered Sensor Networks: Modeling and Performance Analysis.

    Science.gov (United States)

    Zhang, Chao; Zhang, Pengcheng; Zhang, Weizhan

    2017-09-27

    A wireless-powered sensor network (WPSN) consisting of one hybrid access point (HAP), a near cluster and the corresponding far cluster is investigated in this paper. The sensors are wireless-powered and transmit information by consuming energy harvested from the signal emitted by the HAP. Sensors are able to harvest energy as well as store the harvested energy. We propose that if sensors in the near cluster do not have their own information to transmit, they can act as relays and help the sensors in the far cluster to forward information to the HAP in an amplify-and-forward (AF) manner. We use a finite Markov chain to model the dynamic variation of the relay battery, and give a general analytical model for a WPSN with cluster cooperation. Through this model, we deduce a closed-form expression for the outage probability as the metric of this network. Finally, simulation results validate the design rationale and the correctness of the theoretical analysis, and show how the parameters affect system performance. Moreover, the outage probability of sensors in the far cluster can be drastically reduced without sacrificing the performance of sensors in the near cluster if the transmit power of the HAP is fairly high. Furthermore, in terms of the outage performance of the far cluster, the proposed scheme significantly outperforms the direct transmission scheme without cooperation.
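
    The battery model described above can be mimicked with a toy Monte Carlo sketch: the relay's discrete battery level evolves as a Markov chain driven by random energy harvesting, and a far-cluster outage is declared whenever the relay lacks the energy to amplify-and-forward. All parameters below are illustrative assumptions, not the paper's closed-form analysis.

        # Toy Monte Carlo analogue of the battery Markov-chain model (illustrative
        # parameters; not the paper's closed-form analysis).
        import numpy as np

        rng = np.random.default_rng(7)

        LEVELS = 10          # discrete battery levels at the near-cluster relay
        HARVEST_P = 0.6      # probability of harvesting one energy unit per slot
        RELAY_COST = 3       # energy units needed to amplify-and-forward a far-cluster packet
        SLOTS = 200_000

        battery, outages, far_packets = 0, 0, 0
        for _ in range(SLOTS):
            if rng.random() < HARVEST_P:                 # downlink energy transfer from the HAP
                battery = min(LEVELS, battery + 1)
            if rng.random() < 0.3:                       # a far-cluster sensor has data to send
                far_packets += 1
                if battery >= RELAY_COST:
                    battery -= RELAY_COST                # relay forwards in AF mode
                else:
                    outages += 1                         # insufficient stored energy: outage

        print(f"empirical far-cluster outage probability: {outages / far_packets:.3f}")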

  3. Language Learner Motivational Types: A Cluster Analysis Study

    Science.gov (United States)

    Papi, Mostafa; Teimouri, Yasser

    2014-01-01

    The study aimed to identify different second language (L2) learner motivational types drawing on the framework of the L2 motivational self system. A total of 1,278 secondary school students learning English in Iran completed a questionnaire survey. Cluster analysis yielded five different groups based on the strength of different variables within…

  4. Kameleon Live: An Interactive Cloud Based Analysis and Visualization Platform for Space Weather Researchers

    Science.gov (United States)

    Pembroke, A. D.; Colbert, J. A.

    2015-12-01

    The Community Coordinated Modeling Center (CCMC) provides hosting for many of the simulations used by the space weather community of scientists, educators, and forecasters. CCMC users may submit model runs through the Runs on Request system, which produces static visualizations of model output in the browser, while further analysis may be performed off-line via Kameleon, CCMC's cross-language access and interpolation library. Off-line analysis may be suitable for power-users, but storage and coding requirements present a barrier to entry for non-experts. Moreover, a lack of a consistent framework for analysis hinders reproducibility of scientific findings. To that end, we have developed Kameleon Live, a cloud based interactive analysis and visualization platform. Kameleon Live allows users to create scientific studies built around selected runs from the Runs on Request database, perform analysis on those runs, collaborate with other users, and disseminate their findings among the space weather community. In addition to showcasing these novel collaborative analysis features, we invite feedback from CCMC users as we seek to advance and improve on the new platform.

  5. Platforms for Innovation and Internationalization

    DEFF Research Database (Denmark)

    Rasmussen, Erik Stavnsager; Petersen, Nicolaj Hannesbo

    2017-01-01

    The high-tech global startup has many challenges related to both innovation and internationalization. From a Danish cluster of Welfare Tech firms, eight innovative and international firms were selected and interviewed. Such firms typically have to be agile and operate in virtual networks in almost...... all parts of their value chains. This article contributes to the understanding of how innovation and internationalization to a great extent are interlinked. The firms have developed a core product or service offering, which the firms often describe as “a platform”. Around the platform, they develop...

  6. Open innovation in health care: analysis of an open health platform.

    Science.gov (United States)

    Bullinger, Angelika C; Rass, Matthias; Adamczyk, Sabrina; Moeslein, Kathrin M; Sohn, Stefan

    2012-05-01

    Today, integration of the public in research and development in health care is seen as essential for the advancement of innovation. This is a paradigmatic shift away from the traditional assumption that solely health care professionals are able to devise, develop, and disseminate novel concepts and solutions in health care. The present study builds on research in the field of open innovation to investigate the adoption of an open health platform by patients, care givers, physicians, family members, and the interested public. Results suggest that open innovation practices in health care lead to interesting innovation outcomes and are well accepted by participants. During the first three months, 803 participants of the open health platform submitted challenges and solutions and intensively communicated by exchanging 1454 personal messages and 366 comments. Analysis of communication content shows that empathic support and exchange of information are important elements of communication on the platform. The study presents first evidence for the suitability of open innovation practices to integrate the general public in health care research in order to foster both innovation outcomes and empathic support. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.

  7. Proposal for a Mini Wireless Force Platform for Human Gait Analysis

    Directory of Open Access Journals (Sweden)

    Giovani PIFFER

    2011-12-01

    This paper aims to develop a mini wireless force platform placed in the shoe sole for analysis of human gait. The platform consists of a machined aluminum mechanical structure fixed into a sole, whose sensors are electrical resistance strain gages strategically cemented at the points of greatest deformation of the structure. The strain gages are configured as a ½ Wheatstone bridge; the bridge output is amplified and filtered by a signal conditioner. The signals are acquired using a data acquisition board in conjunction with a graphical interface developed in LabVIEW. The static and dynamic behavior of the eight load cells was evaluated. Static calibration showed that the eight load cells are linear within the usage range from 0 kgf to 45 kgf. The dynamic response showed that the first vibration mode is around 1 kHz, indicating that the load cells have no resonance during the test. Three subjects carried out gait tests to examine the range of force platform use, and these tests demonstrated that the signals obtained are consistent with the classical references in this area.

  8. Technology Provisioning in the Mobile Industry: a Strategic Clustering

    Directory of Open Access Journals (Sweden)

    Antonio Ghezzi

    2012-07-01

    This article develops a strategic clustering for Mobile Middleware Technology Providers (MMTPs), shedding light on the business models and the strategic positioning currently adopted by this actor typology. The paper combines a literature review and a multiple case study approach – 24 in‐depth cases based on 72 semi‐structured interviews were performed – to deal with a significant and relatively new issue, i.e., the role of technology providers in the mobile value network. Through the creation of a system of strategic clustering matrices, four key business models currently adopted by MMTPs – “Pure Play”, “Full Asset”, “Third Parties Relationship‐focused” and “Platform and Content Management” – are identified, and insightful conclusions on the impact of this actor’s newly emerging influence on the market’s competitive dynamics are drawn. The framework created supports a wide set of mobile communications stakeholders – both incumbents and new entrants – in their decision making and strategy analysis process.

  9. The use of a cluster analysis in across herd genetic evaluation for ...

    African Journals Online (AJOL)

    To investigate the possibility of a genotype x environment interaction in Bonsmara cattle, a cluster analysis was performed on weaning weight records of 72 811 Bonsmara calves, the progeny of 1 434 sires and 24 186 dams in 35 herds. The following environmental factors were used to classify herds into clusters: solution ...

  10. Paleomagnetism.org : An online multi-platform open source environment for paleomagnetic data analysis

    NARCIS (Netherlands)

    Koymans, Mathijs R.; Langereis, C.G.; Pastor-Galán, D.; van Hinsbergen, D.J.J.

    2016-01-01

    This contribution provides an overview of Paleomagnetism.org, an open-source, multi-platform online environment for paleomagnetic data analysis. Paleomagnetism.org provides an interactive environment where paleomagnetic data can be interpreted, evaluated, visualized, and exported. The

  11. PepArML: A Meta-Search Peptide Identification Platform for Tandem Mass Spectra.

    Science.gov (United States)

    Edwards, Nathan J

    2013-12-01

    The PepArML meta-search peptide identification platform for tandem mass spectra provides a unified search interface to seven search engines; a robust cluster, grid, and cloud computing scheduler for large-scale searches; and an unsupervised, model-free, machine-learning-based result combiner, which selects the best peptide identification for each spectrum, estimates false-discovery rates, and outputs pepXML format identifications. The meta-search platform supports Mascot; Tandem with native, k-score and s-score scoring; OMSSA; MyriMatch; and InsPecT with MS-GF spectral probability scores—reformatting spectral data and constructing search configurations for each search engine on the fly. The combiner selects the best peptide identification for each spectrum based on search engine results and features that model enzymatic digestion, retention time, precursor isotope clusters, mass accuracy, and proteotypic peptide properties, requiring no prior knowledge of feature utility or weighting. The PepArML meta-search peptide identification platform often identifies two to three times more spectra than individual search engines at 10% FDR.

  12. Analysis of candidates for interacting galaxy clusters. I. A1204 and A2029/A2033

    Science.gov (United States)

    Gonzalez, Elizabeth Johana; de los Rios, Martín; Oio, Gabriel A.; Lang, Daniel Hernández; Tagliaferro, Tania Aguirre; Domínguez R., Mariano J.; Castellón, José Luis Nilo; Cuevas L., Héctor; Valotto, Carlos A.

    2018-04-01

    Context. Merging galaxy clusters allow for the study of different mass components, dark and baryonic, separately. Also, their occurrence makes it possible to test the ΛCDM scenario, which can be used to put constraints on the self-interacting cross-section of the dark-matter particle. Aims. It is necessary to perform a homogeneous analysis of these systems. Hence, based on a recently presented sample of candidates for interacting galaxy clusters, we present the analysis of two of these cataloged systems. Methods: In this work, the first of a series devoted to characterizing galaxy clusters in merger processes, we perform a weak lensing analysis of clusters A1204 and A2029/A2033 to derive the total masses of each identified interacting structure, together with a dynamical study based on a two-body model. We also describe the gas and the mass distributions in the field through a lensing and an X-ray analysis. This is the first in a series of works that will analyze this type of system in order to characterize it. Results: Neither merging cluster candidate shows evidence of having had a recent merger event. Nevertheless, there is dynamical evidence that these systems could be interacting or could interact in the future. Conclusions: It is necessary to include more constraints in order to improve the methodology of classifying merging galaxy clusters. Characterization of these clusters is important in order to properly understand the nature of these systems and their connection with dynamical studies.

  13. Evaluation of hierarchical agglomerative cluster analysis methods for discrimination of primary biological aerosol

    Directory of Open Access Journals (Sweden)

    I. Crawford

    2015-11-01

    In this paper we present improved methods for discriminating and quantifying primary biological aerosol particles (PBAPs) by applying hierarchical agglomerative cluster analysis to multi-parameter ultraviolet-light-induced fluorescence (UV-LIF) spectrometer data. The methods employed in this study can be applied to data sets in excess of 1 × 10^6 points on a desktop computer, allowing for each fluorescent particle in a data set to be explicitly clustered. This reduces the potential for misattribution found in subsampling and comparative attribution methods used in previous approaches, improving our capacity to discriminate and quantify PBAP meta-classes. We evaluate the performance of several hierarchical agglomerative cluster analysis linkages and data normalisation methods using laboratory samples of known particle types and an ambient data set. Fluorescent and non-fluorescent polystyrene latex spheres were sampled with a Wideband Integrated Bioaerosol Spectrometer (WIBS-4), where the optical size, asymmetry factor and fluorescent measurements were used as inputs to the analysis package. It was found that the Ward linkage with z-score or range normalisation performed best, correctly attributing 98 and 98.1 % of the data points respectively. The best-performing methods were applied to the BEACHON-RoMBAS (Bio–hydro–atmosphere interactions of Energy, Aerosols, Carbon, H2O, Organics and Nitrogen–Rocky Mountain Biogenic Aerosol Study) ambient data set, where it was found that the z-score and range normalisation methods yield similar results, with each method producing clusters representative of fungal spores and bacterial aerosol, consistent with previous results. The z-score result was compared to clusters generated with previous approaches (WIBS AnalysiS Program, WASP), where we observe that the subsampling and comparative attribution method employed by WASP results in the overestimation of the fungal spore concentration by a factor of 1.5 and the
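
    The best-performing combination reported here — z-score normalisation followed by Ward-linkage hierarchical agglomerative clustering — can be sketched as below. The particle features are hypothetical stand-ins for WIBS-4 size, asymmetry and fluorescence channels, not the study's data.

        # Sketch of the reported best combination: z-score normalisation followed by
        # hierarchical agglomerative clustering with Ward linkage.
        import numpy as np
        from scipy.cluster.hierarchy import linkage, fcluster
        from scipy.stats import zscore

        rng = np.random.default_rng(3)
        # Hypothetical stand-ins for WIBS-style features: optical size, asymmetry factor
        # and three fluorescence channels (one row per fluorescent particle).
        pbap = rng.normal([3.0, 20.0, 800.0, 200.0, 50.0], [0.5, 5.0, 150.0, 60.0, 20.0], (500, 5))
        dust = rng.normal([1.5, 35.0, 150.0, 80.0, 30.0], [0.3, 8.0, 60.0, 30.0, 15.0], (500, 5))
        X = np.vstack([pbap, dust])

        Z = linkage(zscore(X, axis=0), method="ward")
        labels = fcluster(Z, t=2, criterion="maxclust")
        print(np.bincount(labels)[1:])   # sizes of the two recovered particle classes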

  14. Patterns of Brucellosis Infection Symptoms in Azerbaijan: A Latent Class Cluster Analysis

    Directory of Open Access Journals (Sweden)

    Rita Ismayilova

    2014-01-01

    Brucellosis infection is a multisystem disease, with a broad spectrum of symptoms. We investigated the existence of clusters of infected patients according to their clinical presentation. Using national surveillance data from the Electronic-Integrated Disease Surveillance System, we applied a latent class cluster (LCC) analysis on symptoms to determine clusters of brucellosis cases. A total of 454 cases reported between July 2011 and July 2013 were analyzed. LCC identified a two-cluster model and the Vuong-Lo-Mendell-Rubin likelihood ratio supported the cluster model. Brucellosis cases in the second cluster (19%) reported higher percentages of poly-lymphadenopathy, hepatomegaly, arthritis, myositis, and neuritis and changes in liver function tests compared to cases of the first cluster. Patients in the second cluster had a severe brucellosis disease course and were associated with longer delay in seeking medical attention. Moreover, most of them were from Beylagan, a region focused on sheep and goat livestock production in south-central Azerbaijan. Patients in cluster 2 accounted for one-quarter of brucellosis cases and had a more severe clinical presentation. Delay in seeking medical care may explain severe illness. Future work needs to determine the factors that influence brucellosis case seeking and identify brucellosis species, particularly among cases from Beylagan.
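
    A latent class model on binary symptom indicators can be sketched as a mixture of independent Bernoulli distributions fitted by EM, with the number of classes chosen by an information criterion. The sketch below is illustrative only: the symptom matrix is synthetic, and BIC stands in for the Vuong-Lo-Mendell-Rubin test used in the study.

        # Illustrative latent class (Bernoulli mixture) fit on binary symptom indicators,
        # choosing the number of classes by BIC; data and symptom list are hypothetical.
        import numpy as np
        from scipy.special import logsumexp

        rng = np.random.default_rng(0)
        n, d = 454, 6                                   # cases x symptoms (illustrative)
        severe = rng.random(n) < 0.2
        p = np.where(severe[:, None], [0.7, 0.6, 0.5, 0.5, 0.4, 0.4],
                                      [0.3, 0.2, 0.1, 0.1, 0.05, 0.1])
        X = (rng.random((n, d)) < p).astype(float)

        def fit_lca(X, k, iters=300, seed=0):
            """EM for a mixture of independent Bernoullis (a simple latent class model)."""
            rng = np.random.default_rng(seed)
            n, d = X.shape
            pi = np.full(k, 1.0 / k)
            theta = rng.uniform(0.25, 0.75, (k, d))
            for _ in range(iters):
                logp = X @ np.log(theta).T + (1 - X) @ np.log(1 - theta).T + np.log(pi)
                r = np.exp(logp - logsumexp(logp, axis=1, keepdims=True))   # E-step
                nk = r.sum(axis=0)
                pi = nk / n                                                  # M-step
                theta = np.clip((r.T @ X) / nk[:, None], 1e-4, 1 - 1e-4)
            logp = X @ np.log(theta).T + (1 - X) @ np.log(1 - theta).T + np.log(pi)
            return logsumexp(logp, axis=1).sum(), (k - 1) + k * d            # log-likelihood, #params

        for k in (1, 2, 3):
            ll, n_params = fit_lca(X, k)
            print(f"{k} classes: BIC = {-2 * ll + n_params * np.log(n):.1f}")  # lower is better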

  15. IVSPlat 1.0: an integrated virtual screening platform with a molecular graphical interface.

    Science.gov (United States)

    Sun, Yin Xue; Huang, Yan Xin; Li, Feng Li; Wang, Hong Yan; Fan, Cong; Bao, Yong Li; Sun, Lu Guo; Ma, Zhi Qiang; Kong, Jun; Li, Yu Xin

    2012-01-05

    The virtual screening (VS) of lead compounds using molecular docking and pharmacophore detection is now an important tool in drug discovery. VS tasks typically require a combination of several software tools and a molecular graphics system. Thus, the integration of all the requisite tools in a single operating environment could reduce the complexity of running VS experiments. However, only a few freely available integrated software platforms have been developed. A free open-source platform, IVSPlat 1.0, was developed in this study for the management and automation of VS tasks. We integrated several VS-related programs into a molecular graphics system to provide a comprehensive platform for the solution of VS tasks based on molecular docking, pharmacophore detection, and a combination of both methods. This tool can be used to visualize intermediate and final results of the VS execution, while also providing a clustering tool for the analysis of VS results. A case study was conducted to demonstrate the applicability of this platform. IVSPlat 1.0 provides a plug-in-based solution for the management, automation, and visualization of VS tasks. IVSPlat 1.0 is an open framework that allows the integration of extra software to extend its functionality and modified versions can be freely distributed. The open source code and documentation are available at http://kyc.nenu.edu.cn/IVSPlat/.

  16. Multivalent adhesion molecule 7 clusters act as signaling platform for host cellular GTPase activation and facilitate epithelial barrier dysfunction.

    Directory of Open Access Journals (Sweden)

    Jenson Lim

    2014-09-01

    Vibrio parahaemolyticus is an emerging bacterial pathogen which colonizes the gastrointestinal tract and can cause severe enteritis and bacteraemia. During infection, V. parahaemolyticus primarily attaches to the small intestine, where it causes extensive tissue damage and compromises epithelial barrier integrity. We have previously described that Multivalent Adhesion Molecule (MAM) 7 contributes to initial attachment of V. parahaemolyticus to epithelial cells. Here we show that the bacterial adhesin, through multivalent interactions between surface-induced adhesin clusters and phosphatidic acid lipids in the host cell membrane, induces activation of the small GTPase RhoA and actin rearrangements in host cells. In infection studies with V. parahaemolyticus we further demonstrate that adhesin-triggered activation of the ROCK/LIMK signaling axis is sufficient to redistribute tight junction proteins, leading to a loss of epithelial barrier function. Taken together, these findings show an unprecedented mechanism by which an adhesin acts as an assembly platform for a host cellular signaling pathway, which ultimately facilitates breaching of the epithelial barrier by a bacterial pathogen.

  17. Semiparametric Bayesian analysis of accelerated failure time models with cluster structures.

    Science.gov (United States)

    Li, Zhaonan; Xu, Xinyi; Shen, Junshan

    2017-11-10

    In this paper, we develop a Bayesian semiparametric accelerated failure time model for survival data with cluster structures. Our model allows distributional heterogeneity across clusters and accommodates their relationships through a density ratio approach. Moreover, a nonparametric mixture of Dirichlet processes prior is placed on the baseline distribution to yield full distributional flexibility. We illustrate through simulations that our model can greatly improve estimation accuracy by effectively pooling information from multiple clusters, while taking into account the heterogeneity in their random error distributions. We also demonstrate the implementation of our method using analysis of Mayo Clinic Trial in Primary Biliary Cirrhosis. Copyright © 2017 John Wiley & Sons, Ltd.

  18. Search of molecular ground state via genetic algorithm: Implementation on a hybrid SIMD-MIMD platform

    International Nuclear Information System (INIS)

    Pucello, N.; D'Agostino, G.; Pisacane, F.

    1997-01-01

    A genetic algorithm for the optimization of the ground-state structure of a metallic cluster has been developed and ported to a SIMD-MIMD parallel platform. The SIMD part of the parallel platform is represented by a Quadrics/APE100 consisting of 512 floating point units, while the MIMD part is formed by a cluster of workstations. The proposed algorithm is composed of a part in which the genetic operators are applied to the elements of the population and a part which performs a further local relaxation and the fitness calculation via Molecular Dynamics. These parts have been implemented on the MIMD and on the SIMD part, respectively. Results have been compared to those generated by using Simulated Annealing.
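
    The general scheme — genetic operators applied to a population of candidate structures, each followed by a local relaxation and a fitness (energy) evaluation — can be sketched as below. A Lennard-Jones potential and a gradient-based local minimiser stand in for the metallic potential and the Molecular Dynamics relaxation of the paper; every parameter here is an illustrative assumption.

        # Compact sketch of a GA with local relaxation for cluster ground-state search,
        # using a Lennard-Jones potential as a stand-in for the metallic interactions.
        import numpy as np
        from scipy.optimize import minimize

        N_ATOMS, POP, GENERATIONS = 7, 12, 30
        rng = np.random.default_rng(0)

        def lj_energy(flat):
            x = flat.reshape(-1, 3)
            d = np.linalg.norm(x[:, None] - x[None, :], axis=-1)
            r = d[np.triu_indices(len(x), 1)]
            return np.sum(4.0 * (r**-12 - r**-6))

        def relax(flat):
            """Local relaxation (stands in for the MD quench run on the SIMD machine)."""
            return minimize(lj_energy, flat, method="L-BFGS-B").x

        population = [relax(rng.uniform(-1.5, 1.5, 3 * N_ATOMS)) for _ in range(POP)]
        for _ in range(GENERATIONS):
            population.sort(key=lj_energy)                       # lowest energy = fittest
            parents = population[: POP // 2]
            children = []
            for _ in range(POP - len(parents)):
                a, b = rng.choice(len(parents), 2, replace=False)
                mask = rng.random(N_ATOMS) < 0.5                 # crossover: mix atoms of two parents
                child = np.where(np.repeat(mask, 3), parents[a], parents[b])
                child += rng.normal(0.0, 0.05, child.shape)      # mutation: small random displacement
                children.append(relax(child))
            population = parents + children

        best = min(population, key=lj_energy)
        print(f"best energy found for an LJ{N_ATOMS} cluster: {lj_energy(best):.4f}")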

  19. Development of a Terpenoid-Production Platform in Streptomyces reveromyceticus SN-593.

    Science.gov (United States)

    Khalid, Ammara; Takagi, Hiroshi; Panthee, Suresh; Muroi, Makoto; Chappell, Joe; Osada, Hiroyuki; Takahashi, Shunji

    2017-12-15

    Terpenoids represent the largest class of natural products, some of which are resources for pharmaceuticals, fragrances, and fuels. Generally, mass production of valuable terpenoid compounds is hampered by their low production levels in organisms and the difficulty of chemical synthesis. Therefore, the development of microbial biosynthetic platforms represents an alternative approach. Although microbial terpenoid-production platforms have been established in Escherichia coli and yeast, an optimal platform has not been developed for Streptomyces species, despite their large capacity to produce secondary metabolites, such as polyketide compounds. To explore this potential, we constructed a terpenoid-biosynthetic platform in Streptomyces reveromyceticus SN-593. This strain is unique in that it harbors the mevalonate gene cluster enabling the production of furaquinocin, which can be controlled by the pathway-specific regulator Fur22. We simultaneously expressed the mevalonate gene cluster and subsequent terpenoid-biosynthetic genes under the control of Fur22. To achieve improved fur22 gene expression, we screened promoters from S. reveromyceticus SN-593. Our results showed that the promoter associated with the rvr2030 gene enabled production of botryococcene at 212 ± 20 mg/L, comparable to levels previously reported for other microbial hosts. Given that the rvr2030 gene encodes an enzyme involved in primary metabolism, these results suggest that coordinating the expression of terpenoid-biosynthetic genes with primary and secondary metabolism might be as important for high yields of terpenoid compounds as the absolute expression level of a target gene(s).

  20. Investigating the usefulness of a cluster-based trend analysis to detect visual field progression in patients with open-angle glaucoma.

    Science.gov (United States)

    Aoki, Shuichiro; Murata, Hiroshi; Fujino, Yuri; Matsuura, Masato; Miki, Atsuya; Tanito, Masaki; Mizoue, Shiro; Mori, Kazuhiko; Suzuki, Katsuyoshi; Yamashita, Takehiro; Kashiwagi, Kenji; Hirasawa, Kazunori; Shoji, Nobuyuki; Asaoka, Ryo

    2017-12-01

    To investigate the usefulness of the Octopus (Haag-Streit) EyeSuite's cluster trend analysis in glaucoma. Ten visual fields (VFs) with the Humphrey Field Analyzer (Carl Zeiss Meditec), spanning 7.7 years on average, were obtained from 728 eyes of 475 primary open angle glaucoma patients. Mean total deviation (mTD) trend analysis and EyeSuite's cluster trend analysis were performed on various series of VFs (from the 1st to 10th: VF1-10, through the 6th to 10th: VF6-10). The results of the cluster-based trend analysis, based on different lengths of VF series, were compared against the mTD trend analysis. Cluster-based trend analysis and mTD trend analysis results were significantly associated in all clusters and with all lengths of VF series. Between 21.2% and 45.9% (depending on VF series length and location) of clusters were deemed to progress when the mTD trend analysis suggested no progression. On the other hand, 4.8% of eyes were observed to progress using the mTD trend analysis when cluster trend analysis suggested no progression in any two (or more) clusters. Whole field trend analysis can miss local VF progression. Cluster trend analysis appears as robust as mTD trend analysis and useful to assess both sectorial and whole field progression. Cluster-based trend analyses, in particular the definition of two or more progressing clusters, may help clinicians to detect glaucomatous progression in a more timely manner than using a whole field trend analysis, without significantly compromising specificity. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2017. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
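
    The idea behind a cluster-based trend analysis can be sketched by fitting an ordinary least-squares slope to each visual-field cluster's mean sensitivity over time and flagging clusters whose slope is significantly negative. The data, number of clusters and progression criterion below are illustrative assumptions, not EyeSuite's implementation.

        # Sketch of cluster-wise trend analysis: a per-cluster linear trend is flagged
        # as progression when its slope is significantly negative (illustrative only).
        import numpy as np
        from scipy.stats import linregress

        rng = np.random.default_rng(5)
        years = np.arange(10) * 0.77                      # ten VFs spanning ~7.7 years
        n_clusters = 10                                   # e.g., ten anatomical VF clusters

        true_slopes = rng.choice([0.0, -0.8], size=n_clusters, p=[0.7, 0.3])  # dB/year
        for c in range(n_clusters):
            sensitivity = 28 + true_slopes[c] * years + rng.normal(0, 0.7, years.size)
            fit = linregress(years, sensitivity)
            progressing = fit.slope < 0 and fit.pvalue < 0.05   # one common style of criterion
            print(f"cluster {c:2d}: slope = {fit.slope:+.2f} dB/y, p = {fit.pvalue:.3f}, "
                  f"progressing = {progressing}")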

  1. A Raspberry Pi Cluster Instrumented for Fine-Grained Power Measurement

    Directory of Open Access Journals (Sweden)

    Michael F. Cloutier

    2016-09-01

    Power consumption has become an increasingly important metric when building large supercomputing clusters. One way to reduce power usage in large clusters is to use low-power embedded processors rather than the more typical high-end server CPUs (central processing units). We investigate various power-related metrics for seventeen different embedded ARM development boards in order to judge the appropriateness of using them in a computing cluster. We then build a custom cluster out of Raspberry Pi boards, which is specially designed for per-node detailed power measurement. In addition to serving as an embedded cluster testbed, our cluster’s power measurement, visualization and thermal features make it an excellent low-cost platform for education and experimentation.

  2. [Optimization of cluster analysis based on drug resistance profiles of MRSA isolates].

    Science.gov (United States)

    Tani, Hiroya; Kishi, Takahiko; Gotoh, Minehiro; Yamagishi, Yuka; Mikamo, Hiroshige

    2015-12-01

    We examined 402 methicillin-resistant Staphylococcus aureus (MRSA) strains isolated from clinical specimens in our hospital between November 19, 2010 and December 27, 2011 to evaluate the similarity between cluster analysis of drug susceptibility tests and pulsed-field gel electrophoresis (PFGE). The results showed that the 402 strains tested were classified into 27 PFGE patterns (151 subtypes of patterns). Cluster analyses of drug susceptibility tests, with the cut-off distance chosen to yield a similar classification capability, showed favorable results. With the MIC method, in which minimum inhibitory concentration (MIC) values were used directly, the level of agreement with PFGE was 74.2% when 15 drugs were tested; the Unweighted Pair Group Method with Arithmetic mean (UPGMA) was effective when the cut-off distance was 16. Using the SIR method, in which susceptible (S), intermediate (I), and resistant (R) were coded as 0, 2, and 3, respectively, according to the Clinical and Laboratory Standards Institute (CLSI) criteria, the level of agreement with PFGE was 75.9% when the number of drugs tested was 17, the method used for clustering was UPGMA, and the cut-off distance was 3.6. In addition, to assess the reproducibility of the results, 10 strains were randomly sampled from the full set and subjected to cluster analysis; this was repeated 100 times under the same conditions. The results indicated good reproducibility, with the level of agreement with PFGE showing a mean of 82.0%, standard deviation of 12.1%, and mode of 90.0% for the MIC method and a mean of 80.0%, standard deviation of 13.4%, and mode of 90.0% for the SIR method. In summary, cluster analysis of drug susceptibility tests is useful for the epidemiological analysis of MRSA.
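
    The MIC-based variant — susceptibility profiles clustered with UPGMA (average linkage) and cut at a fixed distance — can be sketched as follows. The antibiotic panel and MIC values are hypothetical, so the cut-off of 16 is carried over only for illustration.

        # Sketch of the MIC-based variant: MIC profiles clustered with UPGMA (average
        # linkage) and cut at a fixed distance. The drug panel and MIC values are hypothetical.
        import numpy as np
        from scipy.cluster.hierarchy import linkage, fcluster
        from scipy.spatial.distance import pdist

        rng = np.random.default_rng(11)
        n_isolates, n_drugs = 40, 15
        # Hypothetical MIC values (µg/mL) drawn from a two-fold dilution series.
        mic = 2.0 ** rng.integers(-3, 7, size=(n_isolates, n_drugs)).astype(float)

        Z = linkage(pdist(mic), method="average")            # UPGMA on the raw MIC profiles
        clusters = fcluster(Z, t=16, criterion="distance")   # cut-off distance (16, for illustration)
        print(f"{clusters.max()} clusters from {n_isolates} isolates")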

  3. MOLA: a bootable, self-configuring system for virtual screening using AutoDock4/Vina on computer clusters

    Directory of Open Access Journals (Sweden)

    Abreu Rui MV

    2010-10-01

    Background: Virtual screening of small molecules using molecular docking has become an important tool in drug discovery. However, large scale virtual screening is time demanding and usually requires dedicated computer clusters. There are a number of software tools that perform virtual screening using AutoDock4, but they require access to dedicated Linux computer clusters. Also, no software is available for performing virtual screening with Vina using computer clusters. In this paper we present MOLA, an easy-to-use graphical user interface tool that automates parallel virtual screening using AutoDock4 and/or Vina in bootable non-dedicated computer clusters. Implementation: MOLA automates several tasks including: ligand preparation, parallel AutoDock4/Vina job distribution and result analysis. When the virtual screening project finishes, an open-office spreadsheet file opens with the ligands ranked by binding energy and distance to the active site. All result files can automatically be recorded on a USB flash drive or on the hard-disk drive using VirtualBox. MOLA works inside a customized Live CD GNU/Linux operating system, developed by us, that bypasses the original operating system installed on the computers used in the cluster. This operating system boots from a CD on the master node and then clusters other computers as slave nodes via ethernet connections. Conclusion: MOLA is an ideal virtual screening tool for non-experienced users, with a limited number of multi-platform heterogeneous computers available and no access to dedicated Linux computer clusters. When a virtual screening project finishes, the computers can simply be restarted to their original operating system. The originality of MOLA lies in the fact that any available computer, regardless of platform, can be added to the cluster without ever using the computer's hard-disk drive and without interfering with the installed operating system. With a cluster of 10 processors, and a

  4. Cosmological analysis of galaxy clusters surveys in X-rays

    International Nuclear Information System (INIS)

    Clerc, N.

    2012-01-01

    Clusters of galaxies are the most massive objects in equilibrium in our Universe. Their study makes it possible to test cosmological scenarios of structure formation with precision, bringing constraints complementary to those stemming from the cosmological background radiation, supernovae or galaxies. They are identified through the X-ray emission of their heated gas, thus facilitating their mapping at different epochs of the Universe. This report presents two surveys of galaxy clusters detected in X-rays and puts forward a method for their cosmological interpretation. Thanks to its multi-wavelength coverage extending over 10 sq. deg. and after one decade of expertise, the XMM-LSS allows a systematic census of clusters in a large volume of the Universe. In the framework of this survey, the first part of this report describes the techniques developed for the purpose of characterizing the detected objects. A particular emphasis is placed on the most distant ones (z ≥ 1) through the complementarity of observations in X-ray, optical and infrared bands. Then the X-CLASS survey is fully described. Based on XMM archival data, it provides a new catalogue of 800 clusters detected in X-rays. A cosmological analysis of this survey is performed thanks to 'CR-HR' diagrams. This new method self-consistently includes selection effects and scaling relations and provides a means to bypass the computation of individual cluster masses. Propositions are made for applying this method to future surveys such as XMM-XXL and eROSITA. (author) [fr]

  5. Flocking-based Document Clustering on the Graphics Processing Unit

    Energy Technology Data Exchange (ETDEWEB)

    Cui, Xiaohui [ORNL]; Potok, Thomas E [ORNL]; Patton, Robert M [ORNL]; ST Charles, Jesse Lee [ORNL]

    2008-01-01

    Analyzing and grouping documents by content is a complex problem. One explored method of solving this problem borrows from nature, imitating the flocking behavior of birds. Each bird represents a single document and flies toward other documents that are similar to it. One limitation of this method of document clustering is its complexity O(n^2). As the number of documents grows, it becomes increasingly difficult to receive results in a reasonable amount of time. However, flocking behavior, along with most naturally inspired algorithms such as ant colony optimization and particle swarm optimization, is highly parallel and has found increased performance on expensive cluster computers. In the last few years, the graphics processing unit (GPU) has received attention for its ability to solve highly-parallel and semi-parallel problems much faster than the traditional sequential processor. Some applications see a huge increase in performance on this new platform. The cost of these high-performance devices is also marginal when compared with the price of cluster machines. In this paper, we have conducted research to exploit this architecture and apply its strengths to the document flocking problem. Our results highlight the potential benefit the GPU brings to all naturally inspired algorithms. Using the CUDA platform from NVIDIA, we developed a document flocking implementation to be run on the NVIDIA GeForce 8800. Additionally, we developed a similar but sequential implementation of the same algorithm to be run on a desktop CPU. We tested the performance of each on groups of news articles ranging in size from 200 to 3000 documents. The results of these tests were very significant. Performance gains ranged from three to nearly five times improvement of the GPU over the CPU implementation. This dramatic improvement in runtime makes the GPU a potentially revolutionary platform for document clustering algorithms.

  6. Analysis of plasmaspheric plumes: CLUSTER and IMAGE observations

    Directory of Open Access Journals (Sweden)

    F. Darrouzet

    2006-07-01

    Plasmaspheric plumes have been routinely observed by CLUSTER and IMAGE. The CLUSTER mission provides high time resolution four-point measurements of the plasmasphere near perigee. Total electron density profiles have been derived from the electron plasma frequency identified by the WHISPER sounder supplemented, in-between soundings, by relative variations of the spacecraft potential measured by the electric field instrument EFW; ion velocity is also measured onboard these satellites. The EUV imager onboard the IMAGE spacecraft provides global images of the plasmasphere with a spatial resolution of 0.1 R_E every 10 min; such images acquired near apogee from high above the pole show the geometry of plasmaspheric plumes, their evolution and motion. We present coordinated observations of three plume events and compare CLUSTER in-situ data with global images of the plasmasphere obtained by IMAGE. In particular, we study the geometry and the orientation of plasmaspheric plumes by using four-point analysis methods. We compare several aspects of plume motion as determined by different methods: (i) inner and outer plume boundary velocity calculated from time delays of this boundary as observed by the wave experiment WHISPER on the four spacecraft, (ii) drift velocity measured by the electron drift instrument EDI onboard CLUSTER and (iii) global velocity determined from successive EUV images. These different techniques consistently indicate that plasmaspheric plumes rotate around the Earth, with their foot fully co-rotating, but with their tip rotating slower and moving farther out.

  7. Confidence in Government and Attitudes toward Bribery: A Country-Cluster Analysis of Demographic and Religiosity Perspectives

    Directory of Open Access Journals (Sweden)

    Serkan Benk

    2017-01-01

    In this study, we try to classify the countries by the levels of confidence in government and attitudes toward accepting bribery by using the data of the sixth wave (2010–2014) of the World Values Survey (WVS). We are also interested in which demographic, attitudinal, and religiosity variables affect each class of countries. For these purposes cluster analysis, linear regression analysis, and ordered logistic regression analysis were used. The study found that countries could be grouped into two clusters which had varying levels of opposition to bribe taking and confidence in government. Another finding was that certain demographic, attitudinal, and religiosity variables that were significant in one cluster might not be significant in another cluster.

  8. Cluster Cooperation in Wireless-Powered Sensor Networks: Modeling and Performance Analysis

    Directory of Open Access Journals (Sweden)

    Chao Zhang

    2017-09-01

    A wireless-powered sensor network (WPSN) consisting of one hybrid access point (HAP), a near cluster and the corresponding far cluster is investigated in this paper. The sensors are wireless-powered and transmit information by consuming energy harvested from the signal emitted by the HAP. Sensors are able to harvest energy as well as store the harvested energy. We propose that if sensors in the near cluster do not have their own information to transmit, they can act as relays and help the sensors in the far cluster to forward information to the HAP in an amplify-and-forward (AF) manner. We use a finite Markov chain to model the dynamic variation of the relay battery, and give a general analytical model for a WPSN with cluster cooperation. Through this model, we deduce a closed-form expression for the outage probability as the metric of this network. Finally, simulation results validate the design rationale and the correctness of the theoretical analysis, and show how the parameters affect system performance. Moreover, the outage probability of sensors in the far cluster can be drastically reduced without sacrificing the performance of sensors in the near cluster if the transmit power of the HAP is fairly high. Furthermore, in terms of the outage performance of the far cluster, the proposed scheme significantly outperforms the direct transmission scheme without cooperation.

  9. Performance Analysis of Unsupervised Clustering Methods for Brain Tumor Segmentation

    Directory of Open Access Journals (Sweden)

    Tushar H Jaware

    2013-10-01

    Medical image processing is one of the most challenging and emerging fields of neuroscience. The ultimate goal of medical image analysis in brain MRI is to extract important clinical features that would improve methods of diagnosis and treatment of disease. This paper focuses on methods to detect and extract brain tumors from brain MR images. MATLAB is used to design a software tool for locating brain tumors based on unsupervised clustering methods. The K-means clustering algorithm is implemented and tested on a database of 30 images. A performance evaluation of the unsupervised clustering methods is presented.
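
    Intensity-based K-means segmentation of an MR slice can be sketched in a few lines; the "slice" below is a synthetic array standing in for a real image, and choosing the brightest cluster as the tumor candidate is an illustrative assumption rather than the paper's MATLAB tool.

        # Sketch of intensity-based K-means segmentation of a (synthetic) brain MR slice.
        import numpy as np
        from sklearn.cluster import KMeans

        rng = np.random.default_rng(2)
        slice_ = rng.normal(0.3, 0.05, (128, 128))              # background / healthy tissue
        slice_[40:70, 50:85] = rng.normal(0.8, 0.05, (30, 35))  # bright region mimicking a tumor

        intensities = slice_.reshape(-1, 1)
        labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(intensities)
        segmented = labels.reshape(slice_.shape)

        # The candidate tumor region is taken as the cluster with the highest mean intensity.
        tumor_cluster = np.argmax([intensities[labels == c].mean() for c in range(3)])
        print(f"pixels assigned to the brightest cluster: {(segmented == tumor_cluster).sum()}")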

  10. Identifying clinical course patterns in SMS data using cluster analysis

    DEFF Research Database (Denmark)

    Kent, Peter; Kongsted, Alice

    2012-01-01

    ABSTRACT: BACKGROUND: Recently, there has been interest in using the short message service (SMS or text messaging), to gather frequent information on the clinical course of individual patients. One possible role for identifying clinical course patterns is to assist in exploring clinically important...... showed that clinical course patterns can be identified by cluster analysis using all SMS time points as cluster variables. This method is simple, intuitive and does not require a high level of statistical skill. However, there are alternative ways of managing SMS data and many different methods...

  11. Cluster fusion algorithm: application to Lennard-Jones clusters

    DEFF Research Database (Denmark)

    Solov'yov, Ilia; Solov'yov, Andrey V.; Greiner, Walter

    2006-01-01

    paths up to the cluster size of 150 atoms. We demonstrate that in this way all known global minima structures of the Lennard-Jones clusters can be found. Our method provides an efficient tool for the calculation and analysis of atomic cluster structure. With its use we justify the magic number sequence......We present a new general theoretical framework for modelling the cluster structure and apply it to description of the Lennard-Jones clusters. Starting from the initial tetrahedral cluster configuration, adding new atoms to the system and absorbing its energy at each step, we find cluster growing...... for the clusters of noble gas atoms and compare it with experimental observations. We report the striking correspondence of the peaks in the dependence of the second derivative of the binding energy per atom on cluster size calculated for the chain of the Lennard-Jones clusters based on the icosahedral symmetry...

  12. Cluster fusion algorithm: application to Lennard-Jones clusters

    DEFF Research Database (Denmark)

    Solov'yov, Ilia; Solov'yov, Andrey V.; Greiner, Walter

    2008-01-01

    paths up to the cluster size of 150 atoms. We demonstrate that in this way all known global minima structures of the Lennard-Jones clusters can be found. Our method provides an efficient tool for the calculation and analysis of atomic cluster structure. With its use we justify the magic number sequence......We present a new general theoretical framework for modelling the cluster structure and apply it to description of the Lennard-Jones clusters. Starting from the initial tetrahedral cluster configuration, adding new atoms to the system and absorbing its energy at each step, we find cluster growing...... for the clusters of noble gas atoms and compare it with experimental observations. We report the striking correspondence of the peaks in the dependence of the second derivative of the binding energy per atom on cluster size calculated for the chain of the Lennard-Jones clusters based on the icosahedral symmetry...

  13. Statistical analysis of the spatial distribution of galaxies and clusters

    International Nuclear Information System (INIS)

    Cappi, Alberto

    1993-01-01

    This thesis deals with the analysis of the distribution of galaxies and clusters, describing some observational problems and statistical results. The first chapter gives a theoretical introduction, aiming to describe the framework of the formation of structures, tracing the history of the Universe from the Planck time, t_p = 10^-43 s and a temperature corresponding to 10^19 GeV, to the present epoch. The most usual statistical tools and models of the galaxy distribution, with their advantages and limitations, are described in chapter two. A study of the main observed properties of galaxy clustering, together with a detailed statistical analysis of the effects of selecting galaxies according to apparent magnitude or diameter, is reported in chapter three. Chapter four delineates some properties of groups of galaxies, explaining the reasons for discrepant results on group distributions. Chapter five is a study of the distribution of galaxy clusters, with different statistical tools, like correlations, percolation, the void probability function and counts in cells; the same scaling-invariant behaviour as for galaxies is found. Chapter six describes our finding that rich galaxy clusters too belong to the fundamental plane of elliptical galaxies, and gives a discussion of its possible implications. Finally chapter seven reviews the possibilities offered by multi-slit and multi-fibre spectrographs, and I present some observational work on nearby and distant galaxy clusters. In particular, I show the opportunities offered by ongoing surveys of galaxies coupled with multi-object fibre spectrographs, focusing on the ESO Key Programme "A galaxy redshift survey in the south galactic pole region", in which I collaborate, and on MEFOS, a multi-fibre instrument with automatic positioning. Published papers related to the work described in this thesis are reported in the last appendix. (author) [fr]

  14. Molecular-dynamics analysis of mobile helium cluster reactions near surfaces of plasma-exposed tungsten

    Energy Technology Data Exchange (ETDEWEB)

    Hu, Lin; Maroudas, Dimitrios, E-mail: maroudas@ecs.umass.edu [Department of Chemical Engineering, University of Massachusetts, Amherst, Massachusetts 01003-9303 (United States)]; Hammond, Karl D. [Department of Chemical Engineering, University of Missouri, Columbia, Missouri 65211 (United States)]; Wirth, Brian D. [Department of Nuclear Engineering, University of Tennessee, Knoxville, Tennessee 37996 (United States)]

    2015-10-28

    We report the results of a systematic atomic-scale analysis of the reactions of small mobile helium clusters (He_n, 4 ≤ n ≤ 7) near low-Miller-index tungsten (W) surfaces, aiming at a fundamental understanding of the near-surface dynamics of helium-carrying species in plasma-exposed tungsten. These small mobile helium clusters are attracted to the surface and migrate to the surface by Fickian diffusion and drift due to the thermodynamic driving force for surface segregation. As the clusters migrate toward the surface, trap mutation (TM) and cluster dissociation reactions are activated at rates higher than in the bulk. TM produces W adatoms and immobile complexes of helium clusters surrounding W vacancies located within the lattice planes at a short distance from the surface. These reactions are identified and characterized in detail based on the analysis of a large number of molecular-dynamics trajectories for each such mobile cluster near W(100), W(110), and W(111) surfaces. TM is found to be the dominant cluster reaction for all cluster and surface combinations, except for the He_4 and He_5 clusters near W(100) where cluster partial dissociation following TM dominates. We find that there exists a critical cluster size, n = 4 near W(100) and W(111) and n = 5 near W(110), beyond which the formation of multiple W adatoms and vacancies in the TM reactions is observed. The identified cluster reactions are responsible for important structural, morphological, and compositional features in the plasma-exposed tungsten, including surface adatom populations, near-surface immobile helium-vacancy complexes, and retained helium content, which are expected to influence the amount of hydrogen re-cycling and tritium retention in fusion tokamaks.

  15. Towards a Disruptive Digital Platform Model

    DEFF Research Database (Denmark)

    Kazan, Erol

    Digital platforms are layered modular information technology architectures that support disruption. Digital platforms are particularly disruptive, as they facilitate the quick release of digital innovations that may replace established innovations. Yet, despite their support for disruption, we have not fully understood how such digital platforms can be strategically designed and configured to facilitate disruption. To that end, this thesis endeavors to unravel disruptive digital platforms from the supply perspective, grounded on strategic digital platform design elements. I suggest that digital platforms leverage on three strategic design elements (i.e., business, architecture, and technology design) to create supportive conditions for facilitating disruption. To shed light on disruptive digital platforms, I opted for payment platforms as my empirical context and unit of analysis.

  16. WHY DO SOME NATIONS SUCCEED AND OTHERS FAIL IN INTERNATIONAL COMPETITION? FACTOR ANALYSIS AND CLUSTER ANALYSIS AT EUROPEAN LEVEL

    Directory of Open Access Journals (Sweden)

    Popa Ion

    2015-07-01

    Full Text Available As stated by Michael Porter (1998: 57), 'this is perhaps the most frequently asked economic question of our times.' However, a widely accepted answer is still missing. The aim of this paper is not to provide the BIG answer to such a BIG question, but rather to provide a different perspective on competitiveness at the national level. In this respect, we followed a two-step procedure called "tandem analysis" (OECD, 2008). First we employed a Factor Analysis in order to reveal the underlying factors of the initial dataset, followed by a Cluster Analysis aimed at classifying the 35 countries according to the main characteristics of competitiveness resulting from the Factor Analysis. The findings revealed that clustering the 35 states on the first two factors, Smart Growth and Market Development, which recover almost 76% of the common variability of the twelve original variables, highlights four clusters and yields a series of useful information for analyzing the characteristics of the four clusters and discussing them.
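
    A minimal sketch of the two-step "tandem analysis" described above, assuming a hypothetical 35 x 12 matrix X of competitiveness indicators; the two retained factors and four clusters mirror the paper, but the data and every name below are invented for illustration.

      import numpy as np
      from sklearn.preprocessing import StandardScaler
      from sklearn.decomposition import FactorAnalysis
      from sklearn.cluster import KMeans

      rng = np.random.default_rng(0)
      X = rng.normal(size=(35, 12))     # placeholder: 35 countries x 12 indicators

      # Step 1: factor analysis to extract the underlying factors.
      scores = FactorAnalysis(n_components=2, random_state=0).fit_transform(
          StandardScaler().fit_transform(X))

      # Step 2: cluster the countries on their factor scores.
      labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(scores)
      print(labels)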

  17. Cluster and principal component analysis based on SSR markers of Amomum tsao-ko in Jinping County of Yunnan Province

    Science.gov (United States)

    Ma, Mengli; Lei, En; Meng, Hengling; Wang, Tiantao; Xie, Linyan; Shen, Dong; Xianwang, Zhou; Lu, Bingyue

    2017-08-01

    Amomum tsao-ko is a commercial plant used for various purposes in the medicinal and food industries. For the present investigation, 44 germplasm samples were collected from Jinping County of Yunnan Province. Cluster analysis and two-dimensional principal component analysis (PCA) were used to represent the genetic relations among Amomum tsao-ko samples by using simple sequence repeat (SSR) markers. The cluster analysis clearly distinguished the sample groups. Two major clusters were formed: the first (Cluster I) consisted of 34 individuals and the second (Cluster II) of 10 individuals; Cluster I, as the main group, contained multiple sub-clusters. PCA also showed 2 groups: PCA Group 1 included 29 individuals and PCA Group 2 included 12 individuals, consistent with the results of the cluster analysis. The purpose of the present investigation was to provide information on the genetic relationships of Amomum tsao-ko germplasm resources in the main producing areas and to provide a theoretical basis for the protection and utilization of Amomum tsao-ko resources.

  18. Clustering and Candidate Motif Detection in Exosomal miRNAs by Application of Machine Learning Algorithms.

    Science.gov (United States)

    Gaur, Pallavi; Chaturvedi, Anoop

    2017-07-22

    The clustering pattern and motifs give immense information about any biological data. An application of machine learning algorithms for clustering and candidate motif detection in miRNAs derived from exosomes is depicted in this paper. Recent progress in the field of exosome research, and more particularly regarding exosomal miRNAs, has prompted much bioinformatics-based research. Information on the clustering pattern and candidate motifs in miRNAs of exosomal origin would help in analyzing existing, as well as newly discovered, miRNAs within exosomes. Along with obtaining the clustering pattern and candidate motifs in exosomal miRNAs, this work also elaborates the usefulness of machine learning algorithms that can be efficiently executed on various programming languages/platforms. Data were clustered and sequence candidate motifs were detected successfully. The results were compared and validated with available web tools such as 'BLASTN' and the 'MEME suite'. The machine learning algorithms for the aforementioned objectives were applied successfully. This work elaborated the utility of machine learning algorithms and language platforms for the tasks of clustering and candidate motif detection in exosomal miRNAs. With the information on the mentioned objectives, deeper insight would be gained for analyses of newly discovered miRNAs in exosomes, which are considered to be circulating biomarkers. In addition, the execution of machine learning algorithms on various language platforms gives users more flexibility to try multiple iterations according to their requirements. This approach can be applied to other biological data-mining tasks as well.
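
    The abstract does not name the specific algorithms or languages used; the sketch below only illustrates one common way to cluster short RNA sequences, assuming hypothetical miRNA strings and simple overlapping 3-mer frequency features rather than the authors' actual pipeline.

      from itertools import product
      import numpy as np
      from sklearn.cluster import AgglomerativeClustering

      # Hypothetical exosomal miRNA sequences (illustration only).
      seqs = ["UGAGGUAGUAGGUUGUAUAGUU", "UGAGGUAGUAGGUUGUGUGGUU",
              "UAAAGUGCUGACAGUGCAGAU", "AAAGUGCUGUUCGUGCAGGUAG"]

      kmers = ["".join(p) for p in product("ACGU", repeat=3)]
      index = {m: i for i, m in enumerate(kmers)}

      def kmer_profile(seq, k=3):
          # Overlapping k-mer counts, normalised to frequencies.
          counts = np.zeros(len(kmers))
          for i in range(len(seq) - k + 1):
              counts[index[seq[i:i + k]]] += 1
          return counts / counts.sum()

      X = np.vstack([kmer_profile(s) for s in seqs])
      labels = AgglomerativeClustering(n_clusters=2).fit_predict(X)
      print(labels)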

  19. Weighted Clustering

    DEFF Research Database (Denmark)

    Ackerman, Margareta; Ben-David, Shai; Branzei, Simina

    2012-01-01

    We investigate a natural generalization of the classical clustering problem, considering clustering tasks in which different instances may have different weights. We conduct the first extensive theoretical analysis on the influence of weighted data on standard clustering algorithms in both the partitional and hierarchical settings, characterizing the conditions under which algorithms react to weights. Extending a recent framework for clustering algorithm selection, we propose intuitive properties that would allow users to choose between clustering algorithms in the weighted setting and classify...

  20. Using MATLAB software with Tomcat server and Java platform for remote image analysis in pathology.

    Science.gov (United States)

    Markiewicz, Tomasz

    2011-03-30

    Matlab is one of the most advanced development tools for applications in engineering practice. From our point of view the most important part is the image processing toolbox, offering many built-in functions, including mathematical morphology, and implementations of many artificial neural networks. It is a very popular platform for creating specialized programs for image analysis, including in pathology. Based on the latest version of the Matlab Builder Java toolbox, it is possible to create software serving as a remote system for image analysis in pathology via internet communication. The internet platform can be realized based on JavaServer Pages (JSP) with the Tomcat server as the servlet container. In the presented software implementation we propose remote image analysis realized by Matlab algorithms. These algorithms can be compiled to an executable jar file with the help of the Matlab Builder Java toolbox. The Matlab function must be declared with the set of input data, an output structure with numerical results, and a Matlab web figure. Any function prepared in that manner can be used as a Java function in JSP. The graphical user interface providing the input data and displaying the results (also in graphical form) must be implemented in JSP. Additionally, data storage to a database can be implemented within the algorithm written in Matlab with the help of the Matlab Database Toolbox, directly alongside the image processing. The complete JSP page can be run by the Tomcat server. The proposed tool for remote image analysis was tested on the Computerized Analysis of Medical Images (CAMI) software developed by the author. The user provides the image and case information (diagnosis, staining, image parameters, etc.). When analysis is initialized, the input data with the image are sent to the servlet on Tomcat. When analysis is done, the client obtains the graphical results as an image with the recognized cells marked, and also the quantitative output. Additionally, the results are stored on the server

  1. Going beyond clustering in MD trajectory analysis: an application to villin headpiece folding.

    Directory of Open Access Journals (Sweden)

    Aruna Rajan

    2010-04-01

    Full Text Available Recent advances in computing technology have enabled microsecond-long all-atom molecular dynamics (MD) simulations of biological systems. Methods that can distill the salient features of such large trajectories are now urgently needed. Conventional clustering methods used to analyze MD trajectories suffer from various setbacks, namely (i) they are not data driven, (ii) they are unstable to noise and changes in cut-off parameters such as cluster radius and cluster number, and (iii) they do not reduce the dimensionality of the trajectories, and hence are unsuitable for finding collective coordinates. We advocate the application of principal component analysis (PCA) and a non-metric multidimensional scaling (nMDS) method to reduce MD trajectories and overcome the drawbacks of clustering. To illustrate the superiority of nMDS over other methods in reducing data and reproducing salient features, we analyze three complete villin headpiece folding trajectories. Our analysis suggests that the folding process of the villin headpiece is structurally heterogeneous.
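
    A minimal sketch of the advocated PCA and non-metric MDS reduction, assuming a placeholder array of flattened per-frame coordinates in place of a real MD trajectory; scikit-learn's MDS with metric=False is used here as the nMDS step.

      import numpy as np
      from sklearn.decomposition import PCA
      from sklearn.manifold import MDS

      rng = np.random.default_rng(1)
      # Placeholder trajectory: 200 frames x 3N flattened atomic coordinates.
      frames = rng.normal(size=(200, 90))

      # Linear reduction of the trajectory to two collective coordinates.
      pc = PCA(n_components=2).fit_transform(frames)

      # Non-metric multidimensional scaling embedding of the same frames.
      nmds = MDS(n_components=2, metric=False, random_state=0).fit_transform(frames)
      print(pc.shape, nmds.shape)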

  2. Marketing Mix Formulation for Higher Education: An Integrated Analysis Employing Analytic Hierarchy Process, Cluster Analysis and Correspondence Analysis

    Science.gov (United States)

    Ho, Hsuan-Fu; Hung, Chia-Chi

    2008-01-01

    Purpose: The purpose of this paper is to examine how a graduate institute at National Chiayi University (NCYU), by using a model that integrates analytic hierarchy process, cluster analysis and correspondence analysis, can develop effective marketing strategies. Design/methodology/approach: This is primarily a quantitative study aimed at…

  3. Feature-Space Clustering for fMRI Meta-Analysis

    DEFF Research Database (Denmark)

    Goutte, Cyril; Hansen, Lars Kai; Liptrot, Mathew G.

    2001-01-01

    For fMRI sequences containing several hundreds of images, it is sometimes necessary to invoke feature extraction to reduce the dimensionality of the data space. A second interesting application is in the meta-analysis of fMRI experiments, where features are obtained from a possibly large number of single-voxel analyses. In particular this allows the checking of the differences and agreements between different methods of analysis. Both approaches are illustrated on an fMRI data set involving visual stimulation, and we show that the feature space clustering approach yields nontrivial results and, in particular, shows interesting differences between individual voxel analyses performed with traditional methods. © 2001 Wiley-Liss, Inc.

  4. Systematic analysis of rocky shore platform morphology at large spatial scale using LiDAR-derived digital elevation models

    Science.gov (United States)

    Matsumoto, Hironori; Dickson, Mark E.; Masselink, Gerd

    2017-06-01

    Much of the existing research on rocky shore platforms describes results from carefully selected field sites, or comparisons between a relatively small number of selected sites. Here we describe a method to systematically analyse rocky shore morphology over a large area using LiDAR-derived digital elevation models. The method was applied to 700 km of coastline in southwest England, a region where there is considerable variation in wave climate and lithological settings, and a large alongshore variation in tidal range. Across-shore profiles were automatically extracted at 50 m intervals around the coast where information was available from the Coastal Channel Observatory coastal classification. Routines were developed to automatically remove non-platform profiles. The remaining 612 shore platform profiles were then subject to automated morphometric analyses, and correlation analysis with respect to three possible environmental controls: wave height, mean spring tidal range and rock strength. As expected, considerable scatter exists in the correlation analysis because only very coarse estimates of rock strength and wave height were applied, whereas variability in factors such as these can locally be the most important control on shoreline morphology. In view of this, it is somewhat surprising that overall consistency was found between previously published findings and the results from the systematic, automated analysis of LiDAR data: platform gradient increases as rock strength and tidal range increase, but decreases as wave height increases; platform width increases as wave height and tidal range increase, but decreases as rock strength increases. Previous studies have predicted shore platform gradient using tidal range alone. A multi-regression analysis of LiDAR data confirms that tidal range is the strongest predictor, but a new multi-factor empirical model considering tidal range, wave height, and rock strength yields better predictions of shore platform gradient.
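
    A minimal sketch of the multi-factor regression step, assuming hypothetical per-profile values of tidal range, wave height and rock strength; the synthetic gradients are generated only to reproduce the reported tendencies and are not the study's data.

      import numpy as np
      from sklearn.linear_model import LinearRegression

      rng = np.random.default_rng(2)
      n = 612                                   # number of platform profiles
      tidal_range = rng.uniform(2, 10, n)       # m
      wave_height = rng.uniform(0.5, 3, n)      # m
      rock_strength = rng.uniform(10, 150, n)   # coarse strength proxy

      # Synthetic gradients consistent with the reported tendencies.
      gradient = (0.4 * tidal_range - 0.8 * wave_height
                  + 0.01 * rock_strength + rng.normal(0, 0.5, n))

      X = np.column_stack([tidal_range, wave_height, rock_strength])
      model = LinearRegression().fit(X, gradient)
      print(model.coef_, model.score(X, gradient))   # coefficients and R^2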

  5. Draft genome sequence of Streptomyces coelicoflavus ZG0656 reveals the putative biosynthetic gene cluster of acarviostatin family α-amylase inhibitors.

    Science.gov (United States)

    Guo, X; Geng, P; Bai, F; Bai, G; Sun, T; Li, X; Shi, L; Zhong, Q

    2012-08-01

    The aims of this study are to obtain the draft genome sequence of Streptomyces coelicoflavus ZG0656, which produces novel acarviostatin family α-amylase inhibitors, and then to reveal the putative acarviostatin-related gene cluster and the biosynthetic pathway. The draft genome sequence of S. coelicoflavus ZG0656 was generated using a shotgun approach employing a combination of 454 and Solexa sequencing technologies. Genome analysis revealed a putative gene cluster for acarviostatin biosynthesis, termed sct-cluster. The cluster contains 13 acarviostatin synthetic genes, six transporter genes, four starch degrading or transglycosylation enzyme genes and two regulator genes. On the basis of bioinformatic analysis, we proposed a putative biosynthetic pathway of acarviostatins. The intracellular steps produce a structural core, acarviostatin I00-7-P, and the extracellular assemblies lead to diverse acarviostatin end products. The draft genome sequence of S. coelicoflavus ZG0656 revealed the putative biosynthetic gene cluster of acarviostatins and a putative pathway of acarviostatin production. To our knowledge, S. coelicoflavus ZG0656 is the first strain in this species for which a genome sequence has been reported. The analysis of sct-cluster provided important insights into the biosynthesis of acarviostatins. This work will be a platform for producing novel variants and yield improvement. © 2012 The Authors. Letters in Applied Microbiology © 2012 The Society for Applied Microbiology.

  6. The VISPA Internet Platform for Students

    Science.gov (United States)

    Asseldonk, D. v.; Erdmann, M.; Fischer, R.; Glaser, C.; Müller, G.; Quast, T.; Rieger, M.; Urban, M.

    2016-04-01

    The VISPA internet platform enables users to remotely run Python scripts and view resulting plots or inspect their output data. With a standard web browser as the only user requirement on the client side, the system becomes suitable for blended learning approaches for university physics students. VISPA was used in two consecutive years, each by approx. 100 third-year physics students at the RWTH Aachen University, for their homework assignments. For example, in one exercise students gained a deeper understanding of Einstein's mass-energy relation by analyzing experimental data of electron-positron pairs revealing J/ψ and Z particles. Because the students were free to choose their working hours, only a few users accessed the platform simultaneously. The positive feedback from students and the stability of the platform led to further development of the concept. This year, students accessed the platform in parallel while they analyzed data recorded by experiments demonstrated live in the lecture hall. The platform is based on experience in the development of professional analysis tools. It combines core technologies from previous projects: an object-oriented C++ library, a modular data-driven analysis flow, and visual analysis steering. We present the platform and discuss its benefits in the context of teaching, based on surveys that are conducted each semester.

  7. Grey Wolf Optimizer Based on Powell Local Optimization Method for Clustering Analysis

    Directory of Open Access Journals (Sweden)

    Sen Zhang

    2015-01-01

    Full Text Available One heuristic evolutionary algorithm recently proposed is the grey wolf optimizer (GWO), inspired by the leadership hierarchy and hunting mechanism of grey wolves in nature. This paper presents an extended GWO algorithm based on the Powell local optimization method, which we call PGWO. The PGWO algorithm significantly improves the original GWO in solving complex optimization problems. Clustering is a popular data analysis and data mining technique; hence, PGWO can be applied to clustering problems. In this study, first the PGWO algorithm is tested on seven benchmark functions. Second, the PGWO algorithm is used for data clustering on nine data sets. Compared to other state-of-the-art evolutionary algorithms, the results on the benchmarks and on data clustering demonstrate the superior performance of the PGWO algorithm.

  8. Performance Analysis of a Cluster-Based MAC Protocol for Wireless Ad Hoc Networks

    Directory of Open Access Journals (Sweden)

    Jesús Alonso-Zárate

    2010-01-01

    Full Text Available An analytical model to evaluate the non-saturated performance of the Distributed Queuing Medium Access Control Protocol for Ad Hoc Networks (DQMAN) in single-hop networks is presented in this paper. DQMAN comprises a spontaneous, temporary, and dynamic clustering mechanism integrated with a near-optimum distributed queuing Medium Access Control (MAC) protocol. Clustering is executed in a distributed manner using a mechanism inspired by the Distributed Coordination Function (DCF) of the IEEE 802.11. Once a station seizes the channel, it becomes the temporary clusterhead of a spontaneous cluster and it coordinates the peer-to-peer communications between the cluster members. Within each cluster, a near-optimum distributed queuing MAC protocol is executed. The theoretical performance analysis of DQMAN in single-hop networks under non-saturation conditions is presented in this paper. The approach integrates the analysis of the clustering mechanism into the MAC layer model. To the knowledge of the authors, this approach is novel in the literature. In addition, the performance of an ad hoc network using DQMAN is compared to that obtained when using the DCF of the IEEE 802.11 as a benchmark reference.

  9. Porting of a serial molecular dynamics code on MIMD platforms

    Energy Technology Data Exchange (ETDEWEB)

    Celino, M. [ENEA Centro Ricerche Casaccia, S. Maria di Galeria, RM (Italy). HPCN Project

    1999-07-01

    A molecular dynamics (MD) code, utilized for the study of atomistic models of metallic systems, has been parallelized for MIMD (multiple instruction multiple data) parallel platforms by means of the parallel virtual machine (PVM) message passing library. Since the parallelization implies modifications of the sequential algorithms, these are described from the point of view of statistical mechanical theory. Furthermore, the techniques and parallelization strategies utilized and the parallel MD code are described in detail. Benchmarks on several MIMD platforms (IBM SP1, SP2, Cray T3D, cluster of workstations) allow evaluation of the performance of the code against the different characteristics of the parallel platforms. [Italian] A serial molecular dynamics (MD) code used for the study of atomistic models of metallic materials has been parallelized for MIMD (multiple instruction multiple data) parallel platforms using the parallel virtual machine (PVM) libraries. Since the parallelization required modifying the serial algorithms of the code, these are described by revisiting the fundamental concepts of statistical mechanics. The parallelization techniques and strategies adopted are also presented, describing the parallel MD code in detail. Benchmark results on several MIMD platforms (IBM SP1, SP2, Cray T3D, cluster of workstations) allow the performance of the code to be analysed as a function of the different characteristics of the parallel platforms.

  10. The identification of credit card encoders by hierarchical cluster analysis of the jitters of magnetic stripes.

    Science.gov (United States)

    Leung, S C; Fung, W K; Wong, K H

    1999-01-01

    The relative bit density variation graphs of 207 specimen credit cards processed by 12 encoding machines were examined first visually, and then classified by means of hierarchical cluster analysis. Twenty-nine credit cards being treated as 'questioned' samples were tested by way of cluster analysis against 'controls' derived from known encoders. It was found that hierarchical cluster analysis provided a high accuracy of identification with all 29 'questioned' samples classified correctly. On the other hand, although visual comparison of jitter graphs was less discriminating, it was nevertheless capable of giving a reasonably accurate result.
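
    A minimal sketch of testing 'questioned' cards against 'controls' with hierarchical clustering, assuming each card is summarised by a hypothetical fixed-length jitter feature vector (the paper works from relative bit density variation graphs).

      import numpy as np
      from scipy.cluster.hierarchy import linkage, fcluster
      from scipy.spatial.distance import pdist

      rng = np.random.default_rng(3)
      # Hypothetical jitter feature vectors: two encoders, 10 control cards each,
      # plus 2 questioned cards drawn from the first encoder's distribution.
      controls = np.vstack([rng.normal(0.0, 0.05, (10, 50)),
                            rng.normal(0.3, 0.05, (10, 50))])
      questioned = rng.normal(0.0, 0.05, (2, 50))
      X = np.vstack([controls, questioned])

      Z = linkage(pdist(X, metric="euclidean"), method="average")
      labels = fcluster(Z, t=2, criterion="maxclust")
      # Questioned cards should fall in the first encoder's cluster.
      print(labels[-2:], labels[:10])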

  11. Clustering applications in financial and economic analysis of the crop production in the Russian regions

    Directory of Open Access Journals (Sweden)

    Gromov Vladislav Vladimirovich

    2013-08-01

    Full Text Available We used complex mathematical modeling, multivariate statistical analysis, and fuzzy sets to analyze the financial and economic state of crop production in the Russian regions. We developed a system of indicators reflecting the state of the agricultural sector in a region, based on the results of correlation, factor, and cluster analysis and on statistics from the Federal State Statistics Service. We performed a clustering analysis to divide the regions of Russia into five groups according to the selected factors. Qualitative and quantitative characteristics of each cluster were obtained.

  12. Ecosystem health pattern analysis of urban clusters based on emergy synthesis: Results and implication for management

    International Nuclear Information System (INIS)

    Su, Meirong; Fath, Brian D.; Yang, Zhifeng; Chen, Bin; Liu, Gengyuan

    2013-01-01

    The evaluation of ecosystem health in urban clusters will help establish effective management that promotes sustainable regional development. To standardize the application of emergy synthesis and set pair analysis (EM–SPA) in ecosystem health assessment, a procedure for using EM–SPA models was established in this paper by combining the ability of emergy synthesis to reflect health status from a biophysical perspective with the ability of set pair analysis to describe extensive relationships among different variables. Based on the EM–SPA model, the relative health levels of selected urban clusters and their related ecosystem health patterns were characterized. The health states of three typical Chinese urban clusters – Jing-Jin-Tang, Yangtze River Delta, and Pearl River Delta – were investigated using the model. The results showed that the health status of the Pearl River Delta was relatively good; the health for the Yangtze River Delta was poor. As for the specific health characteristics, the Pearl River Delta and Yangtze River Delta urban clusters were relatively strong in Vigor, Resilience, and Urban ecosystem service function maintenance, while the Jing-Jin-Tang was relatively strong in organizational structure and environmental impact. Guidelines for managing these different urban clusters were put forward based on the analysis of the results of this study. - Highlights: • The use of integrated emergy synthesis and set pair analysis model was standardized. • The integrated model was applied on the scale of an urban cluster. • Health patterns of different urban clusters were compared. • Policy suggestions were provided based on the health pattern analysis

  13. Differences Between Ward's and UPGMA Methods of Cluster Analysis: Implications for School Psychology.

    Science.gov (United States)

    Hale, Robert L.; Dougherty, Donna

    1988-01-01

    Compared the efficacy of two methods of cluster analysis, the unweighted pair-groups method using arithmetic averages (UPGMA) and Ward's method, for students grouped on intelligence, achievement, and social adjustment by both clustering methods. Found UPGMA more efficacious based on output, on cophenetic correlation coefficients generated by each…
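
    A minimal sketch contrasting the two linkage rules named above, assuming hypothetical standardised student scores; in SciPy, UPGMA corresponds to 'average' linkage, and the cophenetic correlation coefficient mentioned in the abstract can be computed for each solution.

      import numpy as np
      from scipy.cluster.hierarchy import linkage, cophenet
      from scipy.spatial.distance import pdist

      rng = np.random.default_rng(4)
      # Hypothetical standardised scores: intelligence, achievement, social adjustment.
      scores = rng.normal(size=(60, 3))

      d = pdist(scores)
      for method in ("average", "ward"):      # "average" == UPGMA
          Z = linkage(d, method=method)
          c, _ = cophenet(Z, d)
          print(method, round(c, 3))          # cophenetic correlation coefficient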

  14. Identifying At-Risk Students in General Chemistry via Cluster Analysis of Affective Characteristics

    Science.gov (United States)

    Chan, Julia Y. K.; Bauer, Christopher F.

    2014-01-01

    The purpose of this study is to identify academically at-risk students in first-semester general chemistry using affective characteristics via cluster analysis. Through the clustering of six preselected affective variables, three distinct affective groups were identified: low (at-risk), medium, and high. Students in the low affective group…

  15. Approximate fuzzy C-means (AFCM) cluster analysis of medical magnetic resonance image (MRI) data

    International Nuclear Information System (INIS)

    DelaPaz, R.L.; Chang, P.J.; Bernstein, R.; Dave, J.V.

    1987-01-01

    The authors describe the application of an approximate fuzzy C-means (AFCM) clustering algorithm as a data dimension reduction approach to medical magnetic resonance images (MRI). Image data consisted of one T1-weighted, two T2-weighted, and one T2*-weighted (magnetic susceptibility) image for each cranial study and a matrix of 10 images generated from 10 combinations of TE and TR for each body lymphoma study. All images were obtained with a 1.5 Tesla imaging system (GE Signa). Analyses were performed on over 100 MR image sets with a variety of pathologies. The cluster analysis was operated in an unsupervised mode and computational overhead was minimized by utilizing a table look-up approach without adversely affecting accuracy. Image data were first segmented into 2 coarse clusters, each of which was then subdivided into 16 fine clusters. The final tissue classifications were presented as color-coded anatomically-mapped images and as two and three dimensional displays of cluster center data in selected feature space (minimum spanning tree). Fuzzy cluster analysis appears to be a clinically useful dimension reduction technique which results in improved diagnostic specificity of medical magnetic resonance images
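
    A minimal, self-contained fuzzy C-means sketch; the table look-up approximation that makes the authors' AFCM variant fast is not reproduced here, and the two-feature points below merely stand in for multi-echo voxel intensities.

      import numpy as np

      def fuzzy_c_means(X, c=3, m=2.0, n_iter=100, seed=0):
          # Plain fuzzy C-means: returns cluster centers and membership matrix U.
          rng = np.random.default_rng(seed)
          U = rng.random((len(X), c))
          U /= U.sum(axis=1, keepdims=True)              # memberships sum to 1 per point
          for _ in range(n_iter):
              w = U ** m
              centers = (w.T @ X) / w.sum(axis=0)[:, None]
              d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
              U = 1.0 / (d ** (2 / (m - 1)))
              U /= U.sum(axis=1, keepdims=True)
          return centers, U

      rng = np.random.default_rng(5)
      # Hypothetical voxel feature vectors (e.g., T1- and T2-weighted intensities).
      X = np.vstack([rng.normal(0, 0.1, (200, 2)),
                     rng.normal(1, 0.1, (200, 2)),
                     rng.normal([0, 1], 0.1, (200, 2))])
      centers, U = fuzzy_c_means(X, c=3)
      print(centers.round(2))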

  16. Applications of Cluster Analysis to the Creation of Perfectionism Profiles: A Comparison of two Clustering Approaches

    Directory of Open Access Journals (Sweden)

    Jocelyn H Bolin

    2014-04-01

    Full Text Available Although traditional clustering methods (e.g., K-means) have been shown to be useful in the social sciences it is often difficult for such methods to handle situations where clusters in the population overlap or are ambiguous. Fuzzy clustering, a method already recognized in many disciplines, provides a more flexible alternative to these traditional clustering methods. Fuzzy clustering differs from other traditional clustering methods in that it allows for a case to belong to multiple clusters simultaneously. Unfortunately, fuzzy clustering techniques remain relatively unused in the social and behavioral sciences. The purpose of this paper is to introduce fuzzy clustering to these audiences who are currently relatively unfamiliar with the technique. In order to demonstrate the advantages associated with this method, cluster solutions of a common perfectionism measure were created using both fuzzy clustering and K-means clustering, and the results compared. Results of these analyses reveal that different cluster solutions are found by the two methods, and the similarity between the different clustering solutions depends on the amount of cluster overlap allowed for in fuzzy clustering.

  17. Applications of cluster analysis to the creation of perfectionism profiles: a comparison of two clustering approaches.

    Science.gov (United States)

    Bolin, Jocelyn H; Edwards, Julianne M; Finch, W Holmes; Cassady, Jerrell C

    2014-01-01

    Although traditional clustering methods (e.g., K-means) have been shown to be useful in the social sciences it is often difficult for such methods to handle situations where clusters in the population overlap or are ambiguous. Fuzzy clustering, a method already recognized in many disciplines, provides a more flexible alternative to these traditional clustering methods. Fuzzy clustering differs from other traditional clustering methods in that it allows for a case to belong to multiple clusters simultaneously. Unfortunately, fuzzy clustering techniques remain relatively unused in the social and behavioral sciences. The purpose of this paper is to introduce fuzzy clustering to these audiences who are currently relatively unfamiliar with the technique. In order to demonstrate the advantages associated with this method, cluster solutions of a common perfectionism measure were created using both fuzzy clustering and K-means clustering, and the results compared. Results of these analyses reveal that different cluster solutions are found by the two methods, and the similarity between the different clustering solutions depends on the amount of cluster overlap allowed for in fuzzy clustering.

  18. Analysis of risk factors for cluster behavior of dental implant failures.

    Science.gov (United States)

    Chrcanovic, Bruno Ramos; Kisch, Jenö; Albrektsson, Tomas; Wennerberg, Ann

    2017-08-01

    Some studies have indicated that implant failures are commonly concentrated in a few patients. The aim was to identify and analyze cluster behavior of dental implant failures among subjects of a retrospective study. This retrospective study included only patients receiving at least three implants. Patients presenting at least three implant failures were classified as presenting cluster behavior. Univariate and multivariate logistic regression models and generalized estimating equations analysis evaluated the effect of explanatory variables on the cluster behavior. There were 1406 patients with three or more implants (8337 implants, 592 failures). Sixty-seven (4.77%) patients presented cluster behavior, accounting for 56.8% of all implant failures. The intake of antidepressants and bruxism were identified as potential negative factors exerting a statistically significant influence on cluster behavior at the patient level. The negative factors at the implant level were turned implants, short implants, poor bone quality, age of the patient, the intake of medicaments to reduce gastric acid production, smoking, and bruxism. A cluster pattern among patients with implant failure is highly probable. Factors of interest as predictors for implant failures could be a number of systemic and local factors, although a direct causal relationship cannot be ascertained. © 2017 Wiley Periodicals, Inc.

  19. Analysis of genetic association using hierarchical clustering and cluster validation indices.

    Science.gov (United States)

    Pagnuco, Inti A; Pastore, Juan I; Abras, Guillermo; Brun, Marcel; Ballarin, Virginia L

    2017-10-01

    It is usually assumed that co-expressed genes suggest co-regulation in the underlying regulatory network. Determining sets of co-expressed genes, based on some criterion of similarity, is therefore an important task. This task is usually performed by clustering algorithms, where the genes are clustered into meaningful groups based on their expression values in a set of experiments. In this work, we propose a method to find sets of co-expressed genes based on cluster validation indices as a measure of similarity for individual gene groups, and on a combination of variants of hierarchical clustering to generate the candidate groups. We evaluated its ability to retrieve significant sets on simulated correlated data and on real genomics data, where the performance is measured by its ability to detect co-regulated sets against a full search. Additionally, we analyzed the quality of the best-ranked groups using an online bioinformatics tool that provides network information for the selected genes. Copyright © 2017 Elsevier Inc. All rights reserved.
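
    A minimal sketch of the general idea of generating candidate groups by hierarchical clustering and scoring them with a validation index, using the silhouette index as an assumed stand-in for the indices used by the authors and a synthetic expression matrix.

      import numpy as np
      from scipy.cluster.hierarchy import linkage, fcluster
      from scipy.spatial.distance import pdist
      from sklearn.metrics import silhouette_score

      rng = np.random.default_rng(6)
      # Synthetic expression matrix: 40 genes x 12 experiments, two co-expressed blocks.
      base = rng.normal(size=(2, 12))
      expr = np.vstack([base[0] + rng.normal(0, 0.3, (20, 12)),
                        base[1] + rng.normal(0, 0.3, (20, 12))])

      d = pdist(expr, metric="correlation")
      Z = linkage(d, method="average")

      # Pick the cut whose clusters score best on the validation index.
      best = max(range(2, 8),
                 key=lambda k: silhouette_score(expr, fcluster(Z, k, criterion="maxclust")))
      print("best number of clusters:", best)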

  20. The Logic of Digital Platform Disruption

    DEFF Research Database (Denmark)

    Kazan, Erol; Tan, Chee-Wee; Lim, Eric T. K.

    Digital platforms are disruptive IT artifacts, because they facilitate the quick release of innovative platform derivatives from third parties (e.g., apps). This study endeavours to unravel the disruptive potential caused by distinct designs and configurations of digital platforms on market environments. We postulate that the disruptive potential of digital platforms is determined by the degree of alignment among the business, technology and platform profiles. Furthermore, we argue that the design and configuration of the aforementioned three elements dictates the extent to which open innovation is permitted. To shed light on the disruptive potential of digital platforms, we opted for payment platforms as our unit of analysis. Through interviews with experts and payment providers, we seek to gain an in-depth appreciation of how contemporary digital payment platforms are designed and configured...

  1. Tracking Undergraduate Student Achievement in a First-Year Physiology Course Using a Cluster Analysis Approach

    Science.gov (United States)

    Brown, S. J.; White, S.; Power, N.

    2015-01-01

    A cluster analysis data classification technique was used on assessment scores from 157 undergraduate nursing students who passed 2 successive compulsory courses in human anatomy and physiology. Student scores in five summative assessment tasks, taken in each of the courses, were used as inputs for a cluster analysis procedure. We aimed to group…

  2. CLUSTERING ANALYSIS OF OFFICER'S BEHAVIOURS IN LONDON POLICE FOOT PATROL ACTIVITIES

    Directory of Open Access Journals (Sweden)

    J. Shen

    2015-07-01

    Full Text Available In this short paper we aim at presenting a framework for the conceptual representation and clustering analysis of police officers' patrol patterns obtained from mining their raw movement trajectory data. This has been achieved by a model developed to account for the spatio-temporal dynamics of human movements by incorporating both the behaviour features of the travellers and the semantic meaning of the environment they are moving in. Hence, the similarity metric of traveller behaviours is jointly defined according to the stay-time allocation in each spatio-temporal region of interest (ST-ROI) to support clustering analysis of patrol behaviours. The proposed framework enables the analysis of behaviour and preferences at a higher level based on raw movement trajectories. The model is first applied to police patrol data provided by the Metropolitan Police and will be tested on other types of datasets afterwards.

  3. Melodic pattern discovery by structural analysis via wavelets and clustering techniques

    DEFF Research Database (Denmark)

    Velarde, Gissel; Meredith, David

    We present an automatic method to support melodic pattern discovery through structural analysis of symbolic representations by means of wavelet analysis and clustering techniques. In previous work, we used the method to recognize the parent works of melodic segments, or to classify tunes into tune families. We use k-means to cluster melodic segments into groups of measured similarity and obtain a ranking of the most prototypical melodic segments or patterns and their occurrences. We test the method on the JKU Patterns Development Database and evaluate it based on the ground truth defined by the MIREX 2013 Discovery of Repeated Themes & Sections task. We compare the results of our method to the output of geometric approaches. Finally, we discuss the relevance of our wavelet-based analysis in relation to structure, pattern discovery, similarity and variation, and comment on the considerations of the method when used...

  4. Feasibility study of BES data off-line processing and D/Ds physics analysis on a PC/Linux platform

    International Nuclear Information System (INIS)

    Rong Gang; He Kanglin; Heng Yuekun; Zhang Chun; Liu Huaimin; Cheng Baosen; Yan Wuguang; Mai Jimao; Zhao Haiwen

    2000-01-01

    The authors report a feasibility study of BES data off-line processing (BES data off-line reconstruction and Monte Carlo simulation) and D/Ds physics analysis on a PC/Linux platform. The authors compared the results obtained on PC/Linux with those from an HP/UNIX workstation. It shows that a PC/Linux platform can do BES data off-line analysis as well as the HP/UNIX workstation, and is much more powerful and economical.

  5. Data Clustering

    Science.gov (United States)

    Wagstaff, Kiri L.

    2012-03-01

    On obtaining a new data set, the researcher is immediately faced with the challenge of obtaining a high-level understanding from the observations. What does a typical item look like? What are the dominant trends? How many distinct groups are included in the data set, and how is each one characterized? Which observable values are common, and which rarely occur? Which items stand out as anomalies or outliers from the rest of the data? This challenge is exacerbated by the steady growth in data set size [11] as new instruments push into new frontiers of parameter space, via improvements in temporal, spatial, and spectral resolution, or by the desire to "fuse" observations from different modalities and instruments into a larger-picture understanding of the same underlying phenomenon. Data clustering algorithms provide a variety of solutions for this task. They can generate summaries, locate outliers, compress data, identify dense or sparse regions of feature space, and build data models. It is useful to note up front that "clusters" in this context refer to groups of items within some descriptive feature space, not (necessarily) to "galaxy clusters" which are dense regions in physical space. The goal of this chapter is to survey a variety of data clustering methods, with an eye toward their applicability to astronomical data analysis. In addition to improving the individual researcher’s understanding of a given data set, clustering has led directly to scientific advances, such as the discovery of new subclasses of stars [14] and gamma-ray bursts (GRBs) [38]. All clustering algorithms seek to identify groups within a data set that reflect some observed, quantifiable structure. Clustering is traditionally an unsupervised approach to data analysis, in the sense that it operates without any direct guidance about which items should be assigned to which clusters. There has been a recent trend in the clustering literature toward supporting semisupervised or constrained

  6. Cluster cosmological analysis with X ray instrumental observables: introduction and testing of AsPIX method

    International Nuclear Information System (INIS)

    Valotti, Andrea

    2016-01-01

    Cosmology is one of the fundamental pillars of astrophysics; as such, it contains many unsolved puzzles. To investigate some of those puzzles, we analyze X-ray surveys of galaxy clusters. These surveys are possible thanks to the bremsstrahlung emission of the intra-cluster medium. The simultaneous fit of cluster counts as a function of mass and distance provides an independent measure of cosmological parameters such as Ω_m, σ_8, and the dark energy equation of state w_0. A novel approach to cosmological analysis using galaxy cluster data, called top-down, was developed in N. Clerc et al. (2012). This top-down approach is based purely on instrumental observables that are considered in a two-dimensional X-ray color-magnitude diagram. The method self-consistently includes selection effects and scaling relationships. It also provides a means of bypassing the computation of individual cluster masses. My work presents an extension of the top-down method by introducing the apparent size of the cluster, creating a three-dimensional X-ray cluster diagram. The size of a cluster is sensitive to both the cluster mass and its angular diameter distance, so it must also be included in the assessment of selection effects. The performance of this new method is investigated using a Fisher analysis. In parallel, I have studied the effects of the intrinsic scatter in the cluster size scaling relation on the sample selection as well as on the obtained cosmological parameters. To validate the method, I estimate uncertainties on cosmological parameters with an MCMC method and an Amoeba minimization routine, using two simulated XMM surveys of increasing complexity. The first simulated survey is a set of toy catalogues of 100 and 10,000 deg^2, whereas the second is a 1000 deg^2 catalogue that was generated using an Aardvark semi-analytical N-body simulation. This comparison corroborates the conclusions of the Fisher analysis. In conclusion, I find that a cluster diagram that accounts

  7. HORIZONTAL BRANCH MORPHOLOGY OF GLOBULAR CLUSTERS: A MULTIVARIATE STATISTICAL ANALYSIS

    International Nuclear Information System (INIS)

    Jogesh Babu, G.; Chattopadhyay, Tanuka; Chattopadhyay, Asis Kumar; Mondal, Saptarshi

    2009-01-01

    The proper interpretation of horizontal branch (HB) morphology is crucial to the understanding of the formation history of stellar populations. In the present study a multivariate analysis (principal component analysis) is used for the selection of an appropriate HB morphology parameter, which, in our case, is the logarithm of the effective temperature extent of the HB (log T_eff,HB). This parameter is then expressed in terms of the most significant observed independent parameters of Galactic globular clusters (GGCs), separately for coherent groups obtained in a previous work, through a stepwise multiple regression technique. It is found that metallicity ([Fe/H]), central surface brightness (μ_v), and core radius (r_c) are the significant parameters explaining most of the variation in HB morphology (multiple R^2 ∼ 0.86) for GGCs belonging to the bulge/disk, while metallicity ([Fe/H]) and absolute magnitude (M_v) are responsible for GGCs belonging to the inner halo (multiple R^2 ∼ 0.52). The robustness is tested by taking 1000 bootstrap samples. A cluster analysis is performed for the red giant branch (RGB) stars of the GGCs belonging to the Galactic inner halo (Cluster 2). A multi-episodic star formation is preferred for RGB stars of GGCs belonging to this group. It supports the asymptotic giant branch (AGB) model in three episodes instead of two, as suggested by Carretta et al. for halo GGCs, while the AGB model is suggested to be revisited for bulge/disk GGCs.

  8. Sirenomelia in Argentina: Prevalence, geographic clusters and temporal trends analysis.

    Science.gov (United States)

    Groisman, Boris; Liascovich, Rosa; Gili, Juan Antonio; Barbero, Pablo; Bidondo, María Paz

    2016-07-01

    Sirenomelia is a severe malformation of the lower body characterized by a single medial lower limb and a variable combination of visceral abnormalities. Given that sirenomelia is a very rare birth defect, epidemiological studies are scarce. The aim of this study is to evaluate the prevalence, geographic clusters and time trends of sirenomelia in Argentina, using data from the National Network of Congenital Anomalies of Argentina (RENAC) from November 2009 until December 2014. This is a descriptive study using data from the RENAC, a hospital-based surveillance system for newborns affected with major morphological congenital anomalies. We calculated sirenomelia prevalence throughout the period, searched for geographical clusters, and evaluated time trends. The prevalence of confirmed cases of sirenomelia throughout the period was 2.35 per 100,000 births. Cluster analysis showed no statistically significant geographical aggregates. Time-trend analysis showed that the prevalence was higher in the years 2009 to 2010. The observed prevalence was higher than that observed in previous epidemiological studies in other geographic regions. We observed a likely real increase in the initial period of our study. We used strict diagnostic criteria, excluding cases that only had a clinical diagnosis of sirenomelia; therefore, the real prevalence could be even higher. This study did not show any geographic clusters. Because the etiology of sirenomelia has not yet been established, studies of the epidemiological features of this defect may contribute to defining its causes. Birth Defects Research (Part A) 106:604-611, 2016. © 2016 Wiley Periodicals, Inc.

  9. Using Cluster Analysis and ICP-MS to Identify Groups of Ecstasy Tablets in Sao Paulo State, Brazil.

    Science.gov (United States)

    Maione, Camila; de Oliveira Souza, Vanessa Cristina; Togni, Loraine Rezende; da Costa, José Luiz; Campiglia, Andres Dobal; Barbosa, Fernando; Barbosa, Rommel Melgaço

    2017-11-01

    The variations found in the elemental composition of ecstasy samples result in spectral profiles with useful information for data analysis, and cluster analysis of these profiles can help uncover different categories of the drug. We provide a cluster analysis of ecstasy tablets based on their elemental composition. Twenty-five elements were determined by ICP-MS in tablets apprehended by Sao Paulo's State Police, Brazil. We employ the K-means clustering algorithm along with a C4.5 decision tree to help us interpret the clustering results. We found that two clusters best fit the data, which may correspond to the approximate number of sources of the drug supplying the cities where the seizures occurred. The C4.5 model was capable of differentiating the ecstasy samples from the two clusters with high prediction accuracy using leave-one-out cross-validation. The model used only the Nd, Ni, and Pb concentration values in the classification of the samples. © 2017 American Academy of Forensic Sciences.
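
    A minimal sketch of the described workflow (K-means to find the groups, then a decision tree evaluated with leave-one-out cross-validation), using scikit-learn's CART as a stand-in for C4.5 and a synthetic element-concentration table instead of the ICP-MS data.

      import numpy as np
      from sklearn.cluster import KMeans
      from sklearn.tree import DecisionTreeClassifier
      from sklearn.model_selection import LeaveOneOut, cross_val_score

      rng = np.random.default_rng(7)
      # Synthetic concentration table: 60 tablets x 25 elements.
      X = np.vstack([rng.lognormal(0.0, 0.3, (30, 25)),
                     rng.lognormal(0.6, 0.3, (30, 25))])

      # Unsupervised step: two clusters, as found in the study.
      labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

      # Supervised step: can a shallow tree separate the clusters?
      tree = DecisionTreeClassifier(max_depth=3, random_state=0)
      acc = cross_val_score(tree, X, labels, cv=LeaveOneOut()).mean()
      print("LOO accuracy:", round(acc, 3))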

  10. Cytobank: providing an analytics platform for community cytometry data analysis and collaboration.

    Science.gov (United States)

    Chen, Tiffany J; Kotecha, Nikesh

    2014-01-01

    Cytometry is used extensively in clinical and laboratory settings to diagnose and track cell subsets in blood and tissue. High-throughput, single-cell approaches leveraging cytometry are developed and applied in the computational and systems biology communities by researchers, who seek to improve the diagnosis of human diseases, map the structures of cell signaling networks, and identify new cell types. Data analysis and management present a bottleneck in the flow of knowledge from bench to clinic. Multi-parameter flow and mass cytometry enable identification of signaling profiles of patient cell samples. Currently, this process is manual, requiring hours of work to summarize multi-dimensional data and translate these data for input into other analysis programs. In addition, the increase in the number and size of collaborative cytometry studies as well as the computational complexity of analytical tools require the ability to assemble sufficient and appropriately configured computing capacity on demand. There is a critical need for platforms that can be used by both clinical and basic researchers who routinely rely on cytometry. Recent advances provide a unique opportunity to facilitate collaboration and analysis and management of cytometry data. Specifically, advances in cloud computing and virtualization are enabling efficient use of large computing resources for analysis and backup. An example is Cytobank, a platform that allows researchers to annotate, analyze, and share results along with the underlying single-cell data.

  11. Social Media Use and Depression and Anxiety Symptoms: A Cluster Analysis.

    Science.gov (United States)

    Shensa, Ariel; Sidani, Jaime E; Dew, Mary Amanda; Escobar-Viera, César G; Primack, Brian A

    2018-03-01

    Individuals use social media with varying quantity, emotional, and behavioral attachment that may have differential associations with mental health outcomes. In this study, we sought to identify distinct patterns of social media use (SMU) and to assess associations between those patterns and depression and anxiety symptoms. In October 2014, a nationally representative sample of 1730 US adults ages 19 to 32 completed an online survey. Cluster analysis was used to identify patterns of SMU. Depression and anxiety were measured using respective 4-item Patient-Reported Outcome Measurement Information System (PROMIS) scales. Multivariable logistic regression models were used to assess associations between cluster membership and depression and anxiety. Cluster analysis yielded a 5-cluster solution. Participants were characterized as "Wired," "Connected," "Diffuse Dabblers," "Concentrated Dabblers," and "Unplugged." Membership in 2 clusters - "Wired" and "Connected" - increased the odds of elevated depression and anxiety symptoms (AOR = 2.7, 95% CI = 1.5-4.7; AOR = 3.7, 95% CI = 2.1-6.5, respectively, and AOR = 2.0, 95% CI = 1.3-3.2; AOR = 2.0, 95% CI = 1.3-3.1, respectively). SMU pattern characterization of a large population suggests 2 patterns are associated with risk for depression and anxiety. Developing educational interventions that address use patterns rather than single aspects of SMU (eg, quantity) would likely be useful.

  12. SNPMClust: Bivariate Gaussian Genotype Clustering and Calling for Illumina Microarrays

    Directory of Open Access Journals (Sweden)

    Stephen W. Erickson

    2016-07-01

    Full Text Available SNPMClust is an R package for genotype clustering and calling with Illumina microarrays. It was originally developed for studies using the GoldenGate custom genotyping platform but can be used with other Illumina platforms, including Infinium BeadChip. The algorithm first rescales the fluorescent signal intensity data, adds empirically derived pseudo-data to minor allele genotype clusters, then uses the package mclust for bivariate Gaussian model fitting. We compared the accuracy and sensitivity of SNPMClust to that of GenCall, Illumina's proprietary algorithm, on a data set of 94 whole-genome amplified buccal (cheek swab) DNA samples. These samples were genotyped on a custom panel which included 1064 SNPs for which the true genotype was known with high confidence. SNPMClust produced uniformly lower false call rates over a wide range of overall call rates.
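
    SNPMClust itself is an R package built on mclust; the sketch below only illustrates the underlying idea of fitting bivariate Gaussians to two-channel intensities and deriving genotype calls with per-sample confidences, using scikit-learn's GaussianMixture as an assumed analogue and synthetic data.

      import numpy as np
      from sklearn.mixture import GaussianMixture

      rng = np.random.default_rng(8)
      # Synthetic two-channel intensities for one SNP: AA, AB, BB clusters.
      aa = rng.normal([1.0, 0.1], 0.05, (60, 2))
      ab = rng.normal([0.6, 0.6], 0.05, (30, 2))
      bb = rng.normal([0.1, 1.0], 0.05, (10, 2))
      X = np.vstack([aa, ab, bb])

      gm = GaussianMixture(n_components=3, covariance_type="full",
                           random_state=0).fit(X)
      calls = gm.predict(X)
      conf = gm.predict_proba(X).max(axis=1)      # per-sample call confidence
      print(np.bincount(calls), conf.min().round(3))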

  13. A microwell cell culture platform for the aggregation of pancreatic β-cells.

    Science.gov (United States)

    Bernard, Abigail B; Lin, Chien-Chi; Anseth, Kristi S

    2012-08-01

    Cell-cell contact between pancreatic β-cells is important for maintaining survival and normal insulin secretion. Various techniques have been developed to promote cell-cell contact between β-cells, but a simple yet robust method that affords precise control over three-dimensional (3D) β-cell cluster size has not been demonstrated. To address this need, we developed a poly(ethylene glycol) (PEG) hydrogel microwell platform using photolithography. This microwell cell-culture platform promotes the formation of 3D β-cell aggregates of defined sizes from 25 to 210 μm in diameter. Using this platform, mouse insulinoma 6 (MIN6) β-cells formed aggregates with cell-cell adherin junctions. These naturally formed cell aggregates with controllable sizes can be removed from the microwells for macroencapsulation, implantation, or other biological assays. When removed and subsequently encapsulated in PEG hydrogels, the aggregated cell clusters demonstrated improved cellular viability (>90%) over 7 days in culture, while the β-cells encapsulated as single cells maintained only 20% viability. Aggregated MIN6 cells also exhibited more than fourfold higher insulin secretion in response to a glucose challenge compared with encapsulated single β-cells. Further, the cell aggregates stained positively for E-cadherin, indicative of the formation of cell junctions. Using this hydrogel microwell cell-culture method, viable and functional β-cell aggregates of specific sizes were created, providing a platform from which other biologically relevant questions may be answered.

  14. clusterMaker: a multi-algorithm clustering plugin for Cytoscape

    Directory of Open Access Journals (Sweden)

    Morris John H

    2011-11-01

    Full Text Available Background: In the post-genomic era, the rapid increase in high-throughput data calls for computational tools capable of integrating data of diverse types and facilitating recognition of biologically meaningful patterns within them. For example, protein-protein interaction data sets have been clustered to identify stable complexes, but scientists lack easily accessible tools to facilitate combined analyses of multiple data sets from different types of experiments. Here we present clusterMaker, a Cytoscape plugin that implements several clustering algorithms and provides network, dendrogram, and heat map views of the results. The Cytoscape network is linked to all of the other views, so that a selection in one is immediately reflected in the others. clusterMaker is the first Cytoscape plugin to implement such a wide variety of clustering algorithms and visualizations, including the only implementations of hierarchical clustering, dendrogram plus heat map visualization (tree view), k-means, k-medoid, SCPS, AutoSOME, and native (Java) MCL. Results: Results are presented in the form of three scenarios of use: analysis of protein expression data using a recently published mouse interactome and a mouse microarray data set of nearly one hundred diverse cell/tissue types; the identification of protein complexes in the yeast Saccharomyces cerevisiae; and the cluster analysis of the vicinal oxygen chelate (VOC) enzyme superfamily. For scenario one, we explore functionally enriched mouse interactomes specific to particular cellular phenotypes and apply fuzzy clustering. For scenario two, we explore the prefoldin complex in detail using both physical and genetic interaction clusters. For scenario three, we explore the possible annotation of a protein as a methylmalonyl-CoA epimerase within the VOC superfamily. Cytoscape session files for all three scenarios are provided in the Additional Files section. Conclusions: The Cytoscape plugin cluster

  15. Integrating PROOF Analysis in Cloud and Batch Clusters

    International Nuclear Information System (INIS)

    Rodríguez-Marrero, Ana Y; Fernández-del-Castillo, Enol; López García, Álvaro; Marco de Lucas, Jesús; Matorras Weinig, Francisco; González Caballero, Isidro; Cuesta Noriega, Alberto

    2012-01-01

    High Energy Physics (HEP) analyses are becoming more complex and demanding due to the large amount of data collected by the current experiments. The Parallel ROOT Facility (PROOF) provides researchers with an interactive tool to speed up the analysis of huge volumes of data by exploiting parallel processing on both multicore machines and computing clusters. The typical PROOF deployment scenario is a permanent set of cores configured to run the PROOF daemons. However, this approach is incapable of adapting to the dynamic nature of interactive usage. Several initiatives seek to improve the use of computing resources by integrating PROOF with a batch system, such as PROOF on Demand (PoD) or PROOF Cluster. These solutions are currently in production at Universidad de Oviedo and IFCA and are positively evaluated by users. Although they are able to adapt to the computing needs of users, they must comply with the specific configuration, OS and software installed at the batch nodes. Furthermore, they share the machines with other workloads, which may cause disruptions in the interactive service for users. These limitations make PROOF a typical use case for cloud computing. In this work we take advantage of the Cloud Infrastructure at IFCA in order to provide a dynamic PROOF environment where users can control the software configuration of the machines. The PROOF Analysis Framework (PAF) facilitates the development of new analyses and offers transparent access to PROOF resources. Several performance measurements are presented for the different scenarios (PoD, SGE and Cloud), showing a speed improvement closely correlated with the number of cores used.

  16. Segmentation of Residential Gas Consumers Using Clustering Analysis

    Directory of Open Access Journals (Sweden)

    Marta P. Fernandes

    2017-12-01

    Full Text Available The growing environmental concerns and liberalization of energy markets have resulted in an increased competition between utilities and a strong focus on efficiency. To develop new energy efficiency measures and optimize operations, utilities seek new market-related insights and customer engagement strategies. This paper proposes a clustering-based methodology to define the segmentation of residential gas consumers. The segments of gas consumers are obtained through a detailed clustering analysis using smart metering data. Insights are derived from the segmentation, where the segments result from the clustering process and are characterized based on the consumption profiles, as well as according to information regarding consumers’ socio-economic and household key features. The study is based on a sample of approximately one thousand households over one year. The representative load profiles of consumers are essentially characterized by two evident consumption peaks, one in the morning and the other in the evening, and an off-peak consumption. Significant insights can be derived from this methodology regarding typical consumption curves of the different segments of consumers in the population. This knowledge can assist energy utilities and policy makers in the development of consumer engagement strategies, demand forecasting tools and in the design of more sophisticated tariff systems.
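
    As a rough illustration of the clustering step behind such a segmentation (not the authors' actual pipeline), the sketch below groups normalised daily gas-consumption profiles with k-means; the file name, column layout and the choice of four segments are assumptions made for the example.

    ```python
    # Hypothetical sketch: segmenting residential gas consumers by clustering
    # normalised daily consumption profiles with k-means. The input file and
    # the number of segments are illustrative assumptions.
    import pandas as pd
    from sklearn.cluster import KMeans
    from sklearn.preprocessing import normalize

    # one row per household, 24 hourly consumption values averaged over a year
    profiles = pd.read_csv("gas_smart_meter_profiles.csv", index_col="household_id")

    X = normalize(profiles.values, norm="l1")   # compare profile shapes, not absolute volumes

    kmeans = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X)

    # representative load profile of each segment (cluster centroid)
    segments = pd.DataFrame(kmeans.cluster_centers_, columns=profiles.columns)
    print(segments.round(3))
    print(pd.Series(kmeans.labels_).value_counts())  # segment sizes
    ```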

  17. Comparing 3 dietary pattern methods--cluster analysis, factor analysis, and index analysis--With colorectal cancer risk: The NIH-AARP Diet and Health Study.

    Science.gov (United States)

    Reedy, Jill; Wirfält, Elisabet; Flood, Andrew; Mitrou, Panagiota N; Krebs-Smith, Susan M; Kipnis, Victor; Midthune, Douglas; Leitzmann, Michael; Hollenbeck, Albert; Schatzkin, Arthur; Subar, Amy F

    2010-02-15

    The authors compared dietary pattern methods-cluster analysis, factor analysis, and index analysis-with colorectal cancer risk in the National Institutes of Health (NIH)-AARP Diet and Health Study (n = 492,306). Data from a 124-item food frequency questionnaire (1995-1996) were used to identify 4 clusters for men (3 clusters for women), 3 factors, and 4 indexes. Comparisons were made with adjusted relative risks and 95% confidence intervals, distributions of individuals in clusters by quintile of factor and index scores, and health behavior characteristics. During 5 years of follow-up through 2000, 3,110 colorectal cancer cases were ascertained. In men, the vegetables and fruits cluster, the fruits and vegetables factor, the fat-reduced/diet foods factor, and all indexes were associated with reduced risk; the meat and potatoes factor was associated with increased risk. In women, reduced risk was found with the Healthy Eating Index-2005 and increased risk with the meat and potatoes factor. For men, beneficial health characteristics were seen with all fruit/vegetable patterns, diet foods patterns, and indexes, while poorer health characteristics were found with meat patterns. For women, findings were similar except that poorer health characteristics were seen with diet foods patterns. Similarities were found across methods, suggesting basic qualities of healthy diets. Nonetheless, findings vary because each method answers a different question.

  18. Characterization of population exposure to organochlorines: A cluster analysis application

    NARCIS (Netherlands)

    R.M. Guimarães (Raphael Mendonça); S. Asmus (Sven); A. Burdorf (Alex)

    2013-01-01

    This study aimed to show the results of a cluster analysis application in the characterization of population exposure to organochlorines through variables related to time and exposure dose. Characteristics of 354 subjects in a population exposed to organochlorine pesticide residues

  19. Security analysis of Aspiro Music Platform, a digital music streaming service

    OpenAIRE

    Kachanovskiy, Roman

    2010-01-01

    The report is mainly based on recommendations given by the National Institute of Standards and Technology in special publication 800-30, "Risk Management Guide for Information Technology Systems". The risk analysis presented in this report emphasizes a qualitative approach. Firstly, the security requirements for Aspiro Music Platform were identified and classified by the level of importance. Secondly, potential threats to the system were discussed. In the next step the potential system ...

  20. Building the Future Air Force: Analysis of Platform versus Weapon Development

    Science.gov (United States)

    2016-05-26

    fight a limited war in Southeast Asia. Despite its relatively poor record in aerial combat, the F-105 conducted most of the difficult work in the... Specifically, the analysis assesses platform survivability, flexibility, and payload; and weapon yield and precision. The monograph reveals the lag

  1. Cluster Analysis of the International Stellarator Confinement Database

    International Nuclear Information System (INIS)

    Kus, A.; Dinklage, A.; Preuss, R.; Ascasibar, E.; Harris, J. H.; Okamura, S.; Yamada, H.; Sano, F.; Stroth, U.; Talmadge, J.

    2008-01-01

    The heterogeneous structure of the collected data is one of the problems that occur during derivation of scalings for the energy confinement time, and its analysis turns out to be a broad and complicated matter. The International Stellarator Confinement Database [1], ISCDB for short, comprises in its latest version 21 a total of 3647 observations from 8 experimental devices, 2067 of which have so far been completed for upcoming analyses. For confinement scaling studies 1933 observations were chosen as the standard dataset. Here we describe a statistical method of cluster analysis for the identification of possible cohesive substructures in ISCDB and present some preliminary results

  2. Paternal age related schizophrenia (PARS): Latent subgroups detected by k-means clustering analysis.

    Science.gov (United States)

    Lee, Hyejoo; Malaspina, Dolores; Ahn, Hongshik; Perrin, Mary; Opler, Mark G; Kleinhaus, Karine; Harlap, Susan; Goetz, Raymond; Antonius, Daniel

    2011-05-01

    Paternal age related schizophrenia (PARS) has been proposed as a subgroup of schizophrenia with distinct etiology, pathophysiology and symptoms. This study uses a k-means clustering analysis approach to generate hypotheses about differences between PARS and other cases of schizophrenia. We studied PARS (operationally defined as not having any family history of schizophrenia among first and second-degree relatives and fathers' age at birth ≥ 35 years) in a series of schizophrenia cases recruited from a research unit. Data were available on demographic variables, symptoms (Positive and Negative Syndrome Scale; PANSS), cognitive tests (Wechsler Adult Intelligence Scale-Revised; WAIS-R) and olfaction (University of Pennsylvania Smell Identification Test; UPSIT). We conducted a series of k-means clustering analyses to identify clusters of cases containing high concentrations of PARS. Two analyses generated clusters with high concentrations of PARS cases. The first analysis (N=136; PARS=34) revealed a cluster containing 83% PARS cases, in which the patients showed a significant discrepancy between verbal and performance intelligence. The mean paternal and maternal ages were 41 and 33, respectively. The second analysis (N=123; PARS=30) revealed a cluster containing 71% PARS cases, of which 93% were females; the mean age of onset of psychosis, at 17.2, was significantly early. These results strengthen the evidence that PARS cases differ from other patients with schizophrenia. Hypothesis-generating findings suggest that features of PARS may include a discrepancy between verbal and performance intelligence, and in females, an early age of onset. These findings provide a rationale for separating these phenotypes from others in future clinical, genetic and pathophysiologic studies of schizophrenia and in considering responses to treatment. Copyright © 2011 Elsevier B.V. All rights reserved.

  3. Standardized Effect Size Measures for Mediation Analysis in Cluster-Randomized Trials

    Science.gov (United States)

    Stapleton, Laura M.; Pituch, Keenan A.; Dion, Eric

    2015-01-01

    This article presents 3 standardized effect size measures to use when sharing results of an analysis of mediation of treatment effects for cluster-randomized trials. The authors discuss 3 examples of mediation analysis (upper-level mediation, cross-level mediation, and cross-level mediation with a contextual effect) with demonstration of the…

  4. MANAGING THE DEVELOPMENT OF AGRO-INDUSTRIAL CLUSTERS

    Directory of Open Access Journals (Sweden)

    D. V. Zavyalov

    2018-01-01

    Full Text Available Purpose: the purpose of the research is to design a concept of a management system for agro-industrial clusters as self-organizing systems. The transition to a new technological paradigm has been marked not only by breakthrough solutions in the organization of production of goods, works and services, but also by the emergence of new (in some cases unique) forms of inter-firm cooperation and interaction of economic agents in the real and financial sectors of the economy. The concept of the "digital economy" has become central to economic research and, from a practical point of view, modern digitalization technologies for managing the activities of economic entities form new information and communication platforms for economic and scientific exchange. The penetration of digital technologies into everyday life is one of the characteristic features of the future world. The agro-industrial sector, which is strategically important for ensuring food security and has high export potential, is no exception. The article presents a concept for managing the development of agro-industrial clusters as self-organizing systems capable of integrating the activities of small and medium-sized businesses into the value-added chain on the basis of modern information technologies. The mandatory and supporting tools and mechanisms for implementing the concept, aimed at eliminating existing obstacles to the formation of agro-industrial clusters, are described. Methods: the agro-industrial cluster management model is developed using the methods of economic analysis and synthesis, and functional modelling. Results: a conceptual model of cluster development management is presented that can be used both for nascent clusters and for the development of existing agro-industrial clusters. Conclusions and Relevance: the research identifies the reasons hindering the development of the cluster approach in the agro-industrial sector of the economy. Chief among them are: lack of

  5. State of the art of parallel scientific visualization applications on PC clusters; Etat de l'art des applications de visualisation scientifique paralleles sur grappes de PC

    Energy Technology Data Exchange (ETDEWEB)

    Juliachs, M

    2004-07-01

    In this state of the art on parallel scientific visualization applications on PC clusters, we deal with both surface and volume rendering approaches. We first analyze available PC cluster configurations and existing parallel rendering software components for parallel graphics rendering. CEA/DIF has been studying cluster visualization since 2001. This report is part of a study to set up a new visualization research platform. This platform consisting of an eight-node PC cluster under Linux and a tiled display was installed in collaboration with Versailles-Saint-Quentin University in August 2003. (author)

  6. Homemade Buckeye-Pi: A Learning Many-Node Platform for High-Performance Parallel Computing

    Science.gov (United States)

    Amooie, M. A.; Moortgat, J.

    2017-12-01

    We report on the "Buckeye-Pi" cluster, the supercomputer developed in The Ohio State University School of Earth Sciences from 128 inexpensive Raspberry Pi (RPi) 3 Model B single-board computers. Each RPi is equipped with fast Quad Core 1.2GHz ARMv8 64bit processor, 1GB of RAM, and 32GB microSD card for local storage. Therefore, the cluster has a total RAM of 128GB that is distributed on the individual nodes and a flash capacity of 4TB with 512 processors, while it benefits from low power consumption, easy portability, and low total cost. The cluster uses the Message Passing Interface protocol to manage the communications between each node. These features render our platform the most powerful RPi supercomputer to date and suitable for educational applications in high-performance-computing (HPC) and handling of large datasets. In particular, we use the Buckeye-Pi to implement optimized parallel codes in our in-house simulator for subsurface media flows with the goal of achieving a massively-parallelized scalable code. We present benchmarking results for the computational performance across various number of RPi nodes. We believe our project could inspire scientists and students to consider the proposed unconventional cluster architecture as a mainstream and a feasible learning platform for challenging engineering and scientific problems.
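
    Since the cluster's nodes communicate through the Message Passing Interface, a minimal mpi4py job gives a feel for the kind of distributed workload described above; this is an illustrative sketch, not code from the Buckeye-Pi project, and the script name and rank count in the comments are only examples.

    ```python
    # Illustrative MPI sketch: distribute a numerical integration (estimate of pi)
    # across ranks and reduce the partial results on rank 0.
    # Run with, e.g., `mpirun -n 512 python pi_mpi.py`.
    from mpi4py import MPI
    import numpy as np

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()
    size = comm.Get_size()

    N = 10_000_000                       # total number of quadrature points
    local_n = N // size                  # points handled by this rank
    # midpoint rule on the sub-interval owned by this rank
    x = (np.arange(rank * local_n, (rank + 1) * local_n) + 0.5) / N
    local_sum = np.sum(4.0 / (1.0 + x * x)) / N

    pi_estimate = comm.reduce(local_sum, op=MPI.SUM, root=0)
    if rank == 0:
        print(f"pi ~ {pi_estimate:.8f} using {size} ranks")
    ```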

  7. BioFoV - An open platform for forensic video analysis and biometric data extraction

    DEFF Research Database (Denmark)

    Almeida, Miguel; Correia, Paulo Lobato; Larsen, Peter Kastmand

    2016-01-01

    to tailor-made software, based on state-of-the-art knowledge in fields such as soft biometrics, gait recognition, photogrammetry, etc. This paper proposes an open and extensible platform, BioFoV (Biometric Forensic Video tool), for forensic video analysis and biometric data extraction, aiming to host some of the developments that researchers come up with for solving specific problems but that are often not shared with the community. BioFoV includes a simple-to-use Graphical User Interface (GUI), is implemented with open software that can run on multiple software platforms, and its implementation is publicly available.

  8. piRNA analysis framework from small RNA-Seq data by a novel cluster prediction tool - PILFER.

    Science.gov (United States)

    Ray, Rishav; Pandey, Priyanka

    2017-12-19

    With the increasing number of studies focusing on PIWI-interacting RNAs (piRNAs), it is now pertinent to develop efficient tools dedicated to piRNA analysis. We have developed a novel cluster prediction tool called PILFER (PIrna cLuster FindER), which can accurately predict piRNA clusters from small RNA sequencing data. PILFER is an open-source, easy-to-use tool, and can be executed even on a personal computer with minimal resources. It uses a sliding-window mechanism that integrates the expression of the reads with their spatial information to predict piRNA clusters. We have additionally defined a piRNA analysis pipeline incorporating PILFER to detect and annotate piRNAs and their clusters from raw small RNA sequencing data and implemented it on publicly available data from healthy germline and somatic tissues. We compared PILFER with other existing piRNA cluster prediction tools and found it to be statistically more accurate and superior in several respects, such as higher cluster robustness and greater memory efficiency. Overall, PILFER provides a fast and accurate solution to piRNA cluster prediction. Copyright © 2017 Elsevier Inc. All rights reserved.
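
    The sliding-window idea can be sketched generically as follows; this is a simplified reading of the description above, not the PILFER implementation, and the window size, expression threshold and input format are assumptions.

    ```python
    # Generic sliding-window cluster caller: slide a fixed-size window over sorted
    # read positions on one chromosome, keep windows whose summed read count
    # exceeds a threshold, and merge overlapping windows into clusters.
    from typing import List, Tuple

    def call_clusters(positions: List[int], counts: List[int],
                      window: int = 5000, min_expression: int = 100) -> List[Tuple[int, int, int]]:
        """Return (start, end, expression) tuples; expression may double-count
        reads shared by overlapping windows in this simplified sketch."""
        hits = []
        j = 0
        for i, start in enumerate(positions):
            # advance the right edge of the window
            while j < len(positions) and positions[j] < start + window:
                j += 1
            expr = sum(counts[i:j])
            if expr >= min_expression:
                hits.append((start, start + window, expr))
        # merge overlapping windows into clusters
        clusters = []
        for start, end, expr in hits:
            if clusters and start <= clusters[-1][1]:
                prev = clusters[-1]
                clusters[-1] = (prev[0], max(prev[1], end), prev[2] + expr)
            else:
                clusters.append((start, end, expr))
        return clusters

    print(call_clusters([100, 900, 1200, 40000, 40100], [60, 50, 40, 10, 5], window=2000))
    ```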

  9. Network clustering coefficient approach to DNA sequence analysis

    Energy Technology Data Exchange (ETDEWEB)

    Gerhardt, Guenther J.L. [Universidade Federal do Rio Grande do Sul-Hospital de Clinicas de Porto Alegre, Rua Ramiro Barcelos 2350/sala 2040/90035-003 Porto Alegre (Brazil); Departamento de Fisica e Quimica da Universidade de Caxias do Sul, Rua Francisco Getulio Vargas 1130, 95001-970 Caxias do Sul (Brazil); Lemke, Ney [Programa Interdisciplinar em Computacao Aplicada, Unisinos, Av. Unisinos, 950, 93022-000 Sao Leopoldo, RS (Brazil); Corso, Gilberto [Departamento de Biofisica e Farmacologia, Centro de Biociencias, Universidade Federal do Rio Grande do Norte, Campus Universitario, 59072 970 Natal, RN (Brazil)]. E-mail: corso@dfte.ufrn.br

    2006-05-15

    In this work we propose an alternative DNA sequence analysis tool based on graph theoretical concepts. The methodology investigates the path topology of an organism's genome through a triplet network. In this network, triplets in the DNA sequence are vertices and two vertices are connected if they occur juxtaposed on the genome. We characterize this network topology by measuring the clustering coefficient. We test our methodology against two main biases: the guanine-cytosine (GC) content and the 3-bp (base pair) periodicity of the DNA sequence. We perform the test by constructing random networks with variable GC content and imposed 3-bp periodicity. A test group of some organisms is constructed and we investigate the methodology in the light of the constructed random networks. We conclude that the clustering coefficient is a valuable tool, since it gives information that is not trivially contained in the 3-bp periodicity nor in the variable GC content.
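
    A minimal sketch of the triplet-network construction as read from the abstract (not the authors' code): triplets along a sequence become vertices, juxtaposed triplets become edges, and the average clustering coefficient summarises the graph. Reading the triplets in a non-overlapping frame is an assumption made here.

    ```python
    # Build a triplet network from a DNA string and measure its mean clustering
    # coefficient with networkx.
    import networkx as nx

    def triplet_clustering(seq: str) -> float:
        triplets = [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]  # non-overlapping 3-bp words
        g = nx.Graph()
        g.add_edges_from(zip(triplets[:-1], triplets[1:]))            # edge if juxtaposed on the genome
        return nx.average_clustering(g)

    print(triplet_clustering("ATGGCGTTAATGGCGCCGTTAATG"))
    ```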

  10. Robustness in cluster analysis in the presence of anomalous observations

    NARCIS (Netherlands)

    Zhuk, EE

    Cluster analysis of multivariate observations in the presence of "outliers" (anomalous observations) in a sample is studied. The expected (mean) fraction of erroneous decisions for the decision rule is computed analytically by minimizing the intraclass scatter. A robust decision rule (stable to

  11. Adoption of Mobile Payment Platforms

    DEFF Research Database (Denmark)

    Staykova, Kalina Stefanova; Damsgaard, Jan

    2016-01-01

    Numerous mobile payment solutions, which rely on new disruptive technologies, have been launched on the payment market in recent years. But despite the growing number of mobile payment apps, very few solutions have turned out to be successful, as the majority of them fail to gain a critical mass of users. In this paper, we investigate successful platform adoption strategies by using the Reach and Range Framework for Multi-Sided Platforms as a strategic tool to which mobile payment providers can adhere in order to tackle some of the main challenges they face throughout the evolution of their platforms. The analysis indicates that successful mobile payment solutions tend to be launched as one-sided platforms and then gradually be expanded into being two-sided. Our study showcases that the success of mobile payment platforms lies with the ability of the platform to balance the reach (number

  12. A hybrid approach to device integration on a genetic analysis platform

    International Nuclear Information System (INIS)

    Brennan, Des; Justice, John; Aherne, Margaret; Galvin, Paul; Jary, Dorothee; Kurg, Ants; Berik, Evgeny; Macek, Milan

    2012-01-01

    Point-of-care (POC) systems require significant component integration to implement biochemical protocols associated with molecular diagnostic assays. Hybrid platforms where discrete components are combined in a single platform are a suitable approach to integration, where combining multiple device fabrication steps on a single substrate is not possible due to incompatible or costly fabrication steps. We integrate three devices each with a specific system functionality: (i) a silicon electro-wetting-on-dielectric (EWOD) device to move and mix sample and reagent droplets in an oil phase, (ii) a polymer microfluidic chip containing channels and reservoirs and (iii) an aqueous phase glass microarray for fluorescence microarray hybridization detection. The EWOD device offers the possibility of fully integrating on-chip sample preparation using nanolitre sample and reagent volumes. A key challenge is sample transfer from the oil phase EWOD device to the aqueous phase microarray for hybridization detection. The EWOD device, waveguide performance and functionality are maintained during the integration process. An on-chip biochemical protocol for arrayed primer extension (APEX) was implemented for single nucleotide polymorphism (SNiP) analysis. The prepared sample is aspirated from the EWOD oil phase to the aqueous phase microarray for hybridization. A bench-top instrumentation system was also developed around the integrated platform to drive the EWOD electrodes, implement APEX sample heating and image the microarray after hybridization. (paper)

  13. A Platform for Scalable Satellite and Geospatial Data Analysis

    Science.gov (United States)

    Beneke, C. M.; Skillman, S.; Warren, M. S.; Kelton, T.; Brumby, S. P.; Chartrand, R.; Mathis, M.

    2017-12-01

    At Descartes Labs, we use the commercial cloud to run global-scale machine learning applications over satellite imagery. We have processed over 5 Petabytes of public and commercial satellite imagery, including the full Landsat and Sentinel archives. By combining open-source tools with a FUSE-based filesystem for cloud storage, we have enabled a scalable compute platform that has demonstrated reading over 200 GB/s of satellite imagery into cloud compute nodes. In one application, we generated global 15m Landsat-8, 20m Sentinel-1, and 10m Sentinel-2 composites from 15 trillion pixels, using over 10,000 CPUs. We recently created a public open-source Python client library that can be used to query and access preprocessed public satellite imagery from within our platform, and made this platform available to researchers for non-commercial projects. In this session, we will describe how you can use the Descartes Labs Platform for rapid prototyping and scaling of geospatial analyses and demonstrate examples in land cover classification.

  14. A comparison of heuristic and model-based clustering methods for dietary pattern analysis.

    Science.gov (United States)

    Greve, Benjamin; Pigeot, Iris; Huybrechts, Inge; Pala, Valeria; Börnhorst, Claudia

    2016-02-01

    Cluster analysis is widely applied to identify dietary patterns. A new method based on Gaussian mixture models (GMM) seems to be more flexible compared with the commonly applied k-means and Ward's method. In the present paper, these clustering approaches are compared to find the most appropriate one for clustering dietary data. The clustering methods were applied to simulated data sets with different cluster structures to compare their performance knowing the true cluster membership of observations. Furthermore, the three methods were applied to FFQ data assessed in 1791 children participating in the IDEFICS (Identification and Prevention of Dietary- and Lifestyle-Induced Health Effects in Children and Infants) Study to explore their performance in practice. The GMM outperformed the other methods in the simulation study in 72 % up to 100 % of cases, depending on the simulated cluster structure. Comparing the computationally less complex k-means and Ward's methods, the performance of k-means was better in 64-100 % of cases. Applied to real data, all methods identified three similar dietary patterns which may be roughly characterized as a 'non-processed' cluster with a high consumption of fruits, vegetables and wholemeal bread, a 'balanced' cluster with only slight preferences of single foods and a 'junk food' cluster. The simulation study suggests that clustering via GMM should be preferred due to its higher flexibility regarding cluster volume, shape and orientation. The k-means seems to be a good alternative, being easier to use while giving similar results when applied to real data.
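
    A hedged sketch of the kind of simulation comparison the study describes: generate elongated Gaussian clusters, recover them with GMM, k-means and Ward's method, and score each partition against the known membership. The simulated covariance, sample sizes and scoring metric are choices made for this example, not the study's design.

    ```python
    # Compare GMM, k-means and Ward's method on simulated non-spherical clusters.
    import numpy as np
    from sklearn.mixture import GaussianMixture
    from sklearn.cluster import KMeans, AgglomerativeClustering
    from sklearn.metrics import adjusted_rand_score

    rng = np.random.default_rng(0)
    means = np.array([[0, 0], [4, 0], [0, 4]])
    cov = np.array([[2.5, 1.8], [1.8, 1.5]])          # shared, elongated covariance
    X = np.vstack([rng.multivariate_normal(m, cov, 200) for m in means])
    truth = np.repeat([0, 1, 2], 200)                 # known cluster membership

    labels = {
        "GMM": GaussianMixture(n_components=3, random_state=0).fit_predict(X),
        "k-means": KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X),
        "Ward": AgglomerativeClustering(n_clusters=3, linkage="ward").fit_predict(X),
    }
    for name, lab in labels.items():
        print(f"{name:8s} adjusted Rand index = {adjusted_rand_score(truth, lab):.2f}")
    ```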

  15. LEDWIRE: A Versatile Networking Platform for Smart LED Lighting Applications Using LIN-Bus and WSNs

    Directory of Open Access Journals (Sweden)

    Dimitrios D. Piromalis

    2016-05-01

    Full Text Available In this paper, the architecture of a versatile networking and control platform for Light-Emitting Diode (LED) lighting applications is presented, based on embedded wireless and wired networking technologies. All the possible power and control signal distribution topologies of the lighting fixtures are examined, with particular focus on dynamic lighting applications and with design metrics such as cost, the required wiring installation expenses and maintenance complexity. The proposed platform is optimized for applications where the grouping of LED-based lighting fixtures into clusters, as well as their synchronization, is essential. With such an approach, the distributed control and synchronization of clusters of LED lighting fixtures is performed through a versatile network that uses the single-wire Local Interconnect Network (LIN) bus. The proposed networking platform is presented in terms of its physical layer architecture, its data protocol configuration, and its functionality for smart control. As a proof of concept, the design of an LED lighting fixture together with a LIN-to-IEEE802.15.4/ZigBee data gateway is presented.

  16. Analysis of precipitation data in Bangladesh through hierarchical clustering and multidimensional scaling

    Science.gov (United States)

    Rahman, Md. Habibur; Matin, M. A.; Salma, Umma

    2017-12-01

    The precipitation patterns of seventeen locations in Bangladesh from 1961 to 2014 were studied using a cluster analysis and metric multidimensional scaling. In doing so, the current research applies four major hierarchical clustering methods to precipitation in conjunction with different dissimilarity measures and metric multidimensional scaling. A variety of clustering algorithms were used to provide multiple clustering dendrograms for a mixture of distance measures. The dendrogram of pre-monsoon rainfall for the seventeen locations formed five clusters. The pre-monsoon precipitation data for the areas of Srimangal and Sylhet were located in two clusters across the combination of five dissimilarity measures and four hierarchical clustering algorithms. The single linkage algorithm with Euclidian and Manhattan distances, the average linkage algorithm with the Minkowski distance, and Ward's linkage algorithm provided similar results with regard to monsoon precipitation. The results of the post-monsoon and winter precipitation data are shown in different types of dendrograms with disparate combinations of sub-clusters. The schematic geometrical representations of the precipitation data using metric multidimensional scaling showed that the post-monsoon rainfall of Cox's Bazar was located far from those of the other locations. The results of a box-and-whisker plot, different clustering techniques, and metric multidimensional scaling indicated that the precipitation behaviour of Srimangal and Sylhet during the pre-monsoon season, Cox's Bazar and Sylhet during the monsoon season, Maijdi Court and Cox's Bazar during the post-monsoon season, and Cox's Bazar and Khulna during the winter differed from those at other locations in Bangladesh.
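
    The workflow can be sketched roughly as follows, assuming one row per station and columns holding seasonal precipitation values; the file name, the dissimilarity/linkage combinations shown and the five-cluster cut are illustrative assumptions rather than the paper's exact settings.

    ```python
    # Hierarchical clustering under several distance/linkage combinations, plus
    # metric multidimensional scaling for a 2-D view of the stations.
    import pandas as pd
    from scipy.spatial.distance import pdist
    from scipy.cluster.hierarchy import linkage, fcluster
    from sklearn.manifold import MDS

    rain = pd.read_csv("bd_precipitation.csv", index_col="station")   # hypothetical file

    combos = [("euclidean", "single"), ("cityblock", "single"),
              ("minkowski", "average"), ("euclidean", "ward")]        # Minkowski with default p=2
    for metric, method in combos:
        z = linkage(pdist(rain.values, metric=metric), method=method)
        groups = fcluster(z, t=5, criterion="maxclust")               # cut each dendrogram into 5 clusters
        print(metric, method, pd.Series(groups, index=rain.index).value_counts().to_dict())

    # metric multidimensional scaling gives a geometric representation of the stations
    coords = MDS(n_components=2, random_state=0).fit_transform(rain.values)
    print(pd.DataFrame(coords, index=rain.index, columns=["dim1", "dim2"]).head())
    ```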

  17. Construction and application of Red5 cluster based on OpenStack

    Science.gov (United States)

    Wang, Jiaqing; Song, Jianxin

    2017-08-01

    With the application and development of cloud computing technology in various fields, the resource utilization rate of data centers has improved markedly, and systems built on cloud computing platforms have gained scalability and stability. In a traditional deployment, Red5 cluster resource utilization is low and system stability is poor. This paper exploits the efficient resource allocation capability of cloud computing to build a Red5 server cluster based on OpenStack. Multimedia applications can be published to the Red5 cloud server cluster. The system achieves flexible provisioning of computing resources and also greatly improves the stability of the cluster and the efficiency of the service.

  18. An Efficient MapReduce-Based Parallel Clustering Algorithm for Distributed Traffic Subarea Division

    Directory of Open Access Journals (Sweden)

    Dawen Xia

    2015-01-01

    Full Text Available Traffic subarea division is vital for traffic system management and traffic network analysis in intelligent transportation systems (ITSs. Since existing methods may not be suitable for big traffic data processing, this paper presents a MapReduce-based Parallel Three-Phase K-Means (Par3PKM algorithm for solving traffic subarea division problem on a widely adopted Hadoop distributed computing platform. Specifically, we first modify the distance metric and initialization strategy of K-Means and then employ a MapReduce paradigm to redesign the optimized K-Means algorithm for parallel clustering of large-scale taxi trajectories. Moreover, we propose a boundary identifying method to connect the borders of clustering results for each cluster. Finally, we divide traffic subarea of Beijing based on real-world trajectory data sets generated by 12,000 taxis in a period of one month using the proposed approach. Experimental evaluation results indicate that when compared with K-Means, Par2PK-Means, and ParCLARA, Par3PKM achieves higher efficiency, more accuracy, and better scalability and can effectively divide traffic subarea with big taxi trajectory data.
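
    The parallelisation pattern behind such an approach can be illustrated with a toy map/reduce formulation of one k-means iteration; this is not Par3PKM and uses Python multiprocessing as a stand-in for Hadoop, with made-up two-dimensional points instead of taxi trajectories.

    ```python
    # Toy map/reduce k-means: mappers assign points of a data block to centroids
    # and emit partial sums; the reducer combines them into new centroids.
    import numpy as np
    from functools import partial
    from multiprocessing import Pool

    def map_assign(block, centroids):
        """Map step: per-centroid partial sums and counts for one data block."""
        d = np.linalg.norm(block[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        k, dim = centroids.shape
        sums = np.zeros((k, dim)); counts = np.zeros(k)
        for j in range(k):
            members = block[labels == j]
            sums[j] = members.sum(axis=0); counts[j] = len(members)
        return sums, counts

    def reduce_update(partials, centroids):
        """Reduce step: combine partial sums from all mappers into new centroids."""
        sums = sum(p[0] for p in partials); counts = sum(p[1] for p in partials)
        nonempty = counts > 0
        new = centroids.copy()
        new[nonempty] = sums[nonempty] / counts[nonempty, None]
        return new

    if __name__ == "__main__":
        rng = np.random.default_rng(1)
        points = rng.normal(size=(100_000, 2)) + rng.choice([-5, 0, 5], size=(100_000, 1))
        blocks = np.array_split(points, 8)            # 8 "input splits"
        centroids = points[rng.choice(len(points), 3, replace=False)]
        with Pool(4) as pool:
            for _ in range(10):                       # a few Lloyd iterations
                partials = pool.map(partial(map_assign, centroids=centroids), blocks)
                centroids = reduce_update(partials, centroids)
        print(np.round(centroids, 2))
    ```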

  19. Intensive time series data exploitation: the Multi-sensor Evolution Analysis (MEA) platform

    Science.gov (United States)

    Mantovani, Simone; Natali, Stefano; Folegani, Marco; Scremin, Alessandro

    2014-05-01

    The temporal evolution of natural phenomena must be monitored in order to ensure their correct description and to allow improvements in modelling and forecast capabilities. This assumption, which is obvious for ground-based measurements, has not always held for data collected through space-based platforms: apart from geostationary satellites and sensors, which allow very effective monitoring of phenomena with geometric scales from regional to global, smaller phenomena (with characteristic dimensions below a few kilometres) have been monitored with instruments that could collect data only at time intervals on the order of several days, and bi-temporal techniques have for years been the most widely used ones to characterise temporal changes and to try to identify specific phenomena. As the number of flying sensors has grown and their performance improved, their capability of monitoring natural phenomena at smaller geographic scales has grown as well: we can now count on tens of years of remotely sensed data, collected by hundreds of sensors that are accessible to a wide user community, and data processing techniques have to be adapted to move toward data-intensive exploitation. Starting in 2008, the European Space Agency initiated the development of the Multi-sensor Evolution Analysis (MEA) platform (https://mea.eo.esa.int), whose first aim was to permit the access and exploitation of long-term remotely sensed satellite data from different platforms: 15 years of global (A)ATSR data together with 5 years of regional AVNIR-2 data were loaded into the system and were used, through a web-based graphical user interface, for land cover change analysis. The MEA data availability has grown over the years, integrating multi-disciplinary data with spatial and temporal dimensions: so far tens of Terabytes of data in the land and atmosphere domains are available and can be visualized and exploited, keeping the

  20. Time series clustering analysis of health-promoting behavior

    Science.gov (United States)

    Yang, Chi-Ta; Hung, Yu-Shiang; Deng, Guang-Feng

    2013-10-01

    Health promotion must be emphasized to achieve the World Health Organization goal of health for all. Since the global population is aging rapidly, the ComCare elder health-promoting service was developed by the Taiwan Institute for Information Industry in 2011. Based on the Pender health promotion model, the ComCare service offers five categories of health-promoting functions to address the everyday needs of seniors: nutrition management, social support, exercise management, health responsibility and stress management. To assess the overall ComCare service and to improve understanding of the health-promoting behavior of elders, this study analyzed health-promoting behavioral data automatically collected by the ComCare monitoring system. In the 30638 session records collected for 249 elders from January 2012 to March 2013, behavior patterns were identified by a fuzzy c-means time series clustering algorithm combined with autocorrelation-based representation schemes. The analysis showed that the time series data for elder health-promoting behavior can be classified into four different clusters. Each type reveals different health-promoting needs, frequencies, function numbers and behaviors. The data analysis results can assist policymakers, health-care providers, and experts in medicine, public health, nursing and psychology, and have been provided to the Taiwan National Health Insurance Administration to assess elder health-promoting behavior.
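
    In the same spirit (though not the ComCare pipeline), one could represent each elder's daily session counts by autocorrelation coefficients and group the series with a compact fuzzy c-means; the feature choice, cluster count and the toy data below are all assumptions.

    ```python
    # Autocorrelation-based representation of each time series, clustered with a
    # small hand-rolled fuzzy c-means (standard membership/centre updates).
    import numpy as np

    def acf_features(series, n_lags=7):
        x = np.asarray(series, dtype=float) - np.mean(series)
        denom = np.dot(x, x)
        return np.array([np.dot(x[:-k], x[k:]) / denom for k in range(1, n_lags + 1)])

    def fuzzy_cmeans(X, c=4, m=2.0, n_iter=100, seed=0):
        rng = np.random.default_rng(seed)
        u = rng.dirichlet(np.ones(c), size=len(X))            # random fuzzy memberships
        for _ in range(n_iter):
            w = u ** m
            centres = (w.T @ X) / w.sum(axis=0)[:, None]
            d = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2) + 1e-12
            u = 1.0 / (d ** (2 / (m - 1)) * np.sum(d ** (-2 / (m - 1)), axis=1, keepdims=True))
        return centres, u

    # toy data: 249 elders x 90 days of session counts (random here)
    rng = np.random.default_rng(1)
    sessions = rng.poisson(3, size=(249, 90))
    features = np.array([acf_features(s) for s in sessions])
    centres, memberships = fuzzy_cmeans(features, c=4)
    print(memberships.argmax(axis=1)[:10])                     # hard labels of first 10 elders
    ```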

  1. Unequal cluster sizes in stepped-wedge cluster randomised trials: a systematic review.

    Science.gov (United States)

    Kristunas, Caroline; Morris, Tom; Gray, Laura

    2017-11-15

    To investigate the extent to which cluster sizes vary in stepped-wedge cluster randomised trials (SW-CRT) and whether any variability is accounted for during the sample size calculation and analysis of these trials. Eligible settings were any, not limited to healthcare; eligible participants were any taking part in an SW-CRT published up to March 2016. The primary outcome is the variability in cluster sizes, measured by the coefficient of variation (CV) in cluster size. Secondary outcomes include the difference between the cluster sizes assumed during the sample size calculation and those observed during the trial, any reported variability in cluster sizes and whether the methods of sample size calculation and methods of analysis accounted for any variability in cluster sizes. Of the 101 included SW-CRTs, 48% mentioned that the included clusters were known to vary in size, yet only 13% of these accounted for this during the calculation of the sample size. However, 69% of the trials did use a method of analysis appropriate for when clusters vary in size. Full trial reports were available for 53 trials. The CV was calculated for 23 of these: the median CV was 0.41 (IQR: 0.22-0.52). Actual cluster sizes could be compared with those assumed during the sample size calculation for 14 (26%) of the trial reports; the cluster sizes were between 29% and 480% of those that had been assumed. Cluster sizes often vary in SW-CRTs. Reporting of SW-CRTs also remains suboptimal. The effect of unequal cluster sizes on the statistical power of SW-CRTs needs further exploration and methods appropriate to studies with unequal cluster sizes need to be employed. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2017. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
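
    For concreteness, the primary outcome above, the coefficient of variation of cluster size, is simply the standard deviation of the cluster sizes divided by their mean; the sizes below are invented for illustration.

    ```python
    # Worked toy example of the CV of cluster sizes (sample SD used here).
    import numpy as np

    cluster_sizes = np.array([120, 85, 300, 40, 190, 95])      # participants per cluster
    cv = cluster_sizes.std(ddof=1) / cluster_sizes.mean()
    print(f"CV of cluster size = {cv:.2f}")                    # ~0.67 here; the review's median was 0.41
    ```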

  2. NiftyPET: a High-throughput Software Platform for High Quantitative Accuracy and Precision PET Imaging and Analysis.

    Science.gov (United States)

    Markiewicz, Pawel J; Ehrhardt, Matthias J; Erlandsson, Kjell; Noonan, Philip J; Barnes, Anna; Schott, Jonathan M; Atkinson, David; Arridge, Simon R; Hutton, Brian F; Ourselin, Sebastien

    2018-01-01

    We present a standalone, scalable and high-throughput software platform for PET image reconstruction and analysis. We focus on high fidelity modelling of the acquisition processes to provide high accuracy and precision quantitative imaging, especially for large axial field of view scanners. All the core routines are implemented using parallel computing available from within the Python package NiftyPET, enabling easy access, manipulation and visualisation of data at any processing stage. The pipeline of the platform starts from MR and raw PET input data and is divided into the following processing stages: (1) list-mode data processing; (2) accurate attenuation coefficient map generation; (3) detector normalisation; (4) exact forward and back projection between sinogram and image space; (5) estimation of reduced-variance random events; (6) high accuracy fully 3D estimation of scatter events; (7) voxel-based partial volume correction; (8) region- and voxel-level image analysis. We demonstrate the advantages of this platform using an amyloid brain scan where all the processing is executed from a single and uniform computational environment in Python. The high accuracy acquisition modelling is achieved through span-1 (no axial compression) ray tracing for true, random and scatter events. Furthermore, the platform offers uncertainty estimation of any image derived statistic to facilitate robust tracking of subtle physiological changes in longitudinal studies. The platform also supports the development of new reconstruction and analysis algorithms through restricting the axial field of view to any set of rings covering a region of interest and thus performing fully 3D reconstruction and corrections using real data significantly faster. All the software is available as open source with the accompanying wiki-page and test data.

  3. Miniaturization for Point-of-Care Analysis: Platform Technology for Almost Every Biomedical Assay.

    Science.gov (United States)

    Schumacher, Soeren; Sartorius, Dorian; Ehrentreich-Förster, Eva; Bier, Frank F

    2012-10-01

    Platform technologies for the changing needs of diagnostics are one of the main challenges in medical device technology. On the one hand, the demand for new and more versatile diagnostics is increasing due to a deeper knowledge of biomarkers and their association with diseases. On the other hand, a decentralization of diagnostics will occur, since decisions can be made faster, resulting in more successful therapy. Hence, new types of technologies have to be established which enable multiparameter analysis at the point-of-care. Within this review-like article a system called the Fraunhofer ivD-platform is introduced. It consists of a credit-card-sized cartridge with integrated reagents, sensors and pumps, and a read-out/processing unit. Within the cartridge the assay runs fully automated within 15-20 minutes. Due to the open design of the platform, different analyses such as antibody, serological or DNA assays can be performed. Specific examples of these three different assay types are given to show the broad applicability of the system.

  5. Applying clustering to statistical analysis of student reasoning about two-dimensional kinematics

    Directory of Open Access Journals (Sweden)

    R. Padraic Springuel

    2007-12-01

    Full Text Available We use clustering, an analysis method not presently common to the physics education research community, to group and characterize student responses to written questions about two-dimensional kinematics. Previously, clustering has been used to analyze multiple-choice data; we analyze free-response data that includes both sketches of vectors and written elements. The primary goal of this paper is to describe the methodology itself; we include a brief overview of relevant results.

  6. Comparison of Outputs for Variable Combinations Used in Cluster Analysis on Polarmetric Imagery

    National Research Council Canada - National Science Library

    Petre, Melinda

    2008-01-01

    .... More specifically, two techniques, Cluster Analysis (CA) and Principal Component Analysis (PCA), can be combined to process Stokes imagery by distinguishing between pixels and producing groups of pixels with similar characteristics...

  7. An analysis of hospital brand mark clusters.

    Science.gov (United States)

    Vollmers, Stacy M; Miller, Darryl W; Kilic, Ozcan

    2010-07-01

    This study analyzed brand mark clusters (i.e., various types of brand marks displayed in combination) used by hospitals in the United States. The brand marks were assessed against several normative criteria for creating brand marks that are memorable and that elicit positive affect. Overall, results show a reasonably high level of adherence to many of these normative criteria. Many of the clusters exhibited pictorial elements that reflected benefits and that were conceptually consistent with the verbal content of the cluster. Also, many clusters featured icons that were balanced and moderately complex. However, only a few contained interactive imagery or taglines communicating benefits.

  8. The ESA Geohazard Exploitation Platform

    Science.gov (United States)

    Bally, Philippe; Laur, Henri; Mathieu, Pierre-Philippe; Pinto, Salvatore

    2015-04-01

    expanded to address broader objectives of the geohazards community. In particular it is a contribution to the CEOS WG Disasters and its Seismic Hazards Pilot and to the terrain deformation applications of its Volcano Pilot. The geohazards platform is sourced with elements - data, tools, and processing - relevant to the geohazards theme and related exploitation scenarios. For example, the platform provides access to large SAR data collections and services to support SAR Interferometry (InSAR), in particular the Persistent Scatterer Interferometry (PSI) and Small Baseline Subset (SBAS) techniques, to provide precise terrain deformation measurements. The GEP includes data coming from the ENVISAT ASAR and ERS archives, already hosted in the ESA clusters and in ESA's Virtual Archive, and further extended to cover the requirements of the CEOS Pilot on Seismic Hazards. The GEP is gradually accessing Sentinel-1A data alongside EO data from other space agencies with an interest in the geohazard exploitation platform. Further to this, the platform is intended to be available in the framework of the European Plate Observing System (EPOS) initiative, in order to help its users exploit EO data to support solid Earth monitoring and geophysical and geological analysis.

  9. Partitional clustering algorithms

    CERN Document Server

    2015-01-01

    This book summarizes the state-of-the-art in partitional clustering. Clustering, the unsupervised classification of patterns into groups, is one of the most important tasks in exploratory data analysis. Primary goals of clustering include gaining insight into, classifying, and compressing data. Clustering has a long and rich history that spans a variety of scientific disciplines including anthropology, biology, medicine, psychology, statistics, mathematics, engineering, and computer science. As a result, numerous clustering algorithms have been proposed since the early 1950s. Among these algorithms, partitional (nonhierarchical) ones have found many applications, especially in engineering and computer science. This book provides coverage of consensus clustering, constrained clustering, large scale and/or high dimensional clustering, cluster validity, cluster visualization, and applications of clustering. Examines clustering as it applies to large and/or high-dimensional data sets commonly encountered in reali...

  10. Cluster analysis for the probability of DSB site induced by electron tracks

    Energy Technology Data Exchange (ETDEWEB)

    Yoshii, Y. [Biological Research, Education and Instrumentation Center, Sapporo Medical University, Sapporo 060-8556 (Japan); Graduate School of Health Sciences, Hokkaido University, Sapporo 060-0812 (Japan); Sasaki, K. [Faculty of Health Sciences, Hokkaido University of Science, Sapporo 006-8585 (Japan); Matsuya, Y. [Graduate School of Health Sciences, Hokkaido University, Sapporo 060-0812 (Japan); Date, H., E-mail: date@hs.hokudai.ac.jp [Faculty of Health Sciences, Hokkaido University, Sapporo 060-0812 (Japan)

    2015-05-01

    To clarify the effects of ionizing radiation on biological cells, the densely populated pattern of ionization in the cell nucleus is of importance because it governs the extent of DNA damage, which may lead to cell lethality. In this study, we have conducted a cluster analysis of ionization and excitation events to estimate the number of double-strand breaks (DSBs) induced by electron tracks. A Monte Carlo simulation for electrons in liquid water was performed to determine the spatial locations of the ionization and excitation events. The events were divided into clusters by using the density-based spatial clustering of applications with noise (DBSCAN) algorithm. The algorithm enables us to sort the events into groups (clusters) in which a minimum number of neighboring events are contained within a given radius. To evaluate the number of DSBs in the extracted clusters, we introduced an aggregation index (AI). The computational results show that a sub-keV electron produces DSBs in a dense formation more effectively than higher-energy electrons. The root-mean-square radius (RMSR) of the clusters is below 5 nm, which is smaller than the chromatin fiber thickness. It was found that clusters of this size have a high probability of causing lesions in DNA within the chromatin fiber.
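
    A simplified sketch of the clustering step described above: group simulated event positions with DBSCAN and report the size and root-mean-square radius of each cluster. The event coordinates and the eps/min_samples parameters are placeholders rather than values from the track-structure simulation.

    ```python
    # DBSCAN on 3-D event positions, reporting per-cluster size and RMS radius.
    import numpy as np
    from sklearn.cluster import DBSCAN

    rng = np.random.default_rng(0)
    # a few dense event pockets (nm) plus sparse background events
    events = np.vstack([rng.normal(loc=c, scale=2.0, size=(30, 3)) for c in ([0, 0, 0], [40, 10, 5])]
                       + [rng.uniform(-100, 100, size=(50, 3))])

    labels = DBSCAN(eps=3.0, min_samples=5).fit_predict(events)   # eps in nm

    for k in sorted(set(labels) - {-1}):                          # -1 marks noise events
        members = events[labels == k]
        rmsr = np.sqrt(np.mean(np.sum((members - members.mean(axis=0)) ** 2, axis=1)))
        print(f"cluster {k}: {len(members)} events, RMS radius = {rmsr:.1f} nm")
    ```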

  11. Cluster Analysis on Longitudinal Data of Patients with Adult-Onset Asthma.

    Science.gov (United States)

    Ilmarinen, Pinja; Tuomisto, Leena E; Niemelä, Onni; Tommola, Minna; Haanpää, Jussi; Kankaanranta, Hannu

    Previous cluster analyses on asthma are based on cross-sectional data. To identify phenotypes of adult-onset asthma by using data from baseline (diagnostic) and 12-year follow-up visits. The Seinäjoki Adult Asthma Study is a 12-year follow-up study of patients with new-onset adult asthma. K-means cluster analysis was performed by using variables from baseline and follow-up visits on 171 patients to identify phenotypes. Five clusters were identified. Patients in cluster 1 (n = 38) were predominantly nonatopic males with moderate smoking history at baseline. At follow-up, 40% of these patients had developed persistent obstruction but the number of patients with uncontrolled asthma (5%) and rhinitis (10%) was the lowest. Cluster 2 (n = 19) was characterized by older men with heavy smoking history, poor lung function, and persistent obstruction at baseline. At follow-up, these patients were mostly uncontrolled (84%) despite daily use of inhaled corticosteroid (ICS) with add-on therapy. Cluster 3 (n = 50) consisted mostly of nonsmoking females with good lung function at diagnosis/follow-up and well-controlled/partially controlled asthma at follow-up. Cluster 4 (n = 25) had obese and symptomatic patients at baseline/follow-up. At follow-up, these patients had several comorbidities (40% psychiatric disease) and were treated daily with ICS and add-on therapy. Patients in cluster 5 (n = 39) were mostly atopic and had the earliest onset of asthma, the highest blood eosinophils, and FEV 1 reversibility at diagnosis. At follow-up, these patients used the lowest ICS dose but 56% were well controlled. Results can be used to predict outcomes of patients with adult-onset asthma and to aid in development of personalized therapy (NCT02733016 at ClinicalTrials.gov). Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.

  12. Parkinson's Disease Subtypes Identified from Cluster Analysis of Motor and Non-motor Symptoms.

    Science.gov (United States)

    Mu, Jesse; Chaudhuri, Kallol R; Bielza, Concha; de Pedro-Cuesta, Jesus; Larrañaga, Pedro; Martinez-Martin, Pablo

    2017-01-01

    Parkinson's disease is now considered a complex, multi-peptide, central, and peripheral nervous system disorder with considerable clinical heterogeneity. Non-motor symptoms play a key role in the trajectory of Parkinson's disease, from prodromal premotor to end stages. To understand the clinical heterogeneity of Parkinson's disease, this study used cluster analysis to search for subtypes from a large, multi-center, international, and well-characterized cohort of Parkinson's disease patients across all motor stages, using a combination of cardinal motor features (bradykinesia, rigidity, tremor, axial signs) and, for the first time, specific validated rater-based non-motor symptom scales. Two independent international cohort studies were used: (a) the validation study of the Non-Motor Symptoms Scale (n = 411) and (b) baseline data from the global Non-Motor International Longitudinal Study (n = 540). k-means cluster analyses were performed on the non-motor and motor domains (domains clustering) and the 30 individual non-motor symptoms alone (symptoms clustering), and hierarchical agglomerative clustering was performed to group symptoms together. Four clusters are identified from the domains clustering supporting previous studies: mild, non-motor dominant, motor-dominant, and severe. In addition, six new smaller clusters are identified from the symptoms clustering, each characterized by clinically-relevant non-motor symptoms. The clusters identified in this study present statistical confirmation of the increasingly important role of non-motor symptoms (NMS) in Parkinson's disease heterogeneity and take steps toward subtype-specific treatment packages.

  13. Massively Clustered CubeSats NCPS Demo Mission

    Science.gov (United States)

    Robertson, Glen A.; Young, David; Kim, Tony; Houts, Mike

    2013-01-01

    Technologies under development for the proposed Nuclear Cryogenic Propulsion Stage (NCPS) will require an un-crewed demonstration mission before they can be flight qualified over distances and time frames representative of a crewed Mars mission. In this paper, we describe a Massively Clustered CubeSats platform, possibly comprising hundreds of CubeSats, as the main payload of the NCPS demo mission. This platform would enable a mechanism for cost savings for the demo mission through shared support between NASA and other government agencies as well as leveraged commercial aerospace and academic community involvement. We believe a Massively Clustered CubeSats platform should be an obvious first choice for the NCPS demo mission when one considers that cost and risk of the payload can be spread across many CubeSat customers and that the NCPS demo mission can capitalize on using CubeSats developed by others for its own instrumentation needs. Moreover, a demo mission of the NCPS offers an unprecedented opportunity to invigorate the public on a global scale through direct individual participation coordinated through a web-based collaboration engine. The platform we describe would be capable of delivering CubeSats at various locations along a trajectory toward the primary mission destination, in this case Mars, permitting a variety of potential CubeSat-specific missions. Cameras on various CubeSats can also be used to provide multiple views of the space environment and the NCPS vehicle for video monitoring as well as allow the public to "ride along" as virtual passengers on the mission. This collaborative approach could even initiate a brand new Science, Technology, Engineering and Math (STEM) program for launching student developed CubeSat payloads beyond Low Earth Orbit (LEO) on future deep space technology qualification missions. Keywords: Nuclear Propulsion, NCPS, SLS, Mars, CubeSat.

  14. a Three-Step Spatial-Temporal Clustering Method for Human Activity Pattern Analysis

    Science.gov (United States)

    Huang, W.; Li, S.; Xu, S.

    2016-06-01

    How people move in cities and what they do in various locations at different times form human activity patterns. Human activity patterns play a key role in urban planning, traffic forecasting, public health and safety, emergency response, friend recommendation, and so on. Therefore, scholars from different fields, such as social science, geography, transportation, physics and computer science, have made great efforts in modelling and analysing human activity patterns or human mobility patterns. One of the essential tasks in such studies is to find the locations or places where individuals stay to perform some kind of activity before further activity pattern analysis. In the era of Big Data, the emergence of social media along with wearable devices enables human activity data to be collected more easily and efficiently. Furthermore, the dimension of the accessible human activity data has been extended from two or three (space or space-time) to four dimensions (space, time and semantics). More specifically, not only the locations and times at which people stay are collected, but also what people "say" in a location at a given time can be obtained. The characteristics of these datasets shed new light on the analysis of human mobility, for which new methodologies should accordingly be developed. Traditional methods such as neural networks, statistics and clustering have been applied to study human activity patterns using geosocial media data. Among them, clustering methods have been widely used to analyse spatiotemporal patterns. However, to the best of our knowledge, few clustering algorithms have been specifically developed for handling datasets that contain spatial, temporal and semantic aspects all together. In this work, we propose a three-step human activity clustering method based on space, time and semantics to fill this gap. One-year Twitter data, posted in Toronto, Canada, is used to test the clustering-based method. The results show that the

  15. A comparison of hierarchical cluster analysis and league table rankings as methods for analysis and presentation of district health system performance data in Uganda.

    Science.gov (United States)

    Tashobya, Christine K; Dubourg, Dominique; Ssengooba, Freddie; Speybroeck, Niko; Macq, Jean; Criel, Bart

    2016-03-01

    In 2003, the Uganda Ministry of Health introduced the district league table for district health system performance assessment. The league table presents district performance against a number of input, process and output indicators and a composite index to rank districts. This study explores the use of hierarchical cluster analysis for analysing and presenting district health systems performance data and compares this approach with the use of the league table in Uganda. Ministry of Health and district plans and reports, and published documents were used to provide information on the development and utilization of the Uganda district league table. Quantitative data were accessed from the Ministry of Health databases. Statistical analysis was carried out using SPSS version 20, and hierarchical cluster analysis utilized Ward's method. The hierarchical cluster analysis was conducted on the basis of seven clusters determined for each year from 2003 to 2010, ranging from a cluster of good through moderate-to-poor performers. The characteristics and membership of clusters varied from year to year and were determined by the identity and magnitude of performance of the individual variables. Criticisms of the league table include: perceived unfairness, as it did not take into consideration district peculiarities; and being oversummarized and not adequately informative. Clustering organizes the many data points into clusters of similar entities according to an agreed set of indicators and can provide the starting point for identifying factors behind the observed performance of districts. Although league table rankings emphasize summation and external control, clustering has the potential to encourage a formative, learning approach. More research is required to shed more light on factors behind the observed performance of the different clusters. Other countries, especially low-income countries that share many similarities with Uganda, can learn from these experiences. © The Author 2015
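
    A hedged sketch of the clustering approach described: standardise a small set of district performance indicators and cut a Ward dendrogram into seven clusters. The file name, indicator columns and use of z-scores are assumptions for illustration, not the study's exact variables.

    ```python
    # Ward hierarchical clustering of district indicators into seven clusters.
    import pandas as pd
    from scipy.cluster.hierarchy import linkage, fcluster
    from scipy.stats import zscore

    districts = pd.read_csv("district_indicators_2010.csv", index_col="district")  # hypothetical file
    z = zscore(districts.values, axis=0)                 # put indicators on a common scale

    tree = linkage(z, method="ward")
    districts["cluster"] = fcluster(tree, t=7, criterion="maxclust")

    # profile of each cluster: mean indicator values, from good to poor performers
    print(districts.groupby("cluster").mean().round(2))
    print(districts["cluster"].value_counts().sort_index())
    ```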

  16. Local wavelet correlation: application to timing analysis of multi-satellite CLUSTER data

    Directory of Open Access Journals (Sweden)

    J. Soucek

    2004-12-01

    Full Text Available Multi-spacecraft space observations, such as those of CLUSTER, can be used to infer information about local plasma structures by exploiting the timing differences between subsequent encounters of these structures by individual satellites. We introduce a novel wavelet-based technique, the Local Wavelet Correlation (LWC), which allows one to match the corresponding signatures of large-scale structures in the data from multiple spacecraft and determine the relative time shifts between the crossings. The LWC is especially suitable for analysis of strongly non-stationary time series, where it enables one to estimate the time lags in a more robust and systematic way than ordinary cross-correlation techniques. The technique, together with its properties and some examples of its application to timing analysis of bow shock and magnetopause crossings observed by CLUSTER, is presented. We also compare the performance and reliability of the technique with classical discontinuity analysis methods. Key words. Radio science (signal processing) – Space plasma physics (discontinuities; instruments and techniques)
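
    The LWC itself is not reproduced here; as a point of reference, the ordinary cross-correlation timing estimate that the record compares against can be sketched as follows. The synthetic boundary signature, sampling step and noise level are assumptions for illustration.

    # Sketch of the ordinary cross-correlation timing estimate that the Local
    # Wavelet Correlation is compared against: find the lag that maximises the
    # normalised cross-correlation between two spacecraft time series.
    import numpy as np

    def lag_by_cross_correlation(x, y, dt):
        """Return the delay of y relative to x (positive if y lags x)."""
        x = (x - x.mean()) / x.std()
        y = (y - y.mean()) / y.std()
        corr = np.correlate(y, x, mode="full")      # c[k] = sum_n y[n+k] * x[n]
        lags = np.arange(-(len(x) - 1), len(y))     # sample lags for mode="full"
        return lags[np.argmax(corr)] * dt

    # Synthetic example: the same boundary signature seen 2.5 s later by a second spacecraft.
    dt = 0.1
    t = np.arange(0, 60, dt)
    signature = np.tanh((t - 20.0) / 2.0)
    rng = np.random.default_rng(1)
    x = signature + 0.05 * rng.normal(size=t.size)
    y = np.interp(t - 2.5, t, signature) + 0.05 * rng.normal(size=t.size)
    print(lag_by_cross_correlation(x, y, dt))       # approximately +2.5 s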

  17. Cluster analysis as a method for determining size ranges for spinal implants: lumbar disc replacement prosthesis dimensions from magnetic resonance images.

    Science.gov (United States)

    Lei, Dang; Holder, Roger L; Smith, Francis W; Wardlaw, Douglas; Hukins, David W L

    2006-12-01

    Statistical analysis of clinical radiologic data. To develop an objective method for finding the number of sizes for a lumbar disc replacement. Cluster analysis is a well-established technique for sorting observations into clusters so that the "similarity level" is maximal if they belong to the same cluster and minimal otherwise. Magnetic resonance scans from 69 patients, with no abnormal discs, yielded 206 sagittal and transverse images of 206 discs (levels L3-L4-L5-S1). Anteroposterior and lateral dimensions were measured from vertebral margins on transverse images; disc heights were measured from sagittal images. Hierarchical cluster analysis was performed to determine the number of clusters followed by nonhierarchical (K-means) cluster analysis. Discriminant analysis was used to determine how well the clusters could be used to classify an observation. The most successful method of clustering the data involved the following parameters: anteroposterior dimension; lateral dimension (both were the mean of results from the superior and inferior margins of a vertebral body, measured on transverse images); and maximum disc height (from a midsagittal image). These were grouped into 7 clusters so that a discriminant analysis was capable of correctly classifying 97.1% of the observations. The mean and standard deviations for the parameter values in each cluster were determined. Cluster analysis has been successfully used to find the dimensions of the minimum number of prosthesis sizes required to replace L3-L4 to L5-S1 discs; the range of sizes would enable them to be used at higher lumbar levels in some patients.
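
    The three-stage workflow described above (hierarchical clustering to choose the number of groups, K-means to assign them, and discriminant analysis to check how well the clusters classify observations) can be sketched as follows. The disc dimensions are synthetic, not the study's measurements.

    # Sketch of the workflow in the record above: hierarchical clustering to choose
    # the number of groups, K-means to assign sizes, and discriminant analysis to
    # check how separable the resulting clusters are.
    import numpy as np
    from scipy.cluster.hierarchy import linkage, fcluster
    from sklearn.cluster import KMeans
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

    rng = np.random.default_rng(0)
    # Columns: anteroposterior (mm), lateral (mm), maximum disc height (mm) - synthetic.
    dims = np.column_stack([
        rng.normal(36, 3, 206),
        rng.normal(50, 4, 206),
        rng.normal(11, 2, 206),
    ])

    # Step 1: hierarchical clustering (dendrogram inspection would guide k; 7 here).
    k = 7
    hier_labels = fcluster(linkage(dims, method="ward"), t=k, criterion="maxclust")

    # Step 2: non-hierarchical K-means refinement with the same k.
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(dims)

    # Step 3: discriminant analysis to see how well the clusters classify observations.
    lda = LinearDiscriminantAnalysis().fit(dims, km.labels_)
    print("reclassification accuracy:", lda.score(dims, km.labels_))

    # Mean and SD of each dimension per cluster (candidate prosthesis sizes).
    for c in range(k):
        m, s = dims[km.labels_ == c].mean(axis=0), dims[km.labels_ == c].std(axis=0)
        print(c, np.round(m, 1), np.round(s, 1))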

  18. Cluster analysis for validated climatology stations using precipitation in Mexico

    NARCIS (Netherlands)

    Bravo Cabrera, J. L.; Azpra-Romero, E.; Zarraluqui-Such, V.; Gay-García, C.; Estrada Porrúa, F.

    2012-01-01

    Annual average of daily precipitation was used to group climatological stations into clusters using the k-means procedure and principal component analysis with varimax rotation. After a careful selection of the stations deployed in Mexico since 1950, we selected 349 characterized by having 35 to 40

  19. Gene expression data clustering and its application in differential analysis of leukemia

    Directory of Open Access Journals (Sweden)

    M. Vahedi

    2008-02-01

    Full Text Available Introduction: The DNA microarray technique is one of the most important technologies in bioinformatics. It allows thousands of expressed genes to be monitored simultaneously and has recently resulted in the creation of giant databases of gene expression data. Statistical analysis of such databases includes normalization, clustering, classification, etc. Materials and Methods: Golub et al. (1999) collected leukemia datasets based on the oligonucleotide method; the data are available on the internet. In this paper, we analyzed these gene expression data. They were clustered by several methods including multi-dimensional scaling and hierarchical and non-hierarchical clustering. The dataset included 20 Acute Lymphoblastic Leukemia (ALL) patients and 14 Acute Myeloid Leukemia (AML) patients. The results of two clustering methods were compared with the real grouping (ALL and AML). R software was used for data analysis. Results: The specificity and sensitivity of divisive hierarchical clustering in diagnosing ALL patients were 75% and 92%, respectively. The specificity and sensitivity of partitioning around medoids in diagnosing ALL patients were 90% and 93%, respectively. These results show that both clustering methods performed well. Notably, according to the clustering results, one of the samples was placed in the ALL group although it belonged to the AML group according to the clinical test. Conclusion: Given the concordance of the results with the real grouping of the data, these methods can be used in cases where accurate information on the real grouping is not available. Moreover, the results of clustering might reveal distinct subgroups of the data, which would then need to be checked for concordance with clinical outcomes, laboratory results and so on.
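
    As an illustration of how such a clustering is scored against known diagnoses, the sketch below computes sensitivity and specificity for a two-group clustering of a synthetic expression matrix. Agglomerative clustering stands in for the divisive and partitioning-around-medoids methods of the record, and the original analysis was done in R; everything here is illustrative.

    # Evaluate a two-group clustering of expression profiles against known
    # ALL/AML labels via sensitivity and specificity (synthetic data).
    import numpy as np
    from scipy.cluster.hierarchy import linkage, fcluster

    rng = np.random.default_rng(0)
    n_all, n_aml, n_genes = 20, 14, 500
    # Simulated log-expression: a subset of genes is shifted in the AML samples.
    X = rng.normal(0, 1, size=(n_all + n_aml, n_genes))
    X[n_all:, :50] += 1.5
    truth = np.array([1] * n_all + [0] * n_aml)          # 1 = ALL, 0 = AML

    # Two-group hierarchical clustering of the samples.
    labels = fcluster(linkage(X, method="average"), t=2, criterion="maxclust")

    # Align arbitrary cluster ids with the truth: call "ALL" the cluster that
    # contains most of the true ALL samples.
    all_cluster = np.bincount(labels[truth == 1]).argmax()
    pred = (labels == all_cluster).astype(int)

    sensitivity = np.sum((pred == 1) & (truth == 1)) / np.sum(truth == 1)
    specificity = np.sum((pred == 0) & (truth == 0)) / np.sum(truth == 0)
    print(f"sensitivity={sensitivity:.2f}  specificity={specificity:.2f}")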

  20. Application of clustering analysis in the prediction of photovoltaic power generation based on neural network

    Science.gov (United States)

    Cheng, K.; Guo, L. M.; Wang, Y. K.; Zafar, M. T.

    2017-11-01

    In order to select effective samples from the large volume of multi-year PV power generation data and improve the accuracy of PV power generation forecasting models, this paper studies the application of cluster analysis in this field and establishes a forecasting model based on neural networks. Based on three different weather types (sunny, cloudy and rainy days), this research screens samples of historical data using the cluster analysis method. After screening, BP neural network prediction models are established using the screened data as training data. The six types of photovoltaic power generation prediction models before and after data screening are then compared. Results show that a prediction model combining cluster analysis with BP neural networks is an effective way to improve the precision of photovoltaic power generation forecasting.
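
    A minimal version of this two-stage approach, K-means to group historical days into weather types and then one regressor per weather type, might look as follows. The daily features and target are synthetic, and scikit-learn's MLPRegressor stands in for the BP neural network of the record.

    # Two-stage sketch: cluster days into weather types, then train a small
    # neural-network regressor per cluster and route new days to their model.
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.neural_network import MLPRegressor
    from sklearn.preprocessing import StandardScaler

    rng = np.random.default_rng(0)
    n_days = 300
    # Features per day: mean irradiance (W/m2), cloud cover (0-1), temperature (degC).
    X = np.column_stack([
        rng.uniform(100, 900, n_days),
        rng.uniform(0, 1, n_days),
        rng.uniform(-5, 35, n_days),
    ])
    # Simulated daily PV energy (kWh), roughly driven by irradiance and cloud cover.
    y = 0.02 * X[:, 0] * (1 - 0.6 * X[:, 1]) + rng.normal(0, 0.5, n_days)

    scaler = StandardScaler().fit(X)
    Xs = scaler.transform(X)

    # Stage 1: cluster days into three weather types (e.g. sunny / cloudy / rainy).
    weather = KMeans(n_clusters=3, n_init=10, random_state=0).fit(Xs)

    # Stage 2: one regressor per weather type, trained only on that cluster's days.
    models = {}
    for c in range(3):
        idx = weather.labels_ == c
        models[c] = MLPRegressor(hidden_layer_sizes=(16,), max_iter=3000,
                                 random_state=0).fit(Xs[idx], y[idx])

    # Prediction for a new day: route it to its weather cluster's model.
    new_day = scaler.transform([[650.0, 0.2, 22.0]])
    c = int(weather.predict(new_day)[0])
    print("predicted PV output (kWh):", round(float(models[c].predict(new_day)[0]), 2))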

  1. PIVOT: platform for interactive analysis and visualization of transcriptomics data.

    Science.gov (United States)

    Zhu, Qin; Fisher, Stephen A; Dueck, Hannah; Middleton, Sarah; Khaladkar, Mugdha; Kim, Junhyong

    2018-01-05

    Many R packages have been developed for transcriptome analysis but their use often requires familiarity with R and integrating results of different packages requires scripts to wrangle the datatypes. Furthermore, exploratory data analyses often generate multiple derived datasets such as data subsets or data transformations, which can be difficult to track. Here we present PIVOT, an R-based platform that wraps open source transcriptome analysis packages with a uniform user interface and graphical data management that allows non-programmers to interactively explore transcriptomics data. PIVOT supports more than 40 popular open source packages for transcriptome analysis and provides an extensive set of tools for statistical data manipulations. A graph-based visual interface is used to represent the links between derived datasets, allowing easy tracking of data versions. PIVOT further supports automatic report generation, publication-quality plots, and program/data state saving, such that all analysis can be saved, shared and reproduced. PIVOT will allow researchers with broad background to easily access sophisticated transcriptome analysis tools and interactively explore transcriptome datasets.

  2. Integrated vehicle-based safety systems (IVBSS) : light vehicle platform field operational test data analysis plan.

    Science.gov (United States)

    2009-12-22

    This document presents the University of Michigan Transportation Research Institute's plan to perform analysis of data collected from the light vehicle platform field operational test of the Integrated Vehicle-Based Safety Systems (IVBSS) progr...

  3. Integrated vehicle-based safety systems (IVBSS) : heavy truck platform field operational test data analysis plan.

    Science.gov (United States)

    2009-11-23

    This document presents the University of Michigan Transportation Research Institute's plan to perform analysis of data collected from the heavy truck platform field operational test of the Integrated Vehicle-Based Safety Systems (IVBSS) progra...

  4. A user credit assessment model based on clustering ensemble for broadband network new media service supervision

    Science.gov (United States)

    Liu, Fang; Cao, San-xing; Lu, Rui

    2012-04-01

    This paper proposes a user credit assessment model based on a clustering ensemble, aiming to solve the problem of users illegally spreading pirated and pornographic media content on self-service, broadband-network new media platforms. The idea is to assess new media users' credit by establishing an indicator system based on user credit behaviours; illegal users can then be identified from the credit assessment results, thereby curbing the bad videos and audios transmitted on the network. The proposed clustering ensemble model integrates two complementary strengths: swarm intelligence clustering is well suited to analysing user credit behaviour, while K-means clustering can eliminate the scattered users left over in the swarm intelligence result, so that all users are automatically assigned a credit class. Verification experiments were carried out on a standard credit application dataset from the UCI machine learning repository. The statistical results of a comparative experiment with a single swarm intelligence clustering model indicate that the clustering ensemble model has a stronger ability to distinguish creditworthiness, especially in identifying the user clusters with the best and worst credit, which helps operators to apply incentive or punitive measures accurately. Besides, compared with the experimental results of a logistic regression based model under the same conditions, the clustering ensemble model is robust and has better prediction accuracy.
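
    The swarm-intelligence stage is not reproduced here; the sketch below only illustrates the general two-stage idea, with DBSCAN standing in for the first clustering pass (it leaves scattered users unlabelled) and K-means then assigning every user, scattered ones included, to a final credit cluster. The credit-behaviour features are invented.

    # Two-stage clustering ensemble sketch: a density-based first pass finds
    # initial groups and scattered users, then K-means gives every user a label.
    import numpy as np
    from sklearn.cluster import DBSCAN, KMeans
    from sklearn.preprocessing import StandardScaler

    rng = np.random.default_rng(0)
    # Features per user: uploads flagged, complaints received, account age (months).
    good = np.column_stack([rng.poisson(0.2, 200), rng.poisson(0.5, 200), rng.uniform(6, 60, 200)])
    bad = np.column_stack([rng.poisson(8, 40), rng.poisson(6, 40), rng.uniform(0, 12, 40)])
    X = StandardScaler().fit_transform(np.vstack([good, bad]).astype(float))

    stage1 = DBSCAN(eps=0.8, min_samples=5).fit(X)
    n_groups = len(set(stage1.labels_)) - (1 if -1 in stage1.labels_ else 0)
    print("stage-1 groups:", n_groups, "scattered users:", int(np.sum(stage1.labels_ == -1)))

    # Stage 2: K-means seeded with the stage-1 group means assigns every user.
    if n_groups == 0:                      # fall back if everything was labelled noise
        n_groups, seeds = 2, "k-means++"
    else:
        seeds = np.array([X[stage1.labels_ == g].mean(axis=0) for g in range(n_groups)])
    final = KMeans(n_clusters=n_groups, init=seeds, n_init=1).fit(X)
    print("final cluster sizes:", np.bincount(final.labels_))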

  5. Clustering analysis of water distribution systems: identifying critical components and community impacts.

    Science.gov (United States)

    Diao, K; Farmani, R; Fu, G; Astaraie-Imani, M; Ward, S; Butler, D

    2014-01-01

    Large water distribution systems (WDSs) are networks with both topological and behavioural complexity. It is therefore usually difficult to identify the key properties of the system, and hence all the critical components within it, for a given purpose of design or control. One way forward, however, is to visualize the network structure and the interactions between components more explicitly by dividing a WDS into a number of clusters (subsystems). Accordingly, this paper introduces a clustering strategy that decomposes WDSs into clusters with stronger internal connections than external connections. The detected cluster layout is very similar to the community structure of the served urban area. As WDSs may expand along with urban development in a community-by-community manner, the correspondingly formed distribution clusters may reveal some crucial configurations of WDSs. For verification, the method is applied to identify all the critical links during firefighting for the vulnerability analysis of a real-world WDS. Moreover, both the most critical pipes and clusters are addressed, given the consequences of pipe failure. Compared with the enumeration method, the method used in this study identifies the same group of most critical components and provides similar criticality prioritizations of them in far less computational time.
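
    The general idea, decomposing a network into clusters with stronger internal than external connections and flagging the links that cross cluster boundaries, can be illustrated with modularity-based community detection from networkx. This is not the specific clustering strategy of the record, and the small network below is synthetic.

    # Decompose a synthetic pipe network into communities and list the links
    # that cross community boundaries as candidate critical links.
    import networkx as nx
    from networkx.algorithms.community import greedy_modularity_communities

    # Two densely connected neighbourhoods joined by a single trunk main.
    G = nx.Graph()
    G.add_edges_from(nx.gnm_random_graph(12, 24, seed=1).edges())
    G.add_edges_from((u + 12, v + 12) for u, v in nx.gnm_random_graph(12, 24, seed=2).edges())
    G.add_edge(3, 15)                                    # trunk main linking the two areas

    clusters = greedy_modularity_communities(G)
    for i, c in enumerate(clusters):
        print(f"cluster {i}: {sorted(c)}")

    # Edges between clusters are candidate critical links (e.g. for firefighting capacity).
    membership = {n: i for i, c in enumerate(clusters) for n in c}
    critical = [(u, v) for u, v in G.edges() if membership[u] != membership[v]]
    print("inter-cluster links:", critical)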

  6. Comparative analysis on the selection of number of clusters in community detection

    Science.gov (United States)

    Kawamoto, Tatsuro; Kabashima, Yoshiyuki

    2018-02-01

    We conduct a comparative analysis of various estimates of the number of clusters in community detection. An exhaustive comparison requires testing all possible combinations of frameworks, algorithms, and assessment criteria. In this paper we focus on the framework based on a stochastic block model, and investigate the performance of greedy algorithms, statistical inference, and spectral methods. For the assessment criteria, we consider modularity, the map equation, the Bethe free energy, prediction errors, and isolated eigenvalues. From the analysis, the tendencies of the assessment criteria and algorithms to overfit or underfit become apparent. In addition, we propose that the alluvial diagram is a suitable tool to visualize statistical inference results and can be useful for determining the number of clusters.
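
    As a small example of applying one of the assessment criteria named above, the sketch below scores candidate partitions with different numbers of clusters by modularity. Girvan-Newman supplies the candidate partitions here purely for illustration; the record's own framework is based on stochastic block models, not on this heuristic.

    # Score candidate partitions with different numbers of clusters by modularity.
    import networkx as nx
    from networkx.algorithms.community import girvan_newman, modularity
    from itertools import islice

    G = nx.karate_club_graph()                      # classic benchmark network

    best = None
    for partition in islice(girvan_newman(G), 8):   # partitions with 2..9 clusters
        q = modularity(G, partition)
        print(len(partition), "clusters -> modularity", round(q, 3))
        if best is None or q > best[1]:
            best = (len(partition), q)
    print("selected number of clusters:", best[0])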

  7. Statistical Significance for Hierarchical Clustering

    Science.gov (United States)

    Kimes, Patrick K.; Liu, Yufeng; Hayes, D. Neil; Marron, J. S.

    2017-01-01

    Summary Cluster analysis has proved to be an invaluable tool for the exploratory and unsupervised analysis of high dimensional datasets. Among methods for clustering, hierarchical approaches have enjoyed substantial popularity in genomics and other fields for their ability to simultaneously uncover multiple layers of clustering structure. A critical and challenging question in cluster analysis is whether the identified clusters represent important underlying structure or are artifacts of natural sampling variation. Few approaches have been proposed for addressing this problem in the context of hierarchical clustering, for which the problem is further complicated by the natural tree structure of the partition, and the multiplicity of tests required to parse the layers of nested clusters. In this paper, we propose a Monte Carlo based approach for testing statistical significance in hierarchical clustering which addresses these issues. The approach is implemented as a sequential testing procedure guaranteeing control of the family-wise error rate. Theoretical justification is provided for our approach, and its power to detect true clustering structure is illustrated through several simulation studies and applications to two cancer gene expression datasets. PMID:28099990
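
    A heavily simplified Monte Carlo test in this spirit, asking whether a 2-means split of the data is stronger than expected under a single Gaussian null, is sketched below. It is not the sequential, family-wise-error-controlling procedure proposed in the paper; the data and the number of null simulations are illustrative.

    # Monte Carlo sketch: compare an observed 2-means cluster index against its
    # distribution under a single multivariate Gaussian null.
    import numpy as np
    from sklearn.cluster import KMeans

    def cluster_index(X):
        """Within-cluster SS of a 2-means split divided by total SS (smaller = stronger)."""
        km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
        return km.inertia_ / np.sum((X - X.mean(axis=0)) ** 2)

    rng = np.random.default_rng(0)
    # Observed data: two shifted Gaussian groups.
    X = np.vstack([rng.normal(0, 1, (30, 5)), rng.normal(1.5, 1, (30, 5))])
    observed = cluster_index(X)

    # Null distribution: a single Gaussian with the observed mean and covariance.
    mean, cov = X.mean(axis=0), np.cov(X, rowvar=False)
    null = [cluster_index(rng.multivariate_normal(mean, cov, size=len(X)))
            for _ in range(200)]

    p_value = (1 + np.sum(np.array(null) <= observed)) / (1 + len(null))
    print(f"observed index={observed:.3f}  Monte Carlo p={p_value:.3f}")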

  8. FLOCKING-BASED DOCUMENT CLUSTERING ON THE GRAPHICS PROCESSING UNIT [Book Chapter

    Energy Technology Data Exchange (ETDEWEB)

    Charles, J S; Patton, R M; Potok, T E; Cui, X

    2008-01-01

    Analyzing and grouping documents by content is a complex problem. One explored method of solving this problem borrows from nature, imitating the flocking behavior of birds. Each bird represents a single document and flies toward other documents that are similar to it. One limitation of this method of document clustering is its complexity O(n²). As the number of documents grows, it becomes increasingly difficult to receive results in a reasonable amount of time. However, flocking behavior, along with most naturally inspired algorithms such as ant colony optimization and particle swarm optimization, are highly parallel and have experienced improved performance on expensive cluster computers. In the last few years, the graphics processing unit (GPU) has received attention for its ability to solve highly-parallel and semi-parallel problems much faster than the traditional sequential processor. Some applications see a huge increase in performance on this new platform. The cost of these high-performance devices is also marginal when compared with the price of cluster machines. In this paper, we have conducted research to exploit this architecture and apply its strengths to the document flocking problem. Our results highlight the potential benefit the GPU brings to all naturally inspired algorithms. Using the CUDA platform from NVIDIA®, we developed a document flocking implementation to be run on the NVIDIA® GEFORCE 8800. Additionally, we developed a similar but sequential implementation of the same algorithm to be run on a desktop CPU. We tested the performance of each on groups of news articles ranging in size from 200 to 3,000 documents. The results of these tests were very significant. Performance gains ranged from three to nearly five times improvement of the GPU over the CPU implementation. This dramatic improvement in runtime makes the GPU a potentially revolutionary platform for document clustering algorithms.

  9. Vector Nonlinear Time-Series Analysis of Gamma-Ray Burst Datasets on Heterogeneous Clusters

    Directory of Open Access Journals (Sweden)

    Ioana Banicescu

    2005-01-01

    Full Text Available The simultaneous analysis of a number of related datasets using a single statistical model is an important problem in statistical computing. A parameterized statistical model is to be fitted on multiple datasets and tested for goodness of fit within a fixed analytical framework. Definitive conclusions are hopefully achieved by analyzing the datasets together. This paper proposes a strategy for the efficient execution of this type of analysis on heterogeneous clusters. Based on partitioning processors into groups for efficient communications and a dynamic loop scheduling approach for load balancing, the strategy addresses the variability of the computational loads of the datasets, as well as the unpredictable irregularities of the cluster environment. Results from preliminary tests of using this strategy to fit gamma-ray burst time profiles with vector functional coefficient autoregressive models on 64 processors of a general purpose Linux cluster demonstrate the effectiveness of the strategy.

  10. Analysis of the Motion Control Methods for Stratospheric Balloon-Borne Gondola Platform

    International Nuclear Information System (INIS)

    Wang, H H; Yuan, Z H; Wu, J

    2006-01-01

    At present, the gondola platform is one of the stratospheric balloon-borne platforms attracting research attention both domestically and overseas. Compared to other stratospheric balloon-borne platforms, such as the airship platform, the gondola platform has the advantages of higher stability, rapid motion regulation and lower energy cost, but the disadvantages of a smaller supporting capacity and the inability to hold a fixed position. All such platforms share the same goal of keeping the instruments and objects installed on them at an accurate angle and in the right pose while the platform rotates about the local vertical; that is what motion control accomplishes. However, because it hangs beneath the balloon in the air, the platform control system suffers from low damping and excessive, uncertain disturbances, and it is hard to achieve the desired control precision because the platform easily deviates from its benchmark motion. Thus, to obtain higher precision in the control procedure, it is crucial to sense the platform's swing promptly and continuously, restrain the influence of disturbances effectively, and keep the platform's pose steady. Furthermore, since the airborne platform takes the ground control centre as its reference object, it is essential to select an appropriate reference frame, work out the coordinates and implement the adjustment with the PC104 controller. This paper introduces methods of motion control for the stratospheric balloon-borne gondola platform. Firstly, it compares the characteristics of the flywheel and the control moment gyroscope (CMG) and specifies the key methods of obtaining two significant states, the 'orientation stability' state and the 'orientation tracking' state, for the platform motion control procedure using the CMG as the control actuator. These two states reduce the amplitude of the deviation in rotation and swing of the gondola's motion, relative to its original motion, caused by intense stratospheric atmospheric disturbance. We define this as the first procedure. In next

  11. Analysis of the Motion Control Methods for Stratospheric Balloon-Borne Gondola Platform

    Science.gov (United States)

    Wang, H. H.; Yuan, Z. H.; Wu, J.

    2006-10-01

    At present, the gondola platform is one of the stratospheric balloon-borne platforms attracting research attention both domestically and overseas. Compared to other stratospheric balloon-borne platforms, such as the airship platform, the gondola platform has the advantages of higher stability, rapid motion regulation and lower energy cost, but the disadvantages of a smaller supporting capacity and the inability to hold a fixed position. All such platforms share the same goal of keeping the instruments and objects installed on them at an accurate angle and in the right pose while the platform rotates about the local vertical; that is what motion control accomplishes. However, because it hangs beneath the balloon in the air, the platform control system suffers from low damping and excessive, uncertain disturbances, and it is hard to achieve the desired control precision because the platform easily deviates from its benchmark motion. Thus, to obtain higher precision in the control procedure, it is crucial to sense the platform's swing promptly and continuously, restrain the influence of disturbances effectively, and keep the platform's pose steady. Furthermore, since the airborne platform takes the ground control centre as its reference object, it is essential to select an appropriate reference frame, work out the coordinates and implement the adjustment with the PC104 controller. This paper introduces methods of motion control for the stratospheric balloon-borne gondola platform. Firstly, it compares the characteristics of the flywheel and the control moment gyroscope (CMG) and specifies the key methods of obtaining two significant states, the 'orientation stability' state and the 'orientation tracking' state, for the platform motion control procedure using the CMG as the control actuator. These two states reduce the amplitude of the deviation in rotation and swing of the gondola's motion, relative to its original motion, caused by intense stratospheric atmospheric disturbance. We define this as the first procedure. In next

  12. Parallelization and scheduling of data intensive particle physics analysis jobs on clusters of PCs

    CERN Document Server

    Ponce, S

    2004-01-01

    Summary form only given. Scheduling policies are proposed for parallelizing data intensive particle physics analysis applications on computer clusters. Particle physics analysis jobs require the analysis of tens of thousands of particle collision events, each event requiring typically 200ms processing time and 600KB of data. Many jobs are launched concurrently by a large number of physicists. At a first view, particle physics jobs seem to be easy to parallelize, since particle collision events can be processed independently one from another. However, since large amounts of data need to be accessed, the real challenge resides in making an efficient use of the underlying computing resources. We propose several job parallelization and scheduling policies aiming at reducing job processing times and at increasing the sustainable load of a cluster server. Since particle collision events are usually reused by several jobs, cache based job splitting strategies considerably increase cluster utilization and reduce job ...

  13. Developing a New Wireless Sensor Network Platform and Its Application in Precision Agriculture

    Science.gov (United States)

    Aquino-Santos, Raúl; González-Potes, Apolinar; Edwards-Block, Arthur; Virgen-Ortiz, Raúl Alejandro

    2011-01-01

    Wireless sensor networks are gaining greater attention from the research community and industrial professionals because these small pieces of “smart dust” offer great advantages due to their small size, low power consumption, easy integration and support for “green” applications. Green applications are considered a hot topic in intelligent environments, ubiquitous and pervasive computing. This work evaluates a new wireless sensor network platform and its application in precision agriculture, including its embedded operating system and its routing algorithm. To validate the technological platform and the embedded operating system, two different routing strategies were compared: hierarchical and flat. Both of these routing algorithms were tested in a small-scale network applied to a watermelon field. However, we strongly believe that this technological platform can be also applied to precision agriculture because it incorporates a modified version of LORA-CBF, a wireless location-based routing algorithm that uses cluster-based flooding. Cluster-based flooding addresses the scalability concerns of wireless sensor networks, while the modified LORA-CBF routing algorithm includes a metric to monitor residual battery energy. Furthermore, results show that the modified version of LORA-CBF functions well with both the flat and hierarchical algorithms, although it functions better with the flat algorithm in a small-scale agricultural network. PMID:22346622

  14. Developing a new wireless sensor network platform and its application in precision agriculture.

    Science.gov (United States)

    Aquino-Santos, Raúl; González-Potes, Apolinar; Edwards-Block, Arthur; Virgen-Ortiz, Raúl Alejandro

    2011-01-01

    Wireless sensor networks are gaining greater attention from the research community and industrial professionals because these small pieces of "smart dust" offer great advantages due to their small size, low power consumption, easy integration and support for "green" applications. Green applications are considered a hot topic in intelligent environments, ubiquitous and pervasive computing. This work evaluates a new wireless sensor network platform and its application in precision agriculture, including its embedded operating system and its routing algorithm. To validate the technological platform and the embedded operating system, two different routing strategies were compared: hierarchical and flat. Both of these routing algorithms were tested in a small-scale network applied to a watermelon field. However, we strongly believe that this technological platform can be also applied to precision agriculture because it incorporates a modified version of LORA-CBF, a wireless location-based routing algorithm that uses cluster-based flooding. Cluster-based flooding addresses the scalability concerns of wireless sensor networks, while the modified LORA-CBF routing algorithm includes a metric to monitor residual battery energy. Furthermore, results show that the modified version of LORA-CBF functions well with both the flat and hierarchical algorithms, although it functions better with the flat algorithm in a small-scale agricultural network.
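
    The sketch below only illustrates the general idea of factoring residual battery energy into cluster-head selection in a sensor network; it is not LORA-CBF, and the scoring rule favouring central, well-charged nodes is an assumption made for illustration.

    # Energy-aware cluster-head selection sketch: group nodes geographically,
    # then pick as head the member with the best energy/centrality trade-off.
    import numpy as np

    rng = np.random.default_rng(0)
    n_nodes, n_clusters = 30, 4
    pos = rng.uniform(0, 100, size=(n_nodes, 2))        # node positions in a 100 m field
    energy = rng.uniform(0.2, 1.0, size=n_nodes)        # residual battery (fraction)

    # Assign nodes to geographic clusters with a simple k-means-style iteration.
    centroids = pos[rng.choice(n_nodes, n_clusters, replace=False)]
    for _ in range(10):
        d = np.linalg.norm(pos[:, None, :] - centroids[None, :, :], axis=2)
        assign = d.argmin(axis=1)
        centroids = np.array([pos[assign == c].mean(axis=0) if np.any(assign == c)
                              else centroids[c] for c in range(n_clusters)])

    # Cluster head = member with the best trade-off of residual energy and centrality.
    for c in range(n_clusters):
        members = np.where(assign == c)[0]
        dist_to_centroid = np.linalg.norm(pos[members] - centroids[c], axis=1)
        score = energy[members] - 0.005 * dist_to_centroid   # weights are arbitrary
        head = members[score.argmax()]
        print(f"cluster {c}: head=node {head} (energy={energy[head]:.2f})")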

  15. The association between mood state and chronobiological characteristics in bipolar I disorder: a naturalistic, variable cluster analysis-based study.

    Science.gov (United States)

    Gonzalez, Robert; Suppes, Trisha; Zeitzer, Jamie; McClung, Colleen; Tamminga, Carol; Tohen, Mauricio; Forero, Angelica; Dwivedi, Alok; Alvarado, Andres

    2018-02-19

    Multiple types of chronobiological disturbances have been reported in bipolar disorder, including characteristics associated with general activity levels, sleep, and rhythmicity. Previous studies have focused on examining the individual relationships between affective state and chronobiological characteristics. The aim of this study was to conduct a variable cluster analysis in order to ascertain how mood states are associated with chronobiological traits in bipolar I disorder (BDI). We hypothesized that manic symptomatology would be associated with disturbances of rhythm. Variable cluster analysis identified five chronobiological clusters in 105 BDI subjects. Cluster 1, comprising subjective sleep quality was associated with both mania and depression. Cluster 2, which comprised variables describing the degree of rhythmicity, was associated with mania. Significant associations between mood state and cluster analysis-identified chronobiological variables were noted. Disturbances of mood were associated with subjectively assessed sleep disturbances as opposed to objectively determined, actigraphy-based sleep variables. No associations with general activity variables were noted. Relationships between gender and medication classes in use and cluster analysis-identified chronobiological characteristics were noted. Exploratory analyses noted that medication class had a larger impact on these relationships than the number of psychiatric medications in use. In a BDI sample, variable cluster analysis was able to group related chronobiological variables. The results support our primary hypothesis that mood state, particularly mania, is associated with chronobiological disturbances. Further research is required in order to define these relationships and to determine the directionality of the associations between mood state and chronobiological characteristics.

  16. A Novel Double Cluster and Principal Component Analysis-Based Optimization Method for the Orbit Design of Earth Observation Satellites

    Directory of Open Access Journals (Sweden)

    Yunfeng Dong

    2017-01-01

    Full Text Available The weighted sum and genetic algorithm-based hybrid method (WSGA-based HM), which has been applied to multiobjective orbit optimizations, is negatively influenced by human factors, through the artificial choice of the weight coefficients in the weighted sum method, and by the slow convergence of the GA. To address these two problems, a cluster and principal component analysis-based optimization method (CPC-based OM) is proposed, in which many candidate orbits are gradually randomly generated until the optimal orbit is obtained using a data mining method, namely cluster analysis based on principal components. Then, a second cluster analysis of the orbital elements is introduced into CPC-based OM to improve the convergence, developing a novel double cluster and principal component analysis-based optimization method (DCPC-based OM). In DCPC-based OM, the cluster analysis based on principal components has the advantage of reducing human influence, and the cluster analysis based on the six orbital elements reduces the search space to effectively accelerate convergence. The test results from a multiobjective numerical benchmark function and the orbit design results of an Earth observation satellite show that DCPC-based OM converges more efficiently than WSGA-based HM, and that DCPC-based OM, to some degree, reduces the influence of the human factors present in WSGA-based HM.
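
    A toy version of the data-mining step, generating candidate orbits, clustering them in the principal-component space of their objective values, and keeping the most promising cluster for the next round, is sketched below. The objective functions are invented and the loop is greatly simplified relative to DCPC-based OM.

    # Toy candidate-filtering loop: PCA on objective values, K-means clustering,
    # keep the best cluster and top up with fresh random orbits.
    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(0)

    def objectives(orbits):
        """Invented objectives (all minimised): revisit, coverage-gap and fuel proxies."""
        a, e, i = orbits[:, 0], orbits[:, 1], orbits[:, 2]
        f1 = (a - 7000.0) ** 2 / 1e6 + 10.0 * e
        f2 = np.abs(i - 98.0) / 98.0 + 5.0 * e
        f3 = (a - 6800.0) / 1000.0 + 20.0 * e
        return np.column_stack([f1, f2, f3])

    def random_orbits(n):
        # Candidate orbits: semi-major axis (km), eccentricity, inclination (deg).
        return np.column_stack([rng.uniform(6800, 7500, n),
                                rng.uniform(0.0, 0.05, n),
                                rng.uniform(90.0, 110.0, n)])

    cand = random_orbits(200)
    for round_ in range(3):
        F = objectives(cand)
        # Cluster candidates on the principal components of their objective values.
        pcs = PCA(n_components=2).fit_transform(F)
        labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(pcs)
        # Keep the cluster with the best mean objective sum; top up with fresh orbits.
        best = min(range(4), key=lambda c: F[labels == c].sum(axis=1).mean())
        keep = cand[labels == best]
        cand = np.vstack([keep, random_orbits(200 - len(keep))])
        print(f"round {round_}: kept {len(keep)} orbits,",
              "mean objectives =", np.round(F[labels == best].mean(axis=0), 3))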

  17. Fuzzy cluster means algorithm for the diagnosis of confusable disease

    African Journals Online (AJOL)

    ... end platform, while Microsoft Access was used as the database application. The system gives a measure of each disease within a set of confusable diseases. The proposed system had a classification accuracy of 60%. Keywords: artificial intelligence, expert system, fuzzy cluster-means algorithm, physician, diagnosis ...
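
    The record gives no implementation detail, but the membership degrees that let one case belong partially to several confusable disease clusters can be illustrated with a compact fuzzy c-means written in plain NumPy. The symptom vectors are invented; this is not the system from the record above.

    # Compact fuzzy c-means: each case receives a membership degree in every cluster.
    import numpy as np

    def fuzzy_c_means(X, c, m=2.0, n_iter=100, seed=0):
        rng = np.random.default_rng(seed)
        U = rng.dirichlet(np.ones(c), size=len(X))          # memberships sum to 1 per case
        for _ in range(n_iter):
            W = U ** m
            centers = (W.T @ X) / W.sum(axis=0)[:, None]    # fuzzily weighted cluster centres
            d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-9
            # Standard membership update: u_ik = 1 / sum_j (d_ik / d_ij)^(2/(m-1))
            U = 1.0 / np.sum((d[:, :, None] / d[:, None, :]) ** (2 / (m - 1)), axis=2)
        return centers, U

    # Toy symptom scores (fever, headache, rash) for cases of two confusable diseases.
    X = np.array([[8, 6, 1], [7, 7, 2], [8, 5, 1],
                  [3, 2, 8], [2, 3, 9], [4, 2, 7], [5, 4, 5.0]])
    centers, U = fuzzy_c_means(X, c=2)
    print(np.round(U, 2))     # membership of each case in each disease cluster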

  18. Cluster Analysis of Flow Cytometric List Mode Data on a Personal Computer

    NARCIS (Netherlands)

    Bakker Schut, Tom C.; Bakker schut, T.C.; de Grooth, B.G.; Greve, Jan

    1993-01-01

    A cluster analysis algorithm, dedicated to analysis of flow cytometric data is described. The algorithm is written in Pascal and implemented on an MS-DOS personal computer. It uses k-means, initialized with a large number of seed points, followed by a modified nearest neighbor technique to reduce
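
    A present-day NumPy sketch of the strategy described, K-means started from many seed points followed by merging of centres that end up close together (a simple stand-in for the modified nearest-neighbour step; the original program was written in Pascal), might look as follows, with simulated two-parameter events.

    # K-means with many seed points, then single-linkage merging of nearby centres.
    import numpy as np
    from sklearn.cluster import KMeans
    from scipy.cluster.hierarchy import linkage, fcluster

    rng = np.random.default_rng(0)
    # Simulated 2-parameter events from three cell populations.
    events = np.vstack([rng.normal([200, 50], 15, (1000, 2)),
                        rng.normal([400, 300], 25, (1000, 2)),
                        rng.normal([120, 350], 20, (500, 2))])

    # Step 1: K-means with a deliberately large number of seed points.
    km = KMeans(n_clusters=40, n_init=1, random_state=0).fit(events)

    # Step 2: merge centres closer than a threshold via single-linkage grouping.
    merge = fcluster(linkage(km.cluster_centers_, method="single"),
                     t=60.0, criterion="distance")
    final_labels = merge[km.labels_] - 1
    print("populations found:", len(np.unique(final_labels)))
    print("events per population:", np.bincount(final_labels))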

  19. Intelligent Tools for Building a Scientific Information Platform

    CERN Document Server

    Skonieczny, Lukasz; Rybiński, Henryk; Niezgodka, Marek

    2012-01-01

    This book is a selection of results obtained within one year of research performed under SYNAT - a nation-wide scientific project aiming to create an infrastructure for scientific content storage and sharing for academia, education and open knowledge society in Poland. The selection refers to the research in artificial intelligence, knowledge discovery and data mining, information retrieval and natural language processing, addressing the problems of implementing intelligent tools for building a scientific information platform. The idea of this book is based on the very successful SYNAT Project Conference and the SYNAT Workshop accompanying the 19th International Symposium on Methodologies for Intelligent Systems (ISMIS 2011). The papers included in this book present an overview and insight into such topics as architecture of scientific information platforms, semantic clustering, ontology-based systems, as well as, multimedia data processing.

  20. Methodology for development of risk indicators for offshore platforms

    International Nuclear Information System (INIS)

    Oeien, K.; Sklet, S.

    1999-01-01

    This paper presents a generic methodology for development of risk indicators for petroleum installations and a specific set of risk indicators established for one offshore platform. The risk indicators should be used to control the risk during operation of platforms. The methodology is purely risk-based and the basis for development of risk indicators is the platform specific quantitative risk analysis (QRA). In order to identify high risk contributing factors, platform personnel are asked to assess whether and how much the risk influencing factors will change. A brief comparison of probabilistic safety assessment (PSA) for nuclear power plants and quantitative risk analysis (QRA) for petroleum platforms is also given. (au)