WorldWideScience

Sample records for data analysis

  1. Practical data analysis

    CERN Document Server

    Cuesta, Hector

    2013-01-01

    Each chapter of the book quickly introduces a key 'theme' of Data Analysis, before immersing you in the practical aspects of each theme. You'll quickly learn how to perform all aspects of Data Analysis. Practical Data Analysis is an ideal book for home and small business users who want to slice & dice the data they have on hand with minimum hassle.

  2. Data analysis workbench

    International Nuclear Information System (INIS)

    Goetz, A.; Gerring, M.; Svensson, O.; Brockhauser, S.

    2012-01-01

    Data Analysis Workbench (DAWB) is a new software tool being developed at the ESRF. Its goal is to provide a tool for online data analysis on the beamlines and for offline data analysis which users can run during experiments or take home. The tool includes support for data visualization and workflows. Workflows allow algorithms which exploit parallel architectures to be designed from existing high-level modules for data analysis in combination with data collection. The workbench uses Passerelle as the workflow engine and EDNA plug-ins for data analysis. Actors talking to Tango are used to send commands to a limited set of hardware to start existing data collection algorithms. A Tango server allows workflows to be executed from existing applications. There are scripting interfaces to Python, Javascript and SPEC. The workbench is currently being tested on a selected number of beamlines at the ESRF. (authors)
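
    For illustration only: a minimal sketch, assuming the PyTango client library, of the kind of call such an actor might make to start a data-collection command on a Tango device. The device name and command name below are invented and are not taken from the DAWB description above.

        # Hedged sketch assuming PyTango; device and command names are invented.
        import tango

        device = tango.DeviceProxy("id00/collect/1")   # hypothetical device
        device.command_inout("StartCollection")        # hypothetical command
        print("device state:", device.state())         # query the Tango state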

  3. Mastering Clojure data analysis

    CERN Document Server

    Rochester, Eric

    2014-01-01

    This book consists of a practical, example-oriented approach that aims to help you learn how to use Clojure for data analysis quickly and efficiently. This book is great for those who have experience with Clojure and who need to use it to perform data analysis. This book will also be hugely beneficial for readers with basic experience in data analysis and statistics.

  4. Functional data analysis

    CERN Document Server

    Ramsay, J O

    1997-01-01

    Scientists today collect samples of curves and other functional observations. This monograph presents many ideas and techniques for such data. Included are expressions in the functional domain of such classics as linear regression, principal components analysis, linear modelling, and canonical correlation analysis, as well as specifically functional techniques such as curve registration and principal differential analysis. Data arising in real applications are used throughout for both motivation and illustration, showing how functional approaches allow us to see new things, especially by exploiting the smoothness of the processes generating the data. The data sets exemplify the wide scope of functional data analysis; they are drawn from growth analysis, meteorology, biomechanics, equine science, economics, and medicine. The book presents novel statistical technology while keeping the mathematical level widely accessible. It is designed to appeal to students, to applied data analysts, and to experienced researc...
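
    A minimal sketch of one core idea mentioned above, functional principal components via a smooth basis expansion, in Python with NumPy. The Fourier basis, sample curves, and all numbers are invented for illustration and are not taken from the book.

        # Sketch: represent each observed curve in a small smooth basis,
        # then run PCA on the basis coefficients (functional PCA).
        import numpy as np

        t = np.linspace(0, 1, 50)
        rng = np.random.default_rng(0)
        # 20 noisy synthetic curves sharing a smooth structure
        curves = (np.sin(2 * np.pi * t)
                  + rng.normal(0, 0.3, (20, 1)) * np.cos(2 * np.pi * t)
                  + rng.normal(0, 0.1, (20, 50)))

        # Small Fourier basis: smoothness enters through the truncation
        basis = np.column_stack([np.ones_like(t),
                                 np.sin(2 * np.pi * t), np.cos(2 * np.pi * t),
                                 np.sin(4 * np.pi * t), np.cos(4 * np.pi * t)])
        coefs, *_ = np.linalg.lstsq(basis, curves.T, rcond=None)  # (5, 20)

        # PCA on the coefficients is functional PCA in this basis
        centered = coefs.T - coefs.T.mean(axis=0)
        _, s, vt = np.linalg.svd(centered, full_matrices=False)
        print("variance explained:", np.round(s**2 / np.sum(s**2), 3))
        pc1 = basis @ vt[0]   # first principal component, as a function of t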

  5. Statistical data analysis

    International Nuclear Information System (INIS)

    Hahn, A.A.

    1994-11-01

    The complexity of instrumentation sometimes requires data analysis to be done before the result is presented to the control room. This tutorial reviews some of the theoretical assumptions underlying the more popular forms of data analysis and presents simple examples to illuminate the advantages and hazards of different techniques

  6. Learning Haskell data analysis

    CERN Document Server

    Church, James

    2015-01-01

    If you are a developer, analyst, or data scientist who wants to learn data analysis methods using Haskell and its libraries, then this book is for you. Prior experience with Haskell and a basic knowledge of data science will be beneficial.

  7. Chapter 8. Data Analysis

    Science.gov (United States)

    Lyman L. McDonald; Christina D. Vojta; Kevin S. McKelvey

    2013-01-01

    Perhaps the greatest barrier between monitoring and management is data analysis. Data languish in drawers and spreadsheets because those who collect or maintain monitoring data lack training in how to effectively summarize and analyze their findings. This chapter serves as a first step to surmounting that barrier by empowering any monitoring team with the basic...

  8. Python data analysis

    CERN Document Server

    Idris, Ivan

    2014-01-01

    This book is for programmers, scientists, and engineers who have knowledge of the Python language and know the basics of data science. It is for those who wish to learn different data analysis methods using Python and its libraries. This book contains all the basic ingredients you need to become an expert data analyst.

  9. Bayesian nonparametric data analysis

    CERN Document Server

    Müller, Peter; Jara, Alejandro; Hanson, Tim

    2015-01-01

    This book reviews nonparametric Bayesian methods and models that have proven useful in the context of data analysis. Rather than providing an encyclopedic review of probability models, the book’s structure follows a data analysis perspective. As such, the chapters are organized by traditional data analysis problems. In selecting specific nonparametric models, simpler and more traditional models are favored over specialized ones. The discussed methods are illustrated with a wealth of examples, including applications ranging from stylized examples to case studies from recent literature. The book also includes an extensive discussion of computational methods and details on their implementation. R code for many examples is included in on-line software pages.

  10. Fuzzy data analysis

    CERN Document Server

    Bandemer, Hans

    1992-01-01

    Fuzzy data such as marks, scores, verbal evaluations, imprecise observations, experts' opinions and grey tone pictures, are quite common. In Fuzzy Data Analysis the authors collect their recent results providing the reader with ideas, approaches and methods for processing such data when looking for sub-structures in knowledge bases for an evaluation of functional relationships, e.g. in order to specify diagnostic or control systems. The modelling presented uses ideas from fuzzy set theory and the suggested methods solve problems usually tackled by data analysis if the data are real numbers. Fuzzy Data Analysis is self-contained and is addressed to mathematicians oriented towards applications and to practitioners in any field of application who have some background in mathematics and statistics.

  11. Longitudinal categorical data analysis

    CERN Document Server

    Sutradhar, Brajendra C

    2014-01-01

    This is the first book in longitudinal categorical data analysis with parametric correlation models developed based on dynamic relationships among repeated categorical responses. This book is a natural generalization of the longitudinal binary data analysis to the multinomial data setup with more than two categories. Thus, unlike the existing books on cross-sectional categorical data analysis using log linear models, this book uses multinomial probability models both in cross-sectional and longitudinal setups. A theoretical foundation is provided for the analysis of univariate multinomial responses, by developing models systematically for the cases with no covariates as well as categorical covariates, both in cross-sectional and longitudinal setups. In the longitudinal setup, both stationary and non-stationary covariates are considered. These models have also been extended to the bivariate multinomial setup along with suitable covariates. For the inferences, the book uses the generalized quasi-likelihood as w...

  12. Qualitative Data Analysis Strategies

    OpenAIRE

    Greaves, Kristoffer

    2014-01-01

    A set of concept maps for qualitative data analysis strategies, inspired by Corbin, JM & Strauss, AL 2008, Basics of qualitative research: Techniques and procedures for developing grounded theory, 3rd edn, Sage Publications, Inc, Thousand Oaks, California.

  13. Statistical data analysis handbook

    National Research Council Canada - National Science Library

    Wall, Francis J

    1986-01-01

    It must be emphasized that this is not a text book on statistics. Instead it is a working tool that presents data analysis in clear, concise terms which can be readily understood even by those without formal training in statistics...

  14. Workbook on data analysis

    Energy Technology Data Exchange (ETDEWEB)

    Hopke, P K [Department of Chemistry, Clarkson Univ., Potsdam, NY (United States)]

    2000-07-01

    As a consequence of various IAEA programmes to sample airborne particulate matter and determine its elemental composition, the participating research groups are accumulating data on the composition of the atmospheric aerosol. It is necessary to consider ways in which these data can be utilized in order to be certain that the data obtained are correct and that the information then being transmitted to others who may make decisions based on such information is as representative and correct as possible. In order to both examine the validity of those data and extract appropriate information from them, it is necessary to utilize a variety of data analysis methods. The objective of this workbook is to provide a guide with examples of utilizing data analysis on airborne particle composition data using a spreadsheet program (EXCEL) and a personal computer based statistical package (StatGraphics)

  15. Workbook on data analysis

    International Nuclear Information System (INIS)

    Hopke, P.K.

    2000-01-01

    As a consequence of various IAEA programmes to sample airborne particulate matter and determine its elemental composition, the participating research groups are accumulating data on the composition of the atmospheric aerosol. It is necessary to consider ways in which these data can be utilized in order to be certain that the data obtained are correct and that the information then being transmitted to others who may make decisions based on such information is as representative and correct as possible. In order to both examine the validity of those data and extract appropriate information from them, it is necessary to utilize a variety of data analysis methods. The objective of this workbook is to provide a guide with examples of utilizing data analysis on airborne particle composition data using a spreadsheet program (EXCEL) and a personal computer based statistical package (StatGraphics)

  16. The data analysis handbook

    CERN Document Server

    Frank, IE

    1994-01-01

    Analyzing observed or measured data is an important step in applied sciences. The recent increase in computer capacity has resulted in a revolution both in data collection and data analysis. An increasing number of scientists, researchers and students are venturing into statistical data analysis; hence the need for more guidance in this field, which was previously dominated mainly by statisticians. This handbook fills the gap in the range of textbooks on data analysis. Written in a dictionary format, it will serve as a comprehensive reference book in a rapidly growing field. However, this book is more structured than an ordinary dictionary, where each entry is a separate, self-contained entity. The authors provide not only definitions and short descriptions, but also offer an overview of the different topics. Therefore, the handbook can also be used as a companion to textbooks for undergraduate or graduate courses. 1700 entries are given in alphabetical order grouped into 20 topics and each topic is organized...

  17. Haskell data analysis cookbook

    CERN Document Server

    Shukla, Nishant

    2014-01-01

    Step-by-step recipes filled with practical code samples and engaging examples demonstrate Haskell in practice, and then the concepts behind the code. This book shows functional developers and analysts how to leverage their existing knowledge of Haskell specifically for high-quality data analysis. A good understanding of data sets and functional programming is assumed.

  18. Efficient Incremental Data Analysis

    OpenAIRE

    Nikolic, Milos

    2016-01-01

    Many data-intensive applications require real-time analytics over streaming data. In a growing number of domains -- sensor network monitoring, social web applications, clickstream analysis, high-frequency algorithmic trading, and fraud detection to name a few -- applications continuously monitor stream events to promptly react to certain data conditions. These applications demand responsive analytics even when faced with high volume and velocity of incoming changes, large numbers of users, a...

  19. Analysis of neural data

    CERN Document Server

    Kass, Robert E; Brown, Emery N

    2014-01-01

    Continual improvements in data collection and processing have had a huge impact on brain research, producing data sets that are often large and complicated. By emphasizing a few fundamental principles, and a handful of ubiquitous techniques, Analysis of Neural Data provides a unified treatment of analytical methods that have become essential for contemporary researchers. Throughout the book ideas are illustrated with more than 100 examples drawn from the literature, ranging from electrophysiology, to neuroimaging, to behavior. By demonstrating the commonality among various statistical approaches the authors provide the crucial tools for gaining knowledge from diverse types of data. Aimed at experimentalists with only high-school level mathematics, as well as computationally-oriented neuroscientists who have limited familiarity with statistics, Analysis of Neural Data serves as both a self-contained introduction and a reference work.

  20. Exascale Data Analysis

    CERN Multimedia

    CERN. Geneva; Fitch, Blake

    2011-01-01

    Traditionally, the primary role of supercomputers was to create data, primarily for simulation applications. Due to usage and technology trends, supercomputers are increasingly also used for data analysis. Some of this data is from simulations, but there is also a rapidly increasing amount of real-world science and business data to be analyzed. We briefly overview Blue Gene and other current supercomputer architectures. We outline future architectures, up to the Exascale supercomputers expected in the 2020 time frame. We focus on the data analysis challenges and opportunities, especially those concerning Flash and other up-and-coming storage class memory. About the speakers: Blake G. Fitch has been with IBM Research, Yorktown Heights, NY since 1987, mainly pursuing interests in parallel systems. He joined the Scalable Parallel Systems Group in 1990, contributing to research and development that culminated in the IBM scalable parallel system (SP*) product. His research interests have focused on applicatio...

  1. High-dimensional data analysis

    CERN Document Server

    Cai, Tony

    2010-01-01

    Over the last few years, significant developments have been taking place in high-dimensional data analysis, driven primarily by a wide range of applications in many fields such as genomics and signal processing. In particular, substantial advances have been made in the areas of feature selection, covariance estimation, classification and regression. This book intends to examine important issues arising from high-dimensional data analysis to explore key ideas for statistical inference and prediction. It is structured around topics on multiple hypothesis testing, feature selection, regression, cla
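
    As a hedged illustration of the multiple-hypothesis-testing topic mentioned above, here is a minimal NumPy sketch of the Benjamini-Hochberg step-up procedure for false discovery rate control; the simulated p-values are invented.

        # Benjamini-Hochberg FDR control across many features (sketch).
        import numpy as np

        def benjamini_hochberg(pvals, q=0.10):
            """Return a boolean mask of rejected hypotheses at FDR level q."""
            p = np.asarray(pvals)
            m = p.size
            order = np.argsort(p)
            thresh = q * np.arange(1, m + 1) / m     # BH step-up boundaries
            below = p[order] <= thresh
            k = np.max(np.nonzero(below)[0]) + 1 if below.any() else 0
            reject = np.zeros(m, dtype=bool)
            reject[order[:k]] = True
            return reject

        rng = np.random.default_rng(1)
        pvals = np.concatenate([rng.uniform(size=95),       # nulls
                                rng.uniform(0, 1e-3, 5)])   # a few signals
        print("rejected:", benjamini_hochberg(pvals).sum(), "of", pvals.size)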

  2. FERRET data analysis code

    International Nuclear Information System (INIS)

    Schmittroth, F.

    1979-09-01

    A documentation of the FERRET data analysis code is given. The code provides a way to combine related measurements and calculations in a consistent evaluation. Basically a very general least-squares code, it is oriented towards problems frequently encountered in nuclear data and reactor physics. A strong emphasis is on the proper treatment of uncertainties and correlations and on providing quantitative uncertainty estimates. Documentation includes a review of the method, structure of the code, input formats, and examples
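
    The FERRET source itself is not shown here; the following is a minimal generalized least-squares sketch in Python of the idea described above: combining two correlated measurements of one quantity with a quantitative uncertainty estimate. All numbers are invented.

        # Minimal GLS sketch (illustrative only, not the FERRET code):
        # combine two correlated measurements y of the same quantity.
        import numpy as np

        y = np.array([10.2, 9.6])           # two measurements (invented)
        V = np.array([[0.25, 0.10],         # covariance: variances on the
                      [0.10, 0.16]])        # diagonal, correlation off it
        A = np.ones((2, 1))                 # both measure the same parameter

        Vinv = np.linalg.inv(V)
        var_hat = np.linalg.inv(A.T @ Vinv @ A)   # variance of the estimate
        x_hat = var_hat @ A.T @ Vinv @ y          # GLS combined value

        print(f"combined = {x_hat[0]:.3f} +/- {np.sqrt(var_hat[0, 0]):.3f}")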

  3. Thematic mapper data analysis

    Science.gov (United States)

    Settle, M.; Chavez, P.; Kieffer, H. H.; Everett, J. R.; Kahle, A. B.; Kitcho, C. A.; Milton, N. M.; Mouat, D. A.

    1983-01-01

    The geological applications of remote sensing technology are discussed, with emphasis given to the analysis of data from the Thematic Mapper (TM) instrument onboard the Landsat 4 satellite. The flight history and design characteristics of the Landsat 4/TM are reviewed, and some difficulties encountered in the interpretation of raw TM data are discussed, including: the volume of data; residual noise; detector-to-detector striping; and spatial misregistration between measurements. Preliminary results of several geological, lithological, geobotanical mapping experiments are presented as examples of the geological applications of the TM, and some areas for improving the quality of TM imagery are identified.

  4. Data analysis with Mplus

    CERN Document Server

    Geiser, Christian

    2012-01-01

    A practical introduction to using Mplus for the analysis of multivariate data, this volume provides step-by-step guidance, complete with real data examples, numerous screen shots, and output excerpts. The author shows how to prepare a data set for import in Mplus using SPSS. He explains how to specify different types of models in Mplus syntax and address typical caveats--for example, assessing measurement invariance in longitudinal SEMs. Coverage includes path and factor analytic models as well as mediational, longitudinal, multilevel, and latent class models. Specific programming tips an

  5. SP mountain data analysis

    Science.gov (United States)

    Rawson, R. F.; Hamilton, R. E.; Liskow, C. L.; Dias, A. R.; Jackson, P. L.

    1981-01-01

    An analysis of synthetic aperture radar data of SP Mountain was undertaken to demonstrate the use of digital image processing techniques to aid in geologic interpretation of SAR data. These data were collected with the ERIM X- and L-band airborne SAR using like- and cross-polarizations. The resulting signal films were used to produce computer compatible tapes, from which four-channel imagery was generated. Slant range-to-ground range and range-azimuth-scale corrections were made in order to facilitate image registration; intensity corrections were also made. Manual interpretation of the imagery showed that L-band represented the geology of the area better than X-band. Several differences between the various images were also noted. Further digital analysis of the corrected data was done for enhancement purposes. This analysis included application of an MSS differencing routine and development of a routine for removal of relief displacement. It was found that accurate registration of the SAR channels is critical to the effectiveness of the differencing routine. Use of the relief displacement algorithm on the SP Mountain data demonstrated the feasibility of the technique.

  6. Data Mining and Analysis

    Science.gov (United States)

    Samms, Kevin O.

    2015-01-01

    The Data Mining project seeks to bring the capability of data visualization to NASA anomaly and problem reporting systems for the purpose of improving data trending, evaluations, and analyses. Currently NASA systems are tailored to meet the specific needs of its organizations. This tailoring has led to a variety of nomenclatures and levels of annotation for procedures, parts, and anomalies, making it difficult to recognize the common causes of anomalies. Making significant observations and connecting these causes is difficult to impossible without a common way to view large data sets. In the first phase of the Data Mining project a portal was created to present a common visualization of normalized sensitive data to customers with the appropriate security access. The visualization tool itself was also developed and fine-tuned. In the second phase of the project we took on the difficult task of searching and analyzing the target data set for common causes between anomalies. In the final part of the second phase we have learned more about how much of the analysis work will be the job of the Data Mining team, how to perform that work, and how that work may be used by different customers in different ways. In this paper I detail how our perspective has changed after gaining more insight into how the customers wish to interact with the output and how that has changed the product.

  7. Data Analysis Facility (DAF)

    Science.gov (United States)

    1991-01-01

    NASA-Dryden's Data Analysis Facility (DAF) provides a variety of support services to the entire Dryden community. It provides state-of-the-art hardware and software systems, available to any Dryden engineer for pre- and post-flight data processing and analysis, plus supporting all archival and general computer use. The Flight Data Access System (FDAS) is one of the advanced computer systems in the DAF, providing for fast engineering unit conversion and archival processing of flight data delivered from the Western Aeronautical Test Range. Engineering unit conversion and archival formatting of flight data is performed by the DRACO program on a Sun 690MP and an E-5000 computer. Time history files produced by DRACO are then moved to a permanent magneto-optical archive, where they are network-accessible 24 hours a day, 7 days a week. Pertinent information about the individual flights is maintained in a relational (Sybase) database. The DAF also houses all general computer services, including; the Compute Server 1 and 2 (CS1 and CS2), the server for the World Wide Web, overall computer operations support, courier service, a CD-ROM Writer system, a Technical Support Center, the NASA Dryden Phone System (NDPS), and Hardware Maintenance.

  8. Hurricane Data Analysis Tool

    Science.gov (United States)

    Liu, Zhong; Ostrenga, Dana; Leptoukh, Gregory

    2011-01-01

    In order to facilitate Earth science data access, the NASA Goddard Earth Sciences Data Information Services Center (GES DISC) has developed a web prototype, the Hurricane Data Analysis Tool (HDAT; URL: http://disc.gsfc.nasa.gov/HDAT), to allow users to conduct online visualization and analysis of several remote sensing and model datasets for educational activities and studies of tropical cyclones and other weather phenomena. With a web browser and a few mouse clicks, users can have full access to terabytes of data and generate 2-D or time-series plots and animation without downloading any software and data. HDAT includes data from the NASA Tropical Rainfall Measuring Mission (TRMM), the NASA Quick Scatterometer (QuikSCAT) and NCEP Reanalysis, and the NCEP/CPC half-hourly, 4-km Global (60 N - 60 S) IR Dataset. The GES DISC archives TRMM data. The daily global rainfall product derived from the 3-hourly multi-satellite precipitation product (3B42 V6) is available in HDAT. The TRMM Microwave Imager (TMI) sea surface temperature from the Remote Sensing Systems is in HDAT as well. The NASA QuikSCAT ocean surface wind and the NCEP Reanalysis provide ocean surface and atmospheric conditions, respectively. The global merged IR product, also known as the NCEP/CPC half-hourly, 4-km Global (60 N - 60 S) IR Dataset, is one of the TRMM ancillary datasets. They are globally-merged pixel-resolution IR brightness temperature data (equivalent blackbody temperatures), merged from all available geostationary satellites (GOES-8/10, METEOSAT-7/5 & GMS). The GES DISC has collected over 10 years of the data beginning from February of 2000. This high temporal resolution (every 30 minutes) dataset not only provides additional background information to TRMM and other satellite missions, but also allows observing a wide range of meteorological phenomena from space, such as hurricanes, typhoons, tropical cyclones, mesoscale convective systems, etc. Basic functions include selection of area of

  9. Big climate data analysis

    Science.gov (United States)

    Mudelsee, Manfred

    2015-04-01

    The Big Data era has begun also in the climate sciences, not only in economics or molecular biology. We measure climate at increasing spatial resolution by means of satellites and look farther back in time at increasing temporal resolution by means of natural archives and proxy data. We use powerful supercomputers to run climate models. The model output of the calculations made for the IPCC's Fifth Assessment Report amounts to ~650 TB. The 'scientific evolution' of grid computing has started, and the 'scientific revolution' of quantum computing is being prepared. This will increase computing power, and data amount, by several orders of magnitude in the future. However, more data does not automatically mean more knowledge. We need statisticians, who are at the core of transforming data into knowledge. Statisticians notably also explore the limits of our knowledge (uncertainties, that is, confidence intervals and P-values). Mudelsee (2014, Climate Time Series Analysis: Classical Statistical and Bootstrap Methods. Second edition. Springer, Cham, xxxii + 454 pp.) coined the term 'optimal estimation'. Consider the hyperspace of climate estimation. It has many, but not infinite, dimensions. It consists of three subspaces: Monte Carlo design, method and measure. The Monte Carlo design describes the data generating process. The method subspace describes the estimation and confidence interval construction. The measure subspace describes how to detect the optimal estimation method for the Monte Carlo experiment. The envisaged large increase in computing power may bring the following idea of optimal climate estimation into existence. Given a data sample, some prior information (e.g. measurement standard errors) and a set of questions (parameters to be estimated), the first task is simple: perform an initial estimation on the basis of existing knowledge and experience with such types of estimation problems. The second task requires the computing power: explore the hyperspace to
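
    As a hedged aside on the bootstrap confidence intervals cited above (Mudelsee 2014): a minimal percentile-bootstrap sketch in Python. Real climate series are autocorrelated and call for block bootstrap variants; the sample here is simulated.

        # Basic percentile bootstrap CI for a mean (sketch; invented data).
        import numpy as np

        rng = np.random.default_rng(42)
        x = rng.normal(0.8, 1.0, size=60)      # stand-in "climate" sample

        boot_means = np.array([rng.choice(x, size=x.size, replace=True).mean()
                               for _ in range(2000)])
        lo, hi = np.percentile(boot_means, [2.5, 97.5])
        print(f"mean = {x.mean():.3f}, 95% bootstrap CI = [{lo:.3f}, {hi:.3f}]")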

  10. Beginning statistics with data analysis

    CERN Document Server

    Mosteller, Frederick; Rourke, Robert EK

    2013-01-01

    This introduction to the world of statistics covers exploratory data analysis, methods for collecting data, formal statistical inference, and techniques of regression and analysis of variance. 1983 edition.

  11. Uncertain data envelopment analysis

    CERN Document Server

    Wen, Meilin

    2014-01-01

    This book is intended to present the milestones in the progression of uncertain Data envelopment analysis (DEA). Chapter 1 gives some basic introduction to uncertain theories, including probability theory, credibility theory, uncertainty theory and chance theory. Chapter 2 presents a comprehensive review and discussion of basic DEA models. The stochastic DEA is introduced in Chapter 3, in which the inputs and outputs are assumed to be random variables. To obtain the probability distribution of a random variable, many samples are needed to apply the statistical inference approach. Chapter 4

  12. Descriptive data analysis.

    Science.gov (United States)

    Thompson, Cheryl Bagley

    2009-01-01

    This 13th article of the Basics of Research series is first in a short series on statistical analysis. These articles will discuss creating your statistical analysis plan, levels of measurement, descriptive statistics, probability theory, inferential statistics, and general considerations for interpretation of the results of a statistical analysis.

  13. Access Data Analysis Cookbook

    CERN Document Server

    Bluttman, Ken

    2008-01-01

    This book offers practical recipes to solve a variety of common problems that users have with extracting Access data and performing calculations on it. Whether you use Access 2007 or an earlier version, this book will teach you new methods to query data, different ways to move data in and out of Access, how to calculate answers to financial and investment issues, how to jump beyond SQL by manipulating data with VBA, and more.

  14. Geospatial Data Analysis Facility

    Data.gov (United States)

    Federal Laboratory Consortium — Geospatial application development, location-based services, spatial modeling, and spatial analysis are examples of the many research applications that this facility...

  15. Panel data analysis of cardiotocograph (CTG) data.

    Science.gov (United States)

    Horio, Hiroyuki; Kikuchi, Hitomi; Ikeda, Tomoaki

    2013-01-01

    Panel data analysis is a statistical method, widely used in econometrics, which deals with two-dimensional panel data collected over time and over individuals. The cardiotocograph (CTG), which monitors fetal heart rate (FHR) using Doppler ultrasound and uterine contraction by strain gage, is commonly used in the intrapartum treatment of pregnant women. Although the relationship between the FHR waveform pattern and outcomes such as umbilical blood gas data at delivery has long been analyzed, there exists no accumulation of FHR patterns from a large number of cases. Just as time-series economic fluctuations such as consumption trends have been studied in econometrics using panel data consisting of time-series and cross-sectional data, we tried to apply this method to CTG data. Panel data composed of symbolized segments of the FHR pattern can be easily handled, and a perinatologist can get the whole FHR pattern view from the microscopic level of time-series FHR data.
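
    A minimal sketch, assuming pandas, of the panel arrangement described above: symbolized FHR segments indexed by individual and by time. The subjects, segment labels, and symbols are invented and do not follow any real CTG coding scheme.

        # Long ("panel") format: individuals x time, one symbol per segment.
        import pandas as pd

        panel = pd.DataFrame({
            "subject": ["A", "A", "A", "B", "B", "B"],
            "segment": [1, 2, 3, 1, 2, 3],          # time dimension
            "fhr_symbol": ["baseline", "accel", "decel",
                           "baseline", "baseline", "accel"],
        }).set_index(["subject", "segment"])

        # Cross-sectional view (one time point) and time-series view (one subject)
        print(panel.xs(2, level="segment"))
        print(panel.xs("A", level="subject"))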

  16. R data analysis cookbook

    CERN Document Server

    Viswanathan, Viswa

    2015-01-01

    This book is ideal for those who are already exposed to R, but have not yet used it extensively for data analytics and are seeking to get up and running quickly for analytics tasks. This book will help people who aspire to enhance their skills in any of the following ways: perform advanced analyses and create informative and professional charts; become proficient in acquiring data from many sources; apply supervised and unsupervised data mining techniques; use R's features to present analyses professionally.

  17. Social Data Analysis Tool

    DEFF Research Database (Denmark)

    Hussain, Abid; Vatrapu, Ravi; Hardt, Daniel

    2014-01-01

    … analyze and visualize patterns of web activity. This volume profiles the latest techniques being employed by social scientists to collect and interpret data from some of the most popular social media applications, the political parties' own online activist spaces, and the wider system of hyperlinks … and analyze web data in the process of investigating substantive questions…

  18. Analysis of successive data sets

    NARCIS (Netherlands)

    Spreeuwers, Lieuwe Jan; Breeuwer, Marcel; Haselhoff, Eltjo Hans

    2008-01-01

    The invention relates to the analysis of successive data sets. A local intensity variation is formed from such successive data sets, that is, from data values in successive data sets at corresponding positions in each of the data sets. A region of interest is localized in the individual data sets on
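
    One hedged reading of the local intensity variation described above, as a NumPy sketch: per-position variation computed across a stack of successive data sets, thresholded to flag a candidate region of interest. The image stack is synthetic.

        # Per-position variation across successive data sets (sketch).
        import numpy as np

        rng = np.random.default_rng(7)
        stack = rng.normal(100, 2, size=(10, 64, 64))   # 10 successive frames
        # a patch whose intensity changes from frame to frame
        stack[:, 20:30, 40:50] += np.linspace(0, 15, 10)[:, None, None]

        variation = stack.std(axis=0)    # variation at corresponding positions
        roi = variation > variation.mean() + 3 * variation.std()
        print("candidate ROI pixels:", int(roi.sum()))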

  19. Analysis of successive data sets

    NARCIS (Netherlands)

    Spreeuwers, Lieuwe Jan; Breeuwer, Marcel; Haselhoff, Eltjo Hans

    2002-01-01

    The invention relates to the analysis of successive data sets. A local intensity variation is formed from such successive data sets, that is, from data values in successive data sets at corresponding positions in each of the data sets. A region of interest is localized in the individual data sets on

  20. Analysis of Panel Data

    Science.gov (United States)

    Hsiao, Cheng

    2003-02-01

    Panel data models have become increasingly popular among applied researchers due to their heightened capacity for capturing the complexity of human behavior, as compared to cross-sectional or time series data models. This second edition represents a substantial revision of the highly successful first edition (1986). Recent advances in panel data research are presented in an accessible manner and are carefully integrated with the older material. The thorough discussion of theory and the judicious use of empirical examples make this book useful to graduate students and advanced researchers in economics, business, sociology and political science.

  1. Data Analysis Challenges

    Science.gov (United States)

    2008-12-01

    must provide high system bandwidth and exascale data storage. A robust network interconnection is essential to achieve high bandwidth, low latency... [29] that a good interconnect topology is essential to the fault-tolerance of an exascale storage system. System architects are building ever-larger data... attractive in boosting system performance, component failures are now the rule rather than the exception. In an exascale storage system with thousands of

  2. Bitcoin data analysis

    OpenAIRE

    Raventós, Higinio; Anadón Rosinach, Marta

    2012-01-01

    This paper analyses 26 time series that measure daily data for different attributes of the Bitcoin network and studies how the virtual currency behaves compared to a basket of currencies containing the Brazilian Real (BRL), the Chinese Yuan (CNY), the Euro (EUR), and the Japanese Yen (JPY) against the US Dollar (USD). Basic statistics about the time series have been computed and stationarity has been studied in order to establish stylized facts, and meaningful cointegrations have been found among ...

  3. Bayesian data analysis for newcomers.

    Science.gov (United States)

    Kruschke, John K; Liddell, Torrin M

    2018-02-01

    This article explains the foundational concepts of Bayesian data analysis using virtually no mathematical notation. Bayesian ideas already match your intuitions from everyday reasoning and from traditional data analysis. Simple examples of Bayesian data analysis are presented that illustrate how the information delivered by a Bayesian analysis can be directly interpreted. Bayesian approaches to null-value assessment are discussed. The article clarifies misconceptions about Bayesian methods that newcomers might have acquired elsewhere. We discuss prior distributions and explain how they are not a liability but an important asset. We discuss the relation of Bayesian data analysis to Bayesian models of mind, and we briefly discuss what methodological problems Bayesian data analysis is not meant to solve. After you have read this article, you should have a clear sense of how Bayesian data analysis works and the sort of information it delivers, and why that information is so intuitive and useful for drawing conclusions from data.
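
    A minimal sketch of the kind of directly interpretable Bayesian output the article describes, using a conjugate beta-binomial model in Python with SciPy. The counts are invented, and this particular model is our choice for illustration, not necessarily the article's.

        # Beta-binomial sketch: k successes in n trials, uniform Beta(1, 1) prior.
        from scipy import stats

        k, n = 14, 20                                # invented counts
        posterior = stats.beta(1 + k, 1 + n - k)     # conjugate posterior

        print("posterior mean:", posterior.mean())
        print("95% credible interval:", posterior.interval(0.95))
        print("P(theta > 0.5):", 1 - posterior.cdf(0.5))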

  4. Auroral Data Analysis

    Science.gov (United States)

    1979-01-31

    but expands 'accordionlike.' (4) The height-integrated intensity ratio of the red (6300 A) to green (5577 A) emissions of atomic oxygen is a good... molecular ion: Analysis of two rocket experiments, Planet. Space Sci. 16, 737, 1968. Hays, P. B. and C. D. Anger, The influence of ground scattering on

  5. CADDIS Volume 4. Data Analysis: Exploratory Data Analysis

    Science.gov (United States)

    Intro to exploratory data analysis. Overview of variable distributions, scatter plots, correlation analysis, GIS datasets. Use of conditional probability to examine stressor levels and impairment. Exploring correlations among multiple stressors.
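
    A hedged sketch of the conditional-probability idea listed above: estimating P(impaired | stressor above a threshold) from paired site observations. All observations below are invented.

        # Conditional probability of impairment given stressor level (sketch).
        import numpy as np

        stressor = np.array([0.1, 0.4, 0.9, 1.3, 0.2, 1.8, 0.7, 1.1])
        impaired = np.array([0,   0,   1,   1,   0,   1,   0,   1], dtype=bool)

        for threshold in (0.5, 1.0):
            above = stressor > threshold
            p = impaired[above].mean()
            print(f"P(impaired | stressor > {threshold}) = {p:.2f} (n={above.sum()})")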

  6. Surface Temperature Data Analysis

    Science.gov (United States)

    Hansen, James; Ruedy, Reto

    2012-01-01

    Small global mean temperature changes may have significant to disastrous consequences for the Earth's climate if they persist for an extended period. Obtaining global means from local weather reports is hampered by the uneven spatial distribution of the reliably reporting weather stations. Methods had to be developed that minimize as far as possible the impact of that situation. This software is a method of combining temperature data of individual stations to obtain a global mean trend, overcoming/estimating the uncertainty introduced by the spatial and temporal gaps in the available data. Useful estimates were obtained by the introduction of a special grid, subdividing the Earth's surface into 8,000 equal-area boxes, using the existing data to create virtual stations at the center of each of these boxes, and combining temperature anomalies (after assessing the radius of high correlation) rather than temperatures.
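
    Not the GISS software itself: a minimal NumPy sketch, under simplifying assumptions, of the scheme described above (station series converted to anomalies, averaged within equal-area boxes, then across boxes). The station values and box assignments are invented.

        # Stations -> anomalies -> equal-area box means -> global mean (sketch).
        import numpy as np

        # temps[station, year]; NaN marks gaps in the record
        temps = np.array([[14.1, 14.3, np.nan, 14.6],
                          [21.0, 21.2, 21.1, 21.5],
                          [ 3.2,  3.1,  3.4,  3.6]])
        box_of_station = np.array([0, 0, 1])    # hypothetical equal-area boxes

        # Anomalies: subtract each station's own mean, so stations with very
        # different climates can be combined (the key step in the text above).
        anom = temps - np.nanmean(temps, axis=1, keepdims=True)

        n_boxes = box_of_station.max() + 1
        box_means = np.vstack([np.nanmean(anom[box_of_station == b], axis=0)
                               for b in range(n_boxes)])

        # Equal-area boxes, so a plain average approximates the global mean.
        print("global mean anomaly per year:",
              np.round(np.nanmean(box_means, axis=0), 3))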

  7. Multivariate data analysis

    Digital Repository Service at National Institute of Oceanography (India)

    Fernandes, A.A.; Antony, M.K.; Somayajulu, Y.K.; Sarma, Y.V.B.; Almeida, A.M.; Mahadevan, R.

    … Head, Applied Statistics Unit, Indian Statistical Institute, Calcutta, for going through the section on Canonical Correlation Analysis and offering his comments on the same. This report has been prepared using 'Latex' on a 'Linux' platform, viz., the personal computer Kapila. I wish to thank Mr. Dattaram Shivji for installing the 'Latex' and 'GMT' packages on the personal computer. The style file used for preparing this report has been hacked by me from a Goa University Ph.D. style file prepared by Dr. D...

  8. Wave Data Analysis

    DEFF Research Database (Denmark)

    Alikhani, Amir; Frigaard, Peter; Burcharth, Hans F.

    1998-01-01

    The data collected over the course of the experiment must be analysed and converted into a form suitable for its intended use. Types of analyses range from simple to sophisticated, depending on the particular experiment and the needs of the researcher. In this study three main parts of irregular wa...

  9. Analysis of Juggling Data

    DEFF Research Database (Denmark)

    Tolver, Anders; Sørensen, Helle; Muller, Martha

    2014-01-01

    Abstract We illustrate how physical constraints of a biomechanical system can be taken into account when registering functional data from juggling trials. We define an idealized model of juggling, based on a periodic joint movement in a low-dimensional space and a periodic position vector (from...

  10. Panel data analysis using EViews

    CERN Document Server

    Agung, I Gusti Ngurah

    2013-01-01

    A comprehensive and accessible guide to panel data analysis using EViews software This book explores the use of EViews software in creating panel data analysis using appropriate empirical models and real datasets. Guidance is given on developing alternative descriptive statistical summaries for evaluation and providing policy analysis based on pooled panel data. Various alternative models based on panel data are explored, including univariate general linear models, fixed effect models and causal models, and guidance on the advantages and disadvantages of each one is given. Panel Data Analysis

  11. Bow shock data analysis

    Science.gov (United States)

    Zipf, Edward C.; Erdman, Peeter W.

    1994-08-01

    The University of Pittsburgh Space Physics Group, in collaboration with the Army Research Office (ARO) modeling team, has completed a systematic organization of the shock and plume spectral data and the electron temperature and density measurements obtained during the Bow Shock I and II rocket flights, which have been submitted to the AEDC Data Center; has verified the presence of CO Cameron band emission during the Antares engine burn and for an extended period of time in the post-burn plume; and has adapted 3-D radiation entrapment codes, developed by the University of Pittsburgh to study aurora and other atmospheric phenomena that involve significant spatial effects, to investigate the vacuum ultraviolet (VUV) and extreme ultraviolet (EUV) envelope surrounding the re-entry that creates an extensive plasma cloud by photoionization.

  12. Data-variant kernel analysis

    CERN Document Server

    Motai, Yuichi

    2015-01-01

    Describes and discusses the variants of kernel analysis methods for data types that have been intensely studied in recent years This book covers kernel analysis topics ranging from the fundamental theory of kernel functions to its applications. The book surveys the current status, popular trends, and developments in kernel analysis studies. The author discusses multiple kernel learning algorithms and how to choose the appropriate kernels during the learning phase. Data-Variant Kernel Analysis is a new pattern analysis framework for different types of data configurations. The chapters include

  13. The ACIGA data analysis programme

    International Nuclear Information System (INIS)

    Scott, Susan M; Searle, Antony C; Cusack, Benedict J; McClelland, David E

    2004-01-01

    The data analysis programme of the Australian Consortium for Interferometric Gravitational Astronomy (ACIGA) was set up in 1998 by Scott to complement the then existing ACIGA programmes working on suspension systems, lasers and optics and detector configurations. The ACIGA data analysis programme continues to contribute significantly in the field; we present an overview of our activities

  14. Computerized ECT data analysis system

    International Nuclear Information System (INIS)

    Miyake, Y.; Fukui, S.; Iwahashi, Y.; Matsumoto, M.; Koyama, K.

    1988-01-01

    For the analytical method of the eddy current testing (ECT) of steam generator tubes in nuclear power plants, the authors have developed a computerized ECT data analysis system using a large-scale computer with a high-resolution color graphic display. This system can store acquired ECT data for up to 15 steam generators, and ECT data can be analyzed immediately on the monitor in dialogue with the computer. Analyzed results of ECT data are stored and registered in the database. This system enables an analyst to perform sorting and collecting of data under various conditions and obtain the results automatically, and also to plan tube repair work. This system has completed its test run, and has been used for data analysis at the annual inspection of domestic plants. This paper describes an outline, features and examples of the computerized eddy current data analysis system for steam generator tubes in PWR nuclear power plants

  15. Excel data analysis for dummies

    CERN Document Server

    Nelson, Stephen L

    2014-01-01

    Harness the power of Excel to discover what your numbers are hiding Excel Data Analysis For Dummies, 2nd Edition is the ultimate guide to getting the most out of your data. Veteran Dummies author Stephen L. Nelson guides you through the basic and not-so-basic features of Excel to help you discover the gems hidden in your rough data. From input, to analysis, to visualization, the book walks you through the steps that lead to superior data analysis. Excel is the number-one spreadsheet application, with ever-expanding capabilities. If you're only using it to balance the books, you're missing out

  16. Correspondence analysis of longitudinal data

    NARCIS (Netherlands)

    Van der Heijden, P.G.M.|info:eu-repo/dai/nl/073087998

    2005-01-01

    Correspondence analysis is an exploratory tool for the analysis of associations between categorical variables, the results of which may be displayed graphically. For longitudinal data with two time points, an analysis of the transition matrix (showing the relative frequencies for pairs of
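
    A minimal NumPy sketch of correspondence analysis applied to a (here: transition) contingency table, via the SVD of standardized residuals. The counts are invented, and the coordinate convention shown is one common choice.

        # Correspondence analysis of a small transition table (sketch).
        import numpy as np

        N = np.array([[40, 10,  5],    # transitions between 3 categories
                      [ 8, 30, 12],    # at two time points (invented)
                      [ 4, 14, 25]], dtype=float)
        P = N / N.sum()
        r, c = P.sum(axis=1), P.sum(axis=0)
        S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))  # std. residuals

        U, s, Vt = np.linalg.svd(S)
        row_coords = (U[:, :2] * s[:2]) / np.sqrt(r)[:, None]  # principal coords
        print("inertia per axis:", np.round(s**2, 4))
        print("row coordinates:\n", np.round(row_coords, 3))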

  17. Virtual data in CMS analysis

    International Nuclear Information System (INIS)

    Arbree, A.

    2003-01-01

    The use of virtual data for enhancing the collaboration between large groups of scientists is explored in several ways: by defining 'virtual' parameter spaces which can be searched and shared in an organized way by a collaboration of scientists in the course of their analysis; by providing a mechanism to log the provenance of results and the ability to trace them back to the various stages in the analysis of real or simulated data; by creating 'check points' in the course of an analysis to permit collaborators to explore their own analysis branches by refining selections, improving the signal to background ratio, varying the estimation of parameters, etc.; by facilitating the audit of an analysis and the reproduction of its results by a different group, or in a peer review context. We describe a prototype for the analysis of data from the CMS experiment based on the virtual data system Chimera and the object-oriented data analysis framework ROOT. The Chimera system is used to chain together several steps in the analysis process including the Monte Carlo generation of data, the simulation of detector response, the reconstruction of physics objects and their subsequent analysis, histogramming and visualization using the ROOT framework
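
    Not the Chimera system: a toy Python sketch of the virtual-data idea described above, in which every derived product records the transformation and inputs that produced it, so results can be traced back and re-derived on demand. All names are hypothetical.

        # Toy provenance registry: product name -> (function, input names).
        provenance = {}

        def derive(name, func, *inputs):
            provenance[name] = (func, inputs)

        def materialize(name, store):
            """Recompute a product on demand from its recorded derivation."""
            if name in store:
                return store[name]
            func, inputs = provenance[name]
            value = func(*(materialize(i, store) for i in inputs))
            store[name] = value
            return value

        # Hypothetical three-step chain: generate -> reconstruct -> histogram
        derive("raw", lambda: list(range(10)))
        derive("reco", lambda raw: [x * 2 for x in raw], "raw")
        derive("hist", lambda reco: {"n": len(reco), "sum": sum(reco)}, "reco")
        print(materialize("hist", {}))   # traceable back through "reco", "raw"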

  18. Statistical analysis of environmental data

    International Nuclear Information System (INIS)

    Beauchamp, J.J.; Bowman, K.O.; Miller, F.L. Jr.

    1975-10-01

    This report summarizes the analyses of data obtained by the Radiological Hygiene Branch of the Tennessee Valley Authority from samples taken around the Browns Ferry Nuclear Plant located in Northern Alabama. The data collection was begun in 1968 and a wide variety of types of samples have been gathered on a regular basis. The statistical analysis of environmental data involving very low-levels of radioactivity is discussed. Applications of computer calculations for data processing are described

  19. Functional and shape data analysis

    CERN Document Server

    Srivastava, Anuj

    2016-01-01

    This textbook for courses on functional data analysis and shape data analysis describes how to define, compare, and mathematically represent shapes, with a focus on statistical modeling and inference. It is aimed at graduate students in statistics, engineering, applied mathematics, neuroscience, biology, bioinformatics, and other related areas. The interdisciplinary nature of the broad range of ideas covered—from introductory theory to algorithmic implementations and some statistical case studies—is meant to familiarize graduate students with an array of tools that are relevant in developing computational solutions for shape and related analyses. These tools, gleaned from geometry, algebra, statistics, and computational science, are traditionally scattered across different courses, departments, and disciplines; Functional and Shape Data Analysis offers a unified, comprehensive solution by integrating the registration problem into shape analysis, better preparing graduate students for handling fu...

  20. Exact analysis of discrete data

    CERN Document Server

    Hirji, Karim F

    2005-01-01

    Researchers in fields ranging from biology and medicine to the social sciences, law, and economics regularly encounter variables that are discrete or categorical in nature. While there is no dearth of books on the analysis and interpretation of such data, these generally focus on large sample methods. When sample sizes are not large or the data are otherwise sparse, exact methods--methods not based on asymptotic theory--are more accurate and therefore preferable. This book introduces the statistical theory, analysis methods, and computation techniques for exact analysis of discrete data. After reviewing the relevant discrete distributions, the author develops the exact methods from the ground up in a conceptually integrated manner. The topics covered range from univariate discrete data analysis, a single and several 2 x 2 tables, a single and several 2 x K tables, incidence density and inverse sampling designs, unmatched and matched case-control studies, paired binary and trinomial response models, and Markov...
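
    As a hedged example of an exact method for a single 2 x 2 table, one of the topics listed above: Fisher's exact test via SciPy avoids asymptotic approximations, which matters when the data are sparse. The counts are invented.

        # Fisher's exact test on a small (invented) 2 x 2 table.
        from scipy.stats import fisher_exact

        table = [[3, 1],     # e.g. exposed: 3 cases, 1 control
                 [1, 9]]     #      unexposed: 1 case, 9 controls

        odds_ratio, p_value = fisher_exact(table, alternative="two-sided")
        print(f"odds ratio = {odds_ratio:.2f}, exact p = {p_value:.4f}")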

  1. Collective Analysis of Qualitative Data

    DEFF Research Database (Denmark)

    Simonsen, Jesper; Friberg, Karin

    2014-01-01

    What. Many students and practitioners do not know how to systematically process qualitative data once it is gathered—at least not as a collective effort. This chapter presents two workshop techniques, affinity diagramming and diagnostic mapping, that support collective analysis of large amounts of qualitative data. Affinity diagramming is used to make collective analysis and interpretations of qualitative data to identify core problems that need to be addressed in the design process. Diagnostic mapping supports collective interpretation and description of these problems and how to intervene in them. We … In particular, collective analysis can be used to identify, understand, and act on complex design problems that emerge, for example, after the introduction of new technologies. Such problems might be hard to clarify, and the basis for the analysis often involves large amounts of unstructured qualitative data...

  2. VESUVIO Data Analysis Goes MANTID

    International Nuclear Information System (INIS)

    Jackson, S; Krzystyniak, M; Seel, A G; Gigg, M; Richards, S E; Fernandez-Alonso, F

    2014-01-01

    This paper describes ongoing efforts to implement the reduction and analysis of neutron Compton scattering data within the MANTID framework. Recently, extensive work has been carried out to integrate the bespoke data reduction and analysis routines written for VESUVIO with the MANTID framework. While the programs described in this document are designed to replicate the functionality of the Fortran and Genie routines already in use, most of them have been written from scratch and are not based on the original code base

  3. VESUVIO Data Analysis Goes MANTID

    Science.gov (United States)

    Jackson, S.; Krzystyniak, M.; Seel, A. G.; Gigg, M.; Richards, S. E.; Fernandez-Alonso, F.

    2014-12-01

    This paper describes ongoing efforts to implement the reduction and analysis of neutron Compton scattering data within the MANTID framework. Recently, extensive work has been carried out to integrate the bespoke data reduction and analysis routines written for VESUVIO with the MANTID framework. While the programs described in this document are designed to replicate the functionality of the Fortran and Genie routines already in use, most of them have been written from scratch and are not based on the original code base.

  4. Virtual Data in CMS Analysis

    CERN Document Server

    Arbree, A; Bourilkov, D; Cavanaugh, R J; Graham, G; Rodríguez, J; Wilde, M; Zhao, Y

    2003-01-01

    The use of virtual data for enhancing the collaboration between large groups of scientists is explored in several ways: by defining 'virtual' parameter spaces which can be searched and shared in an organized way by a collaboration of scientists in the course of their analysis; by providing a mechanism to log the provenance of results and the ability to trace them back to the various stages in the analysis of real or simulated data; by creating 'check points' in the course of an analysis to permit collaborators to explore their own analysis branches by refining selections, improving the signal to background ratio, varying the estimation of parameters, etc.; by facilitating the audit of an analysis and the reproduction of its results by a different group, or in a peer review context. We describe a prototype for the analysis of data from the CMS experiment based on the virtual data system Chimera and the object-oriented data analysis framework ROOT. The Chimera system is used to chain together several s...

  5. Factor analysis of multivariate data

    Digital Repository Service at National Institute of Oceanography (India)

    Fernandes, A.A.; Mahadevan, R.

    A brief introduction to factor analysis is presented. A FORTRAN program, which can perform the Q-mode and R-mode factor analysis and the singular value decomposition of a given data matrix is presented in Appendix B. This computer program uses...

  6. Analysis of irradiation disordering data

    Energy Technology Data Exchange (ETDEWEB)

    Schwartz, D L [Jet Propulsion Lab., Pasadena, CA (USA); Schwartz, D M

    1978-08-01

    The analysis of irradiation disordering data in ordered Ni3Mn is discussed. An analytical expression relating observed irradiation-induced magnetic changes in this material to the number of alternating site <110> replacements is derived. This expression is then employed to analyze previous experimental results. This analysis gives results which appear to be consistent with a previous Monte Carlo data analysis and indicates that the expected number of alternating site <110> replacements is 66.4 per 450 eV recoil.

  7. European Conference on Data Analysis

    CERN Document Server

    Krolak-Schwerdt, Sabine; Böhmer, Matthias; Data Science, Learning by Latent Structures, and Knowledge Discovery; ECDA 2013

    2015-01-01

    This volume comprises papers dedicated to data science and the extraction of knowledge from many types of data: structural, quantitative, or statistical approaches for the analysis of data; advances in classification, clustering, and pattern recognition methods; strategies for modeling complex data and mining large data sets; applications of advanced methods in specific domains of practice. The contributions offer interesting applications to various disciplines such as psychology, biology, medical and health sciences; economics, marketing, banking, and finance; engineering; geography and geology;  archeology, sociology, educational sciences, linguistics, and musicology; library science. The book contains the selected and peer-reviewed papers presented during the European Conference on Data Analysis (ECDA 2013) which was jointly held by the German Classification Society (GfKl) and the French-speaking Classification Society (SFC) in July 2013 at the University of Luxembourg.

  8. Analysis of Ordinal Categorical Data

    CERN Document Server

    Agresti, Alan

    2012-01-01

    Statistical science's first coordinated manual of methods for analyzing ordered categorical data, now fully revised and updated, continues to present applications and case studies in fields as diverse as sociology, public health, ecology, marketing, and pharmacy. Analysis of Ordinal Categorical Data, Second Edition provides an introduction to basic descriptive and inferential methods for categorical data, giving thorough coverage of new developments and recent methods. Special emphasis is placed on interpretation and application of methods including an integrated comparison of the available st

  9. Wavelets in functional data analysis

    CERN Document Server

    Morettin, Pedro A; Vidakovic, Brani

    2017-01-01

    Wavelet-based procedures are key in many areas of statistics, applied mathematics, engineering, and science. This book presents wavelets in functional data analysis, offering a glimpse of problems in which they can be applied, including tumor analysis, functional magnetic resonance and meteorological data. Starting with the Haar wavelet, the authors explore myriad families of wavelets and how they can be used. High-dimensional data visualization (using Andrews' plots), wavelet shrinkage (a simple, yet powerful, procedure for nonparametric models) and a selection of estimation and testing techniques (including a discussion on Stein’s Paradox) make this a highly valuable resource for graduate students and experienced researchers alike.
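
    A minimal sketch of the wavelet shrinkage mentioned above, assuming the PyWavelets package: decompose, soft-threshold the detail coefficients with the universal threshold, reconstruct. The signal and parameter choices are illustrative only.

        # Wavelet shrinkage (soft thresholding) sketch using PyWavelets.
        import numpy as np
        import pywt

        rng = np.random.default_rng(3)
        t = np.linspace(0, 1, 1024)
        signal = np.sin(6 * np.pi * t) + (t > 0.5)   # smooth part + a jump
        noisy = signal + rng.normal(0, 0.2, t.size)

        coeffs = pywt.wavedec(noisy, "db4", level=5)
        sigma = np.median(np.abs(coeffs[-1])) / 0.6745      # noise estimate
        thr = sigma * np.sqrt(2 * np.log(noisy.size))       # universal threshold
        denoised_coeffs = [coeffs[0]] + [pywt.threshold(c, thr, mode="soft")
                                         for c in coeffs[1:]]
        denoised = pywt.waverec(denoised_coeffs, "db4")[:noisy.size]
        print("residual RMS:", np.sqrt(np.mean((denoised - signal) ** 2)))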

  10. Analysis of Hydrologic Properties Data

    International Nuclear Information System (INIS)

    Liu, H.H.; Ahlers, C.F.

    2001-01-01

    The purpose of this Analysis/Model Report (AMR) is to describe the methods used to determine hydrologic properties based on the available field data from the unsaturated zone at Yucca Mountain, Nevada. This is in accordance with the AMR Development Plan (DP) for U0090 Analysis of Hydrologic Properties Data (CRWMS M&O 1999c). Fracture and matrix properties are developed by compiling and analyzing available survey data from the Exploratory Studies Facility (ESF), Cross Drift of Enhanced Characterization of Repository Block (ECRB), and/or boreholes; air injection testing data from surface boreholes and from boreholes in ESF; in-situ measurements of water potential; and data from laboratory testing of core samples

  11. On vehicular traffic data analysis

    Energy Technology Data Exchange (ETDEWEB)

    Brics, Martins; Mahnke, Reinhard [Institute of Physics, Rostock University (Germany)

    2011-07-01

    This contribution consists of an analysis of empirical vehicular traffic flow data. The main focus lies on the Next Generation Simulation (NGSIM) data. The first findings show that there are artificial structures within the data due to monitoring errors as well as smoothing of the position measurement data. As a result, speed data show discretisation in steps of 5 feet per second. The aim of this investigation is to construct microscopic traffic flow models which are in agreement with the analysed empirical data. The ongoing work follows the subject of research summarized by Christof Liebe in his PhD thesis entitled 'Physics of traffic flow: Empirical data and dynamical models' (Rostock, 2010).

  12. Guide on reflectivity data analysis

    International Nuclear Information System (INIS)

    Lee, Jeong Soo; Ku, Ja Seung; Seong, Baek Seok; Lee, Chang Hee; Hong, Kwang Pyo; Choi, Byung Hoon

    2004-09-01

    This report describes the reduction and fitting of neutron reflectivity data with the NIST programs REFLRED and REFLFIT. Because the details of data reduction, such as background (BKG) subtraction, footprint correction, and data normalization, are described, the report will be useful to users with no experience in this field. In addition, the reflectivity and background of a d-PS thin film were measured with the HANARO neutron reflectometer, and the structure of the d-PS thin film was analyzed with REFLRED and REFLFIT. Because the structure of the thin film, namely its thickness, roughness and SLD, was obtained in this work, the feasibility of data analysis with REFLRED and REFLFIT was demonstrated

  13. Project MOHAVE data analysis plan

    International Nuclear Information System (INIS)

    Watson, J.G.; Green, M.; Hoffer, T.E.; Lawson, D.R.; Pitchford, M.; Eatough, D.J.; Farber, R.J.; Malm, W.C.; McDade, C.E.

    1993-01-01

    Project MOHAVE is intended to develop ambient and source emissions data for use with source models, receptor models, and data analysis methods in order to explain the nature and causes of visibility degradation in the Grand Canyon. Approximately 50% of the modeling and data analysis effort will be directed toward understanding the contributions from the Mohave Power Project to haze in the Grand Canyon and other nearby Class I areas; the remaining resources will be used to understand the contribution from other sources. The major goals of Project MOHAVE and data analysis are: to evaluate the measurements for applicability to modeling and data analysis activities; to describe the visibility, air quality and meteorology during the field study period and to determine the degree to which these measurements represent typical visibility events at the Grand Canyon; to further develop conceptual models of physical and chemical processes which affect visibility impairment at the Grand Canyon; to estimate the contributions from different emission sources to visibility impairment at the Grand Canyon, and to quantitatively evaluate the uncertainties of those estimates; and to reconcile different scientific interpretations of the same data and to present this reconciliation to decision-makers. Several different approaches will be applied. Each approach will involve explicit examination of measurement uncertainties, compliance with implicit and explicit assumptions, and representativeness of the measurements. Scientific disagreements will be sought, expressed, explained, quantified, and presented. Data which can be used to verify methods will be withheld for independent evaluation of the validity of those methods. All assumptions will be stated and evaluated against reality. Data analysis results not supporting hypotheses will be presented alongside those supporting the hypotheses. Uncertainty statements will be quantitative and consistent with decision-making needs

  14. AGR-1 Thermocouple Data Analysis

    International Nuclear Information System (INIS)

    Einerson, Jeff

    2012-01-01

    This report documents an effort to analyze measured and simulated data obtained in the Advanced Gas Reactor (AGR) fuel irradiation test program conducted in the INL's Advanced Test Reactor (ATR) to support the Next Generation Nuclear Plant (NGNP) R and D program. The work follows up on a previous study (Pham and Einerson, 2010), in which statistical analysis methods were applied for AGR-1 thermocouple data qualification. The present work exercises the idea that, while recognizing uncertainties inherent in physics and thermal simulations of the AGR-1 test, results of the numerical simulations can be used in combination with the statistical analysis methods to further improve qualification of measured data. Additionally, the combined analysis of measured and simulation data can generate insights about simulation model uncertainty that can be useful for model improvement. This report also describes an experimental control procedure to maintain fuel target temperature in the future AGR tests using regression relationships that include simulation results. The report is organized into four chapters. Chapter 1 introduces the AGR Fuel Development and Qualification program, the AGR-1 test configuration and test procedure, an overview of AGR-1 measured data, and an overview of physics and thermal simulation, including modeling assumptions and uncertainties. A brief summary of the statistical analysis methods developed in (Pham and Einerson 2010) for AGR-1 measured data qualification within the NGNP Data Management and Analysis System (NDMAS) is also included for completeness. Chapters 2-3 describe and discuss cases in which the combined use of experimental and simulation data is realized. A set of issues associated with measurement and modeling uncertainties resulting from the combined analysis is identified. This includes demonstration that such a combined analysis led to important insights for reducing uncertainty in presentation of AGR-1 measured data (Chapter 2) and interpretation of

  15. AGR-1 Thermocouple Data Analysis

    Energy Technology Data Exchange (ETDEWEB)

    Jeff Einerson

    2012-05-01

    This report documents an effort to analyze measured and simulated data obtained in the Advanced Gas Reactor (AGR) fuel irradiation test program conducted in the INL's Advanced Test Reactor (ATR) to support the Next Generation Nuclear Plant (NGNP) R&D program. The work follows up on a previous study (Pham and Einerson, 2010), in which statistical analysis methods were applied for AGR-1 thermocouple data qualification. The present work exercises the idea that, while recognizing uncertainties inherent in physics and thermal simulations of the AGR-1 test, results of the numerical simulations can be used in combination with the statistical analysis methods to further improve qualification of measured data. Additionally, the combined analysis of measured and simulation data can generate insights about simulation model uncertainty that can be useful for model improvement. This report also describes an experimental control procedure to maintain fuel target temperature in the future AGR tests using regression relationships that include simulation results. The report is organized into four chapters. Chapter 1 introduces the AGR Fuel Development and Qualification program, the AGR-1 test configuration and test procedure, an overview of AGR-1 measured data, and an overview of physics and thermal simulation, including modeling assumptions and uncertainties. A brief summary of the statistical analysis methods developed in (Pham and Einerson 2010) for AGR-1 measured data qualification within the NGNP Data Management and Analysis System (NDMAS) is also included for completeness. Chapters 2-3 describe and discuss cases in which the combined use of experimental and simulation data is realized. A set of issues associated with measurement and modeling uncertainties resulting from the combined analysis is identified. This includes demonstration that such a combined analysis led to important insights for reducing uncertainty in presentation of AGR-1 measured data (Chapter 2) and interpretation of

  16. Data fusion qualitative sensitivity analysis

    International Nuclear Information System (INIS)

    Clayton, E.A.; Lewis, R.E.

    1995-09-01

    Pacific Northwest Laboratory was tasked with testing, debugging, and refining the Hanford Site data fusion workstation (DFW), with the assistance of Coleman Research Corporation (CRC), before delivering the DFW to the environmental restoration client at the Hanford Site. Data fusion is the mathematical combination (or fusion) of disparate data sets into a single interpretation. The data fusion software used in this study was developed by CRC and was initially demonstrated on a data set collected at the Hanford Site where three types of data were combined: (1) seismic reflection, (2) seismic refraction, and (3) depth to geologic horizons. The fused results included a contour map of the top of a low-permeability horizon. This report discusses the results of a sensitivity analysis of the data fusion software to variations in its input parameters. The software has a large number of input parameters that can be varied by the user and that influence the results of data fusion. Many of these parameters are defined as part of the earth model. The earth model is a series of 3-dimensional polynomials with horizontal spatial coordinates as the independent variables and either subsurface layer depth or values of various properties within these layers (e.g., compression wave velocity, resistivity) as the dependent variables

  17. R data analysis without programming

    CERN Document Server

    Gerbing, David W

    2013-01-01

    This book prepares readers to analyze data and interpret statistical results using R more quickly than other texts. R is a challenging program to learn because code must be created to get started. To alleviate that challenge, Professor Gerbing developed lessR. The lessR extensions remove the need to program. By introducing R through lessR, readers learn how to organize data for analysis, read the data into R, and produce output without performing numerous functions and programming exercises first. With lessR, readers can select the necessary procedure and change the relevant variables without pro

  18. Data analysis facility at LAMPF

    International Nuclear Information System (INIS)

    Perry, D.G.; Amann, J.F.; Butler, H.S.; Hoffman, C.J.; Mischke, R.E.; Shera, E.B.; Thiessen, H.A.

    1977-11-01

    This report documents the discussions and conclusions of a study held in July 1977 to develop the requirements for a data analysis facility to support the experimental program in medium-energy physics at the Clinton P. Anderson Meson Physics Facility (LAMPF). 2 tables

  19. Handbook on data envelopment analysis

    CERN Document Server

    Cooper, William W; Zhu, Joe

    2011-01-01

    Focusing on extensively used Data Envelopment Analysis topics, this volume aims to both describe the state of the field and extend the frontier of DEA research. New chapters include DEA models for DMUs, network DEA, models for supply chain operations and applications, and new developments.

  20. Lectures on categorical data analysis

    CERN Document Server

    Rudas, Tamás

    2018-01-01

    This book offers a relatively self-contained presentation of the fundamental results in categorical data analysis, which plays a central role among the statistical techniques applied in the social, political and behavioral sciences, as well as in marketing and medical and biological research. The methods applied are mainly aimed at understanding the structure of associations among variables and the effects of other variables on these interactions. A great advantage of studying categorical data analysis is that many concepts in statistics become transparent when discussed in a categorical data context, and, in many places, the book takes this opportunity to comment on general principles and methods in statistics, addressing not only the “how” but also the “why.” Assuming minimal background in calculus, linear algebra, probability theory and statistics, the book is designed to be used in upper-undergraduate and graduate-level courses in the field and in more general statistical methodology courses, as w...

  1. CMS Data Analysis School Model

    CERN Document Server

    Malik, Sudhir; Cavanaugh, R; Bloom, K; Chan, Kai-Feng; D'Hondt, J; Klima, B; Narain, M; Palla, F; Rolandi, G; Schörner-Sadenius, T

    2014-01-01

    To impart hands-on training in physics analysis, the CMS experiment initiated the concept of the CMS Data Analysis School (CMSDAS). It was born three years ago at the LPC (LHC Physics Center) at Fermilab and is based on earlier workshops held at the LPC and at the CLEO experiment. As CMS transitioned from construction to data taking, the nature of the earlier training evolved to include more analysis tools, software tutorials and physics analysis. This effort, epitomized as CMSDAS, has proven to be key in enabling new and young physicists to jump-start and contribute to the physics goals of CMS by looking for new physics with the collision data. With over 400 physicists trained in six CMSDAS around the globe, CMS is trying to engage the collaboration's discovery potential and maximize the physics output. As a bigger goal, CMS is striving to nurture and increase engagement of the myriad talents of CMS in the development of physics, service, upgrade, education of those new to CMS and the caree...

  2. Challenges of Big Data Analysis.

    Science.gov (United States)

    Fan, Jianqing; Han, Fang; Liu, Han

    2014-06-01

    Big Data bring new opportunities to modern society and challenges to data scientists. On one hand, Big Data hold great promises for discovering subtle population patterns and heterogeneities that are not possible with small-scale data. On the other hand, the massive sample size and high dimensionality of Big Data introduce unique computational and statistical challenges, including scalability and storage bottlenecks, noise accumulation, spurious correlation, incidental endogeneity, and measurement errors. These challenges are distinctive and require new computational and statistical paradigms. This article gives an overview of the salient features of Big Data and how these features impact paradigm change in statistical and computational methods as well as computing architectures. We also provide various new perspectives on Big Data analysis and computation. In particular, we emphasize the viability of the sparsest solution in high-confidence sets and point out that exogenous assumptions in most statistical methods for Big Data cannot be validated due to incidental endogeneity. They can lead to wrong statistical inferences and consequently wrong scientific conclusions.
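
    The spurious-correlation effect described above can be demonstrated in a few lines: with the sample size fixed, the maximum absolute correlation between an outcome and a growing set of independent noise predictors keeps rising. A minimal sketch (not taken from the article):

    ```python
    import numpy as np

    rng = np.random.default_rng(1)
    n = 100
    y = rng.standard_normal(n)
    yc = (y - y.mean()) / y.std()

    for p in (10, 1_000, 100_000):
        X = rng.standard_normal((n, p))          # pure noise predictors
        Xc = (X - X.mean(0)) / X.std(0)
        corr = Xc.T @ yc / n                     # sample correlations with y
        print(p, np.abs(corr).max())
    # The maximum |correlation| climbs steadily with p even though every
    # predictor is independent of y -- the "spurious correlation" problem.
    ```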

  3. Astronomical Image and Data Analysis

    CERN Document Server

    Starck, J.-L

    2006-01-01

    With information and scale as central themes, this comprehensive survey explains how to handle real problems in astronomical data analysis using a modern arsenal of powerful techniques. It treats those innovative methods of image, signal, and data processing that are proving to be both effective and widely relevant. The authors are leaders in this rapidly developing field and draw upon decades of experience. They have been playing leading roles in international projects such as the Virtual Observatory and the Grid. The book addresses not only students and professional astronomers and astrophysicists, but also serious amateur astronomers and specialists in earth observation, medical imaging, and data mining. The coverage includes chapters or appendices on: detection and filtering; image compression; multichannel, multiscale, and catalog data analytical methods; wavelet transforms, Picard iteration, and software tools. This second edition of Starck and Murtagh's highly appreciated reference again deals with to...

  4. GET electronics samples data analysis

    International Nuclear Information System (INIS)

    Giovinazzo, J.; Goigoux, T.; Anvar, S.; Baron, P.; Blank, B.; Delagnes, E.; Grinyer, G.F.; Pancin, J.; Pedroza, J.L.; Pibernat, J.; Pollacco, E.; Rebii, A.

    2016-01-01

    The General Electronics for TPCs (GET) has been developed to equip a generation of time projection chamber detectors for nuclear physics, and may also be used for a wider range of detector types. The goal of this paper is to propose a first set of analysis procedures to be applied to raw data samples from the GET system, in order to correct for systematic effects observed in test measurements. We also present a method to estimate the response function of the GET system channels. The response function is required in analyses where the input signal needs to be reconstructed, in terms of time distribution, from the registered output samples.

  5. Analysis of Runway Incursion Data

    Science.gov (United States)

    Green, Lawrence L.

    2013-01-01

    A statistical analysis of runway incursion (RI) events was conducted to ascertain their relevance to the top ten challenges of the National Aeronautics and Space Administration Aviation Safety Program (AvSP). The RI database was found to contain data that may be relevant to several of the AvSP top ten challenges. When combined with other data from the FAA documenting air traffic volume from calendar year 2000 through 2011, the structure of a predictive model emerges that can be used to forecast the frequency of RI events at various airports for various classes of aircraft and under various environmental conditions.
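
    One plausible concrete form for the predictive model sketched above is a Poisson regression of RI counts with air-traffic volume as exposure; everything below (variable names, coefficients, data) is invented for illustration and is not taken from the RI database.

    ```python
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(2)
    n = 200
    volume = rng.uniform(1e4, 5e5, n)            # operations per airport-year
    heavy = rng.integers(0, 2, n)                # hypothetical aircraft class
    rate = np.exp(-9.0 + 0.4 * heavy)            # incursions per operation
    counts = rng.poisson(rate * volume)

    # Poisson GLM with log(volume) as the exposure offset.
    X = sm.add_constant(heavy.astype(float))
    model = sm.GLM(counts, X, family=sm.families.Poisson(),
                   offset=np.log(volume)).fit()
    print(model.params)                          # recovers ~(-9.0, 0.4)
    ```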

  6. Essentials of multivariate data analysis

    CERN Document Server

    Spencer, Neil H

    2013-01-01

    ""… this text provides an overview at an introductory level of several methods in multivariate data analysis. It contains in-depth examples from one data set woven throughout the text, and a free [Excel] Add-In to perform the analyses in Excel, with step-by-step instructions provided for each technique. … could be used as a text (possibly supplemental) for courses in other fields where researchers wish to apply these methods without delving too deeply into the underlying statistics.""-The American Statistician, February 2015

  7. Visualizing data for environmental analysis

    Energy Technology Data Exchange (ETDEWEB)

    Benson, J.

    1997-04-01

    The Environmental Restoration Project at Los Alamos National Laboratory (LANL) has over 11,000 sampling locations in a 44 square mile area. The sample analyses contain raw analytical chemistry values for over 2,300 analytes and compounds used to define and remediate contaminated areas at LANL. The data consist of 2.5 million records in an Oracle database. Maps are often used to visualize the data. Problems arise when a client specifies a particular kind of map without fully understanding the limitations of the data or the map. The ability of maps to convey information is dependent on many factors, though all maps are data dependent. The quantity, spatial distribution, and numerical range of the data can limit use with certain kinds of maps. To address these issues and educate the clients, several types of statistical maps (e.g., choropleth, isarithm, and graduated symbol such as bubble and spike) used for environmental analysis were chosen to show the advantages, disadvantages, and data limitations of each. By examining both the complexity of the analytical data and the limitations of the map type, it is possible to consider how reality has been transformed through the map, and if that transformation accurately conveys the information present.

  8. Visualizing data for environmental analysis

    International Nuclear Information System (INIS)

    Benson, J.

    1997-01-01

    The Environmental Restoration Project at Los Alamos National Laboratory (LANL) has over 11,000 sampling locations in a 44 square mile area. The sample analyses contain raw analytical chemistry values for over 2,300 analytes and compounds used to define and remediate contaminated areas at LANL. The data consist of 2.5 million records in an Oracle database. Maps are often used to visualize the data. Problems arise when a client specifies a particular kind of map without fully understanding the limitations of the data or the map. The ability of maps to convey information is dependent on many factors, though all maps are data dependent. The quantity, spatial distribution, and numerical range of the data can limit use with certain kinds of maps. To address these issues and educate the clients, several types of statistical maps (e.g., choropleth, isarithm, and graduated symbol such as bubble and spike) used for environmental analysis were chosen to show the advantages, disadvantages, and data limitations of each. By examining both the complexity of the analytical data and the limitations of the map type, it is possible to consider how reality has been transformed through the map, and if that transformation accurately conveys the information present
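
    As a concrete illustration of one of the map types discussed in the two records above, the sketch below draws a graduated-symbol (bubble) map from invented sampling data; the coordinates and concentrations are hypothetical, not LANL data.

    ```python
    import numpy as np
    import matplotlib.pyplot as plt

    rng = np.random.default_rng(3)
    x, y = rng.uniform(0, 7, (2, 300))           # invented site coordinates
    conc = rng.lognormal(mean=1.0, sigma=1.5, size=300)

    # Symbol area scales with concentration; a heavy-tailed numerical
    # range like this is exactly what can limit a map type's usefulness.
    plt.scatter(x, y, s=400 * conc / conc.max(), alpha=0.4, edgecolors="k")
    plt.xlabel("easting (miles)")
    plt.ylabel("northing (miles)")
    plt.title("Graduated-symbol map of analyte concentration")
    plt.show()
    ```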

  9. Programs for nuclear data analysis

    International Nuclear Information System (INIS)

    Bell, R.A.I.

    1975-01-01

    The following report details a number of programs and subroutines which are useful for analysis of data from nuclear physics experiments. Most of them are available from pool pack 005 on the IBM1800 computer. All of these programs are stored there as core loads, and the subroutines and functions in relocatable format. The nature and location of other programs are specified as appropriate. (author)

  10. Intermittency analysis of correlated data

    International Nuclear Information System (INIS)

    Wosiek, B.

    1992-01-01

    We describe a method for analysing the dependence of the factorial moments on the bin size that takes into account the correlations between the moments computed for different bin sizes. For large-multiplicity nucleus-nucleus data, including the correlations does not change the value of the slope parameter, but gives errors significantly reduced as compared to fits with no correlations. (author)
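
    For readers unfamiliar with the quantity being fitted, the sketch below computes scaled factorial moments F_q as a function of the number of bins for synthetic, uncorrelated events; the paper's actual contribution, fitting the bin-size dependence with the covariance between moments included, is not reproduced here.

    ```python
    import numpy as np

    def factorial_moment(counts, q):
        """Bin-averaged unnormalised moment n(n-1)...(n-q+1)."""
        f = np.ones_like(counts, dtype=float)
        for k in range(q):
            f *= counts - k
        return f.mean()

    def scaled_F(events, n_bins, q=2):
        counts, _ = np.histogram(events, bins=n_bins, range=(0.0, 1.0))
        return factorial_moment(counts, q) / counts.mean() ** q

    rng = np.random.default_rng(4)
    events = rng.random(5_000)          # flat, uncorrelated phase space
    for M in (2, 4, 8, 16, 32, 64):
        print(M, scaled_F(events, M))
    # F_2 stays ~1 at all bin sizes here; intermittency would appear as a
    # power-law rise of F_q with decreasing bin size. The F_q values at
    # different M come from the same events, which is exactly why the fit
    # must account for their mutual correlations.
    ```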

  11. TFTR Experimental Data Analysis Collaboration

    International Nuclear Information System (INIS)

    Callen, J.D.

    1993-01-01

    The research performed under the second year of this three-year grant has concentrated on a few key TFTR experimental data analysis issues: MHD mode identification and effects on supershots; identification of new MHD modes; MHD mode theory-experiment comparisons; local electron heat transport inferred from impurity-induced cool pulses; and some other topics. Progress in these areas and activities undertaken in conjunction with this grant are summarized briefly in this report

  12. Distributed Data Analysis in ATLAS

    CERN Document Server

    Nilsson, P; The ATLAS collaboration

    2012-01-01

    Data analysis using grid resources is one of the fundamental challenges to be addressed before the start of LHC data taking. The ATLAS detector will produce petabytes of data per year, and roughly one thousand users will need to run physics analyses on this data. Appropriate user interfaces and helper applications have been made available to ensure that the grid resources can be used without requiring expertise in grid technology. These tools enlarge the number of grid users from a few production administrators to potentially all participating physicists. ATLAS makes use of three grid infrastructures for the distributed analysis: the EGEE sites, the Open Science Grid, and NorduGrid. These grids are managed by the gLite workload management system, the PanDA workload management system, and ARC middleware; many sites can be accessed via both the gLite WMS and PanDA. Users can choose between two front-end tools to access the distributed resources. Ganga is a tool co-developed with LHCb to provide a common interfa...

  13. Advances in Moessbauer data analysis

    International Nuclear Information System (INIS)

    Souza, Paulo A. de

    1998-01-01

    The Moessbauer community as a whole has generated a huge amount of data in several fields of human knowledge since the first publication by Rudolf Moessbauer. Interlaboratory measurements of the same substance may result in minor differences in the Moessbauer parameters (MP) of isomer shift, quadrupole splitting and internal magnetic field. Therefore, a conventional data bank of published MP will be of limited help in the identification of substances. A data bank search for exact values cannot differentiate Moessbauer parameters that agree within the experimental errors (e.g., IS = 0.22 mm/s versus IS = 0.23 mm/s), although physically both values may be considered the same. An artificial neural network (ANN) is able to identify a substance and its crystalline structure from measured MP, and slight variations in the parameters do not represent an obstacle for the ANN identification. A barrier to the popularization of Moessbauer spectroscopy as an analytical technique is the absence of fully automated equipment, since the analysis of a Moessbauer spectrum is normally time-consuming and requires a specialist. In this work, the fitting of a Moessbauer spectrum was completely automated through the use of genetic algorithms and fuzzy logic. Both software and hardware systems were implemented, yielding a fully automated Moessbauer data analysis system. The developed system will be presented
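
    A toy version of the tolerance-aware matching that motivates the ANN approach: score each reference entry by an error-weighted distance to the measured parameters (IS, QS, B_hf) instead of demanding an exact look-up. All substances, parameter values, and error estimates below are invented.

    ```python
    import numpy as np

    reference = {                        # hypothetical data bank entries
        "alpha-Fe": (0.00, 0.00, 33.0),
        "hematite": (0.37, -0.20, 51.7),
        "magnetite (A site)": (0.28, 0.00, 49.0),
    }
    sigma = np.array([0.02, 0.02, 0.5])  # assumed experimental errors

    def identify(measured):
        """Return the entry with the smallest error-weighted distance."""
        m = np.asarray(measured)
        name, _ = min(reference.items(),
                      key=lambda kv: np.sum(((m - kv[1]) / sigma) ** 2))
        return name

    print(identify((0.36, -0.21, 51.5)))   # -> "hematite"
    ```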

  14. PATTER, Pattern Recognition Data Analysis

    International Nuclear Information System (INIS)

    Cox, L.C. Jr.; Bender, C.F.

    1986-01-01

    1 - Description of program or function: PATTER is an interactive program with extensive facilities for modeling analytical processes and solving complex data analysis problems using statistical methods, spectral analysis, and pattern recognition techniques. PATTER addresses the type of problem generally stated as follows: given a set of objects and a list of measurements made on these objects, is it possible to find or predict a property of the objects which is not directly measurable but is known to define some unknown relationship? When employed intelligently, PATTER will act upon a data set in such a way that it becomes apparent whether useful information, beyond that already discerned, is contained in the data. 2 - Method of solution: In order to solve the general problem, PATTER contains preprocessing techniques to produce new variables that are related to the values of the measurements, which may reduce the number of variables and/or reveal useful information about the 'obscure' property; display techniques to represent the variable space in some way that can be easily projected onto a two- or three-dimensional plot for human observation to see if any significant clustering of points occurs; and learning techniques based on both unsupervised and supervised methods, to extract as much information from the data as possible so that the optimum solution can be found
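
    The display step described above, projecting the variable space onto a two-dimensional plot for visual cluster inspection, corresponds to a principal-component projection. A minimal sketch with synthetic data (this is not PATTER itself):

    ```python
    import numpy as np

    def pca_project(X, k=2):
        """Project rows of X onto their first k principal components."""
        Xc = X - X.mean(axis=0)
        # SVD of the centred data; rows of Vt are principal directions
        _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
        return Xc @ Vt[:k].T

    # Hypothetical objects-by-measurements table with two hidden groups.
    rng = np.random.default_rng(5)
    group_a = rng.normal(0.0, 1.0, (50, 10))
    group_b = rng.normal(1.5, 1.0, (50, 10))
    scores = pca_project(np.vstack([group_a, group_b]))
    # Plotting scores[:, 0] against scores[:, 1] reveals the two clusters.
    print(scores[:3])
    ```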

  15. Plasma data analysis using statistical analysis system

    International Nuclear Information System (INIS)

    Yoshida, Z.; Iwata, Y.; Fukuda, Y.; Inoue, N.

    1987-01-01

    Multivariate factor analysis has been applied to a plasma data base of REPUTE-1. The characteristics of the reverse field pinch plasma in REPUTE-1 are shown to be explained by four independent parameters which are described in the report. The well-known scaling laws F_χ ∝ I_p, T_e ∝ I_p, and τ_E ∝ N_e are also confirmed. 4 refs., 8 figs., 1 tab

  16. Structural Dynamics and Data Analysis

    Science.gov (United States)

    Luthman, Briana L.

    2013-01-01

    This project consists of two parts: the first is the post-flight analysis of data from a Delta IV launch vehicle, and the second is a Finite Element Analysis of a CubeSat. Shock and vibration data were collected on WGS-5 (Wideband Global SATCOM-5), which was launched on a Delta IV launch vehicle. Using CAM (CAlculation with Matrices) software, the data are to be plotted into Time History, Shock Response Spectrum, and SPL (Sound Pressure Level) curves. In this format the data are to be reviewed and compared to flight instrumentation data from previous flights of the same launch vehicle. This is done to ensure the current mission environments, such as shock, random vibration, and acoustics, are not out of family with existing flight experience. 'In family' means that the peaks on the SRS curve for WGS-5 are similar to the peaks from the previous flights and there are no major outliers. The curves from the data will then be compiled into a useful format so that they can be peer reviewed and then presented before an engineering review board if required. Also, the reviewed data will be uploaded to the Engineering Review Board Information System (ERBIS) for archiving. The second part of this project is a Finite Element Analysis of a CubeSat. In 2010, Merritt Island High School partnered with NASA to design, build and launch a CubeSat. The team is now called StangSat in honor of their mascot, the mustang. Over the past few years, the StangSat team has built a satellite and has now been manifested for flight on a SpaceX Falcon 9 launch in 2014. To prepare for the final launch, a test flight was conducted in Mojave, California. StangSat was launched on a Prospector 18D, a high-altitude rocket made by Garvey Spacecraft Corporation, along with their sister satellite CP9, built by California Polytechnic University. However, StangSat was damaged during an off-nominal landing, and this project will give beneficial insights into what loads the CubeSat experienced during the crash

  17. LISA Pathfinder instrument data analysis

    Science.gov (United States)

    Guzman, Felipe

    LISA Pathfinder (LPF) is an ESA-launched demonstration mission of key technologies required for the joint NASA-ESA gravitational wave observatory in space, LISA. As part of the LPF interferometry investigations, analytic models of noise sources and corresponding noise subtraction techniques have been developed to correct for effects like the coupling of test mass jitter into displacement readout, and fluctuations of the laser frequency or optical pathlength difference. Ground testing of pre-flight hardware of the Optical Metrology Subsystem is currently ongoing at the Albert Einstein Institute Hannover. In collaboration with NASA Goddard Space Flight Center, the LPF mission data analysis tool LTPDA is being used to analyze the data product of these tests. Furthermore, the noise subtraction techniques and in-flight experiment runs for noise characterization are being defined as part of the mission experiment master plan. We will present the data analysis outcome of pre-flight hardware ground tests and possible noise subtraction strategies for in-flight instrument operations.

  18. WFIRST: Microlensing Analysis Data Challenge

    Science.gov (United States)

    Street, Rachel; WFIRST Microlensing Science Investigation Team

    2018-01-01

    WFIRST will produce thousands of high-cadence, high-photometric-precision lightcurves of microlensing events, from which a wealth of planetary and stellar systems will be discovered. However, the analysis of such lightcurves has historically been very time consuming and expensive in both labor and computing facilities. This poses a potential bottleneck to deriving the full science potential of the WFIRST mission. To address this problem, the WFIRST Microlensing Science Investigation Team is designing a series of data challenges to stimulate research addressing outstanding problems of microlensing analysis. These range from the classification and modeling of triple-lens events to methods to efficiently yet thoroughly search a high-dimensional parameter space for the best-fitting models.
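
    As a toy version of the modelling problem the challenges address, the sketch below fits a single-lens (Paczynski) lightcurve by brute-force grid search; real challenge entries must search far larger parameter spaces (e.g. binary and triple lenses) much more efficiently. All numbers are invented.

    ```python
    import numpy as np

    def paczynski(t, t0, tE, u0):
        """Point-source point-lens magnification A(t)."""
        u2 = u0 ** 2 + ((t - t0) / tE) ** 2
        return (u2 + 2) / np.sqrt(u2 * (u2 + 4))

    # Simulated noisy lightcurve standing in for a survey event.
    rng = np.random.default_rng(6)
    t = np.linspace(0, 100, 2_000)
    flux = paczynski(t, 52.0, 11.0, 0.15) + rng.normal(0, 0.02, t.size)

    # Coarse 3-D grid scan over (t0, tE, u0).
    grid = [(t0, tE, u0)
            for t0 in np.arange(40, 60, 0.5)
            for tE in np.arange(5, 20, 0.5)
            for u0 in np.arange(0.05, 0.5, 0.01)]
    chi2 = [np.sum((flux - paczynski(t, *p)) ** 2) for p in grid]
    print(grid[int(np.argmin(chi2))])   # close to (52.0, 11.0, 0.15)
    ```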

  19. Data Analysis Methods for Paleogenomics

    DEFF Research Database (Denmark)

    Avila Arcos, Maria del Carmen

    The work presented in this thesis is the result of research carried out during a three-year PhD at the Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, under the supervision of Professor Tom Gilbert. The PhD was funded by the Danish National Research Foundation (Danmarks Grundforskningfond) 'Centre of Excellence in GeoGenetics' grant, with additional funding provided by the Danish Council for Independent Research 'Sapere Aude' programme. The thesis comprises five chapters, all of which represent different projects that involved the analysis of massive amounts of data, thanks to the introduction of NGS and the implementation of data analysis methods specific for each project. Chapters 1 to 3 have been published in peer-reviewed journals and Chapter 4 is currently in review. Chapter 5 consists of a manuscript describing initial results of an ongoing research project

  20. Probabilistic reasoning in data analysis.

    Science.gov (United States)

    Sirovich, Lawrence

    2011-09-20

    This Teaching Resource provides lecture notes, slides, and a student assignment for a lecture on probabilistic reasoning in the analysis of biological data. General probabilistic frameworks are introduced, and a number of standard probability distributions are described using simple intuitive ideas. Particular attention is focused on random arrivals that are independent of prior history (Markovian events), with an emphasis on waiting times, Poisson processes, and Poisson probability distributions. The use of these various probability distributions is applied to biomedical problems, including several classic experimental studies.
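
    The central link the lecture develops, memoryless (exponential) waiting times implying Poisson-distributed counts, can be checked numerically with a short simulation; the sketch below is illustrative and is not part of the Teaching Resource.

    ```python
    import numpy as np

    rng = np.random.default_rng(7)
    rate, T, trials = 3.0, 10.0, 100_000

    counts = np.empty(trials, dtype=int)
    for i in range(trials):
        # accumulate exponential inter-arrival times past the window end
        arrivals = np.cumsum(rng.exponential(1.0 / rate, size=int(3 * rate * T)))
        counts[i] = np.searchsorted(arrivals, T)   # arrivals within [0, T]

    # For a Poisson process both should be ~ rate * T = 30.
    print(counts.mean(), counts.var())
    ```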

  1. Analysis of DCA experimental data

    International Nuclear Information System (INIS)

    Min, B. J.; Kim, S. Y.; Ryu, S. J.; Seok, H. C.

    2000-01-01

    The lattice characteristics of DCA are calculated with the WIMS-ATR code to validate the WIMS-AECL code for the lattice analysis of the CANDU core, using experimental data from DCA at JNC. Analytical studies of some critical experiments were performed to analyze the effects of fuel composition. Different items of reactor physics such as the local power peaking factor (LPF), the effective multiplication factor (Keff) and the coolant void reactivity were calculated for two coolant void fractions (0% and 100%). LPFs calculated by the WIMS-ATR code are in close agreement with the experimental results. LPFs calculated by the WIMS-AECL code with the WINFRITH and ENDF/B-V libraries have similar values for both libraries, but the differences between the experimental data and the results calculated by the WIMS-AECL code are larger than those of the WIMS-ATR code. The maximum difference between the values calculated by WIMS-ATR and the experimental values of the LPFs is within 1.3%. The coupled code systems WIMS-ATR and CITATION used in this analysis predict Keff within 1% ΔK and coolant void reactivity within 4% ΔK/K in all cases. The coolant void reactivity of uranium fuel is found to be positive. To validate the WIMS-AECL code, the core characteristics of DCA shall be calculated by the WIMS-AECL and CITATION codes in the future

  2. On Survey Data Analysis in Corporate Finance

    OpenAIRE

    Serita, Toshio

    2008-01-01

    Recently, survey data analysis has emerged as a new method for testing hypotheses and for clarifying the relative importance of different factors in corporate finance decisions. This paper investigates the advantages and drawbacks of survey data analysis, the methodology of survey data analysis such as questionnaire design, and analytical methods for survey data, in comparison with traditional large sample analysis. We show that survey data analysis does not replace traditional large sample analysi...

  3. Analysis of Hydrologic Properties Data

    Energy Technology Data Exchange (ETDEWEB)

    H. H. Liu

    2003-04-03

    This Model Report describes the methods used to determine hydrologic properties based on the available field data from the unsaturated zone (UZ) at Yucca Mountain, Nevada, and documents validation of the active fracture model (AFM). This work was planned in "Technical Work Plan (TWP) for: Performance Assessment Unsaturated Zone" (BSC 2002 [160819], Sections 1.10.2, 1.10.3, and 1.10.8). Fracture and matrix properties are developed by analyzing available survey data from the Exploratory Studies Facility (ESF), Cross Drift for Enhanced Characterization of Repository Block (ECRB), and/or boreholes; air injection testing data from surface boreholes and from boreholes in the ESF; and data from laboratory testing of core samples. The AFM is validated on the basis of experimental observations and theoretical developments. This report is a revision of an Analysis Model Report, under the same title, issued as a scientific analysis with Document Identifier number ANL-NBS-HS-000002 (BSC 2001 [159725]) that did not document activities to validate the AFM. The principal purpose of this work is to provide representative uncalibrated estimates of fracture and matrix properties for use in the model report "Calibrated Properties Model" (BSC 2003 [160240]). The present work also provides fracture geometry properties for generating dual-permeability grids as documented in the Scientific Analysis Report "Development of Numerical Grids for UZ Flow and Transport Modeling" (BSC 2003 [160109]). The resulting calibrated property sets and numerical grids from these reports will be used in the Unsaturated Zone Flow and Transport Process Model (UZ Model) and Total System Performance Assessment (TSPA) models. The fracture and matrix properties developed in this Model Report include: (1) Fracture properties (frequency, permeability, van Genuchten α and m parameters, aperture, porosity, and interface area) for each UZ Model layer; (2

  4. Data Decision Analysis: Project Shoal

    Energy Technology Data Exchange (ETDEWEB)

    Forsgren, Frank; Pohll, Greg; Tracy, John

    1999-01-01

    The purpose of this study was to determine the most appropriate field activities in terms of reducing the uncertainty in the groundwater flow and transport model at the Project Shoal area. The data decision analysis relied on well-known tools of statistics and uncertainty analysis. This procedure identified nine parameters that were deemed uncertain. These included effective porosity, hydraulic head, surface recharge, hydraulic conductivity, fracture correlation scale, fracture orientation, dip angle, dissolution rate of radionuclides from the puddle glass, and the retardation coefficient, which describes the sorption characteristics. The parameter uncertainty was described by assigning prior distributions for each of these parameters. Next, the various field activities were identified that would provide additional information on these parameters. Each of the field activities was evaluated by an expert panel to estimate posterior distribution of the parameters assuming a field activity was performed. The posterior distributions describe the ability of the field activity to estimate the true value of the nine parameters. Monte Carlo techniques were used to determine the current uncertainty, the reduction of uncertainty if a single parameter was known with certainty, and the reduction of uncertainty expected from each field activity on the model predictions. The mean breakthrough time to the downgradient land withdrawal boundary and the peak concentration at the control boundary were used to evaluate the uncertainty reduction. The radionuclide 137Cs was used as the reference solute, as its migration is dependent on all of the parameters. The results indicate that the current uncertainty of the model yields a 95 percent confidence interval between 42 and 1,412 years for the mean breakthrough time and an 18 order-of-magnitude range in peak concentration. The uncertainty in effective porosity and recharge dominates the uncertainty in the model predictions, while the

  5. Analysis of event-mode data with Interactive Data Language

    International Nuclear Information System (INIS)

    De Young, P.A.; Hilldore, B.B.; Kiessel, L.M.; Peaslee, G.F.

    2003-01-01

    We have developed an analysis package for event-mode data based on Interactive Data Language (IDL) from Research Systems Inc. This high-level language is high speed, array oriented, object oriented, and has extensive visual (multi-dimensional plotting) and mathematical functions. We have developed a general framework, written in IDL, for the analysis of a variety of experimental data that does not require significant customization for each analysis. Unlike many traditional analysis packages, spectra and gates are applied after the data are read and are easily changed as the analysis proceeds, without rereading the data. The events are not sequentially processed into predetermined arrays subject to predetermined gates
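
    In outline, the read-once, gate-later workflow described above looks like the following (a Python stand-in for the IDL package; the field names are invented):

    ```python
    import numpy as np

    rng = np.random.default_rng(8)
    events = {                                   # event list, read once
        "E": rng.normal(660, 30, 100_000),       # detector energy
        "tof": rng.uniform(0, 200, 100_000),     # time of flight
    }

    def spectrum(events, gate, field="E", bins=256):
        """Histogram one parameter for the events passing `gate`."""
        return np.histogram(events[field][gate], bins=bins)

    gate = (events["tof"] > 50) & (events["tof"] < 120)    # first attempt
    counts, edges = spectrum(events, gate)

    gate = (events["tof"] > 60) & (events["tof"] < 110)    # tightened gate;
    counts, edges = spectrum(events, gate)                 # no re-read needed
    ```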

  6. Selected topics on data analysis in astronomy

    International Nuclear Information System (INIS)

    Scarsi, L.

    1987-01-01

    The contents of this book are: General Lectures Given at the Erice II Workshop on Data Analysis in Astronomy: Fundamentals in Data Analysis in Astronomy; Computational Techniques; Evolution of Architectures for Data Processing; Hardware for Graphics and Image Display; and Data Analysis Systems

  7. DataSHIELD: taking the analysis to the data, not the data to the analysis.

    Science.gov (United States)

    Gaye, Amadou; Marcon, Yannick; Isaeva, Julia; LaFlamme, Philippe; Turner, Andrew; Jones, Elinor M; Minion, Joel; Boyd, Andrew W; Newby, Christopher J; Nuotio, Marja-Liisa; Wilson, Rebecca; Butters, Oliver; Murtagh, Barnaby; Demir, Ipek; Doiron, Dany; Giepmans, Lisette; Wallace, Susan E; Budin-Ljøsne, Isabelle; Oliver Schmidt, Carsten; Boffetta, Paolo; Boniol, Mathieu; Bota, Maria; Carter, Kim W; deKlerk, Nick; Dibben, Chris; Francis, Richard W; Hiekkalinna, Tero; Hveem, Kristian; Kvaløy, Kirsti; Millar, Sean; Perry, Ivan J; Peters, Annette; Phillips, Catherine M; Popham, Frank; Raab, Gillian; Reischl, Eva; Sheehan, Nuala; Waldenberger, Melanie; Perola, Markus; van den Heuvel, Edwin; Macleod, John; Knoppers, Bartha M; Stolk, Ronald P; Fortier, Isabel; Harris, Jennifer R; Woffenbuttel, Bruce H R; Murtagh, Madeleine J; Ferretti, Vincent; Burton, Paul R

    2014-12-01

    Research in modern biomedicine and social science requires sample sizes so large that they can often only be achieved through a pooled co-analysis of data from several studies. But the pooling of information from individuals in a central database that may be queried by researchers raises important ethico-legal questions and can be controversial. In the UK this has been highlighted by recent debate and controversy relating to the UK's proposed 'care.data' initiative, and these issues reflect important societal and professional concerns about privacy, confidentiality and intellectual property. DataSHIELD provides a novel technological solution that can circumvent some of the most basic challenges in facilitating the access of researchers and other healthcare professionals to individual-level data. Commands are sent from a central analysis computer (AC) to several data computers (DCs) storing the data to be co-analysed. The data sets are analysed simultaneously but in parallel. The separate parallelized analyses are linked by non-disclosive summary statistics and commands transmitted back and forth between the DCs and the AC. This paper describes the technical implementation of DataSHIELD using a modified R statistical environment linked to an Opal database deployed behind the computer firewall of each DC. Analysis is controlled through a standard R environment at the AC. Based on this Opal/R implementation, DataSHIELD is currently used by the Healthy Obese Project and the Environmental Core Project (BioSHaRE-EU) for the federated analysis of 10 data sets across eight European countries, and this illustrates the opportunities and challenges presented by the DataSHIELD approach. DataSHIELD facilitates important research in settings where: (i) a co-analysis of individual-level data from several studies is scientifically necessary but governance restrictions prohibit the release or sharing of some of the required data, and/or render data access unacceptably slow; (ii) a
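
    In outline, the DataSHIELD arithmetic reduces to each data computer (DC) transmitting only non-disclosive summaries that the analysis computer (AC) combines. The toy sketch below illustrates this for a pooled mean and variance; real DataSHIELD runs R against Opal databases with far stricter disclosure controls.

    ```python
    site_data = {                    # held at the DCs; never transmitted
        "site_A": [4.1, 5.3, 6.0, 5.5],
        "site_B": [3.8, 4.9, 5.1],
        "site_C": [6.2, 5.8, 6.5, 6.1, 5.9],
    }

    def dc_summary(values, min_n=3):
        """What a DC transmits: n, sum, sum of squares (refused when the
        cell count is small enough to be disclosive)."""
        if len(values) < min_n:
            raise ValueError("summary blocked: potential disclosure")
        return len(values), sum(values), sum(v * v for v in values)

    summaries = [dc_summary(v) for v in site_data.values()]
    n = sum(s[0] for s in summaries)
    total = sum(s[1] for s in summaries)
    ssq = sum(s[2] for s in summaries)
    mean = total / n
    var = (ssq - n * mean ** 2) / (n - 1)
    print(mean, var)                 # pooled estimates; no raw data moved
    ```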

  8. Qualitative data analysis: conceptual and practical considerations.

    Science.gov (United States)

    Liamputtong, Pranee

    2009-08-01

    Qualitative inquiry requires that collected data is organised in a meaningful way, and this is referred to as data analysis. Through analytic processes, researchers turn what can be voluminous data into understandable and insightful analysis. This paper sets out the different approaches that qualitative researchers can use to make sense of their data including thematic analysis, narrative analysis, discourse analysis and semiotic analysis and discusses the ways that qualitative researchers can analyse their data. I first discuss salient issues in performing qualitative data analysis, and then proceed to provide some suggestions on different methods of data analysis in qualitative research. Finally, I provide some discussion on the use of computer-assisted data analysis.

  9. Functional Analysis of Metabolomics Data.

    Science.gov (United States)

    Chagoyen, Mónica; López-Ibáñez, Javier; Pazos, Florencio

    2016-01-01

    Metabolomics aims at characterizing the repertory of small chemical compounds in a biological sample. As the field becomes more data-intensive and larger sets of compounds are detected, a functional analysis is required to convert these raw lists of compounds into biological knowledge. The most common way of performing such analysis is "annotation enrichment analysis," also used in transcriptomics and proteomics. This approach extracts the annotations overrepresented in the set of chemical compounds arising in a given experiment. Here, we describe the protocols for performing such analysis as well as for visualizing a set of compounds in different representations of the metabolic networks, in both cases using freely accessible web tools.
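
    Annotation enrichment analysis of the kind described typically reduces to a one-tailed hypergeometric (Fisher) test per annotation. A minimal sketch with invented counts:

    ```python
    from scipy.stats import hypergeom

    M = 2_000   # annotated compounds in the background metabolome
    n = 60      # background compounds carrying this pathway annotation
    N = 80      # compounds detected in the experiment
    k = 12      # detected compounds carrying the annotation

    # P(X >= k) under random sampling without replacement
    p = hypergeom.sf(k - 1, M, n, N)
    print(p)    # a small p suggests the annotation is overrepresented
    # In practice the test is repeated for every annotation and corrected
    # for multiple testing (e.g. Benjamini-Hochberg).
    ```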

  10. Clinical trial data analysis using R

    National Research Council Canada - National Science Library

    Chen, Ding-Geng; Peace, Karl E

    2011-01-01

    .... Case studies demonstrate how to select the appropriate clinical trial data. The authors introduce the corresponding biostatistical analysis methods, followed by the step-by-step data analysis using R...

  11. Statistical analysis of medical data using SAS

    CERN Document Server

    Der, Geoff

    2005-01-01

    An Introduction to SAS; Describing and Summarizing Data; Basic Inference; Scatterplots, Correlation, Simple Regression and Smoothing; Analysis of Variance and Covariance; Multiple Regression; Logistic Regression; The Generalized Linear Model; Generalized Additive Models; Nonlinear Regression Models; The Analysis of Longitudinal Data I; The Analysis of Longitudinal Data II: Models for Normal Response Variables; The Analysis of Longitudinal Data III: Non-Normal Response; Survival Analysis; Analysis of Multivariate Data: Principal Components and Cluster Analysis; References

  12. Perspectives on spatial data analysis

    CERN Document Server

    Rey, Sergio

    2010-01-01

    This book takes both a retrospective and prospective view of the field of spatial analysis by combining selected reprints of classic articles by Arthur Getis with current observations by leading experts in the field. Four main aspects are highlighted, dealing with spatial analysis, pattern analysis, local statistics as well as illustrative empirical applications. Researchers and students will gain an appreciation of Getis' methodological contributions to spatial analysis and the broad impact of the methods he has helped pioneer on an impressively broad array of disciplines including spatial epidemiology, demography, economics, and ecology. The volume is a compilation of high impact original contributions, as evidenced by citations, and the latest thinking on the field by leading scholars. This makes the book ideal for advanced seminars and courses in spatial analysis as well as a key resource for researchers seeking a comprehensive overview of recent advances and future directions in the field.

  13. International Data & Economic Analysis (IDEA)

    Data.gov (United States)

    US Agency for International Development — International Data: UN Food and Agriculture Organization, Food Price Index; IMF, Direction of Trade Statistics; Millennium Challenge Corporation; and World Bank,...

  14. A Probabilistic Analysis of Data Popularity in ATLAS Data Caching

    CERN Document Server

    Titov, M; The ATLAS collaboration; Záruba, G; De, K

    2012-01-01

    Efficient distribution of physics data over ATLAS grid sites is one of the most important tasks for user data processing. ATLAS' initial static data distribution model over-replicated some unpopular data and under-replicated popular data, creating heavy disk space loads while underutilizing some processing resources due to low data availability. Thus, a new data distribution mechanism, PD2P (PanDA Dynamic Data Placement), was implemented within the production and distributed analysis system PanDA; it dynamically reacts to user data needs, basing dataset distribution principally on user demand. Data deletion is also demand driven, reducing replica counts for unpopular data. This dynamic model has led to substantial improvements in the efficient utilization of storage and processing resources. Based on this experience, in this work we seek to further improve the data placement policy by investigating in detail how data popularity is calculated. For this it is necessary to precisely define what data popularity means, wh...

  15. Status of MTP Data Analysis for TCSP

    Science.gov (United States)

    Mahoney, Michael J.

    2006-01-01

    Topics covered include: a) MTP temperature calibration and data analysis; b) Background for interpreting MTP data; c) Large amplitude temperature structure; d) Gravity waves (GWs) in MTP data; and e) Subsidence over hurricanes.

  16. Statistical analysis and data management

    International Nuclear Information System (INIS)

    Anon.

    1981-01-01

    This report provides an overview of the history of the WIPP Biology Program. The recommendations of the American Institute of Biological Sciences (AIBS) for the WIPP biology program are summarized. The data sets available for statistical analyses and the problems associated with these data sets are also summarized. Biological studies base maps are presented. A statistical model is presented to evaluate any correlation between climatological data and small mammal captures. No statistically significant relationship was found between the variance in small mammal captures on Dr. Gennaro's 90m x 90m grid and precipitation records from the Duval Potash Mine

  17. Accounting and Financial Data Analysis Data Mining Tools

    Directory of Open Access Journals (Sweden)

    Diana Elena Codreanu

    2011-05-01

    Full Text Available. Computerized accounting systems have seen an increase in complexity in recent years due to the competitive economic environment, but with the help of data analysis solutions such as OLAP and Data Mining, multidimensional data analysis can be performed, fraud can be detected, and knowledge hidden in data can be discovered, ensuring that such information is useful for decision making within the organization. In the literature there are many definitions of data mining, but they all boil down to the same idea: the process of extracting new information from large data collections, information that would be very difficult to obtain without the aid of data mining tools. Information obtained by the data mining process has the advantage that it not only answers the question of what is happening but at the same time argues and shows why certain things are happening. In this paper we present advanced techniques for the analysis and exploitation of data stored in a multidimensional database.

  18. Modeling data irregularities and structural complexities in data envelopment analysis

    CERN Document Server

    Zhu, Joe

    2007-01-01

    In a relatively short period of time, Data Envelopment Analysis (DEA) has grown into a powerful quantitative, analytical tool for measuring and evaluating performance. It has been successfully applied to a whole variety of problems in many different contexts worldwide. This book deals with the micro aspects of handling and modeling data issues in modeling DEA problems. DEA's use has grown with its capability of dealing with complex "service industry" and the "public service domain" types of problems that require modeling of both qualitative and quantitative data. This handbook treatment deals with specific data problems including: imprecise or inaccurate data; missing data; qualitative data; outliers; undesirable outputs; quality data; statistical analysis; software and other data aspects of modeling complex DEA problems. In addition, the book will demonstrate how to visualize DEA results when the data is more than 3-dimensional, and how to identify efficiency units quickly and accurately.

  19. 2nd European Conference on Data Analysis

    CERN Document Server

    Wilhelm, Adalbert FX

    2016-01-01

    This book offers a snapshot of the state-of-the-art in classification at the interface between statistics, computer science and application fields. The contributions span a broad spectrum, from theoretical developments to practical applications; they all share a strong computational component. The topics addressed are from the following fields: Statistics and Data Analysis; Machine Learning and Knowledge Discovery; Data Analysis in Marketing; Data Analysis in Finance and Economics; Data Analysis in Medicine and the Life Sciences; Data Analysis in the Social, Behavioural, and Health Care Sciences; Data Analysis in Interdisciplinary Domains; Classification and Subject Indexing in Library and Information Science. The book presents selected papers from the Second European Conference on Data Analysis, held at Jacobs University Bremen in July 2014. This conference unites diverse researchers in the pursuit of a common topic, creating truly unique synergies in the process.

  20. Bayesian Data Analysis (lecture 2)

    CERN Multimedia

    CERN. Geneva

    2018-01-01

    framework but we will also go into more detail and discuss for example the role of the prior. The second part of the lecture will cover further examples and applications that heavily rely on the bayesian approach, as well as some computational tools needed to perform a bayesian analysis.

  1. Bayesian Data Analysis (lecture 1)

    CERN Multimedia

    CERN. Geneva

    2018-01-01

    framework but we will also go into more detail and discuss for example the role of the prior. The second part of the lecture will cover further examples and applications that heavily rely on the bayesian approach, as well as some computational tools needed to perform a bayesian analysis.

  2. Analysis of repeated measures data

    CERN Document Server

    Islam, M Ataharul

    2017-01-01

    This book presents a broad range of statistical techniques to address emerging needs in the field of repeated measures. It also provides a comprehensive overview of extensions of generalized linear models for the bivariate exponential family of distributions, which represent a new development in analysing repeated measures data. The demand for statistical models for correlated outcomes has grown rapidly recently, mainly due to the presence of two types of underlying associations: associations between outcomes, and associations between explanatory variables and outcomes. The book systematically addresses key problems arising in the modelling of repeated measures data, bearing in mind those factors that play a major role in estimating the underlying relationships between covariates and outcome variables for correlated outcome data. In addition, it presents new approaches to addressing current challenges in the field of repeated measures and models based on conditional and joint probabilities. Markov models of first...

  3. An Automated Data Analysis Tool for Livestock Market Data

    Science.gov (United States)

    Williams, Galen S.; Raper, Kellie Curry

    2011-01-01

    This article describes an automated data analysis tool that allows Oklahoma Cooperative Extension Service educators to disseminate results in a timely manner. Primary data collected at Oklahoma Quality Beef Network (OQBN) certified calf auctions across the state result in a large amount of data per sale site. Sale summaries for an individual sale…

  4. Expediting Scientific Data Analysis with Reorganization of Data

    Energy Technology Data Exchange (ETDEWEB)

    Byna, Surendra; Wu, Kesheng

    2013-08-19

    Data producers typically optimize the layout of data files to minimize the write time. In most cases, data analysis tasks read these files in access patterns different from the write patterns, causing poor read performance. In this paper, we introduce Scientific Data Services (SDS), a framework for bridging the performance gap between writing and reading scientific data. SDS reorganizes data to match the read patterns of analysis tasks and enables transparent data reads from the reorganized data. We implemented an HDF5 Virtual Object Layer (VOL) plugin to redirect the HDF5 dataset read calls to the reorganized data. To demonstrate the effectiveness of SDS, we applied two parallel data organization techniques: a sort-based organization on plasma physics data and a transpose-based organization on mass spectrometry imaging data. We also extended the HDF5 data access API to allow selection of data based on their values through a query interface, called SDS Query. We evaluated the execution time in accessing various subsets of data through the existing HDF5 read API and SDS Query. We showed that reading the reorganized data using SDS is up to 55X faster than reading the original data.
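
    The gist of the sort-based reorganisation can be shown in a few lines: order the data on the queried variable once, and a value-range query becomes two binary searches instead of a full scan. This is a minimal stand-alone sketch, not the SDS/HDF5 implementation.

    ```python
    import numpy as np

    rng = np.random.default_rng(9)
    energy = rng.random(5_000_000).astype(np.float32)   # write-order data

    order = np.argsort(energy)        # one-off reorganisation step
    sorted_energy = energy[order]     # (SDS also rewrites payload arrays)

    def range_query_sorted(lo, hi):
        i = np.searchsorted(sorted_energy, lo, side="left")
        j = np.searchsorted(sorted_energy, hi, side="right")
        return order[i:j]             # record indices, O(log n + hits)

    def range_query_scan(lo, hi):
        return np.nonzero((energy >= lo) & (energy <= hi))[0]   # O(n)

    hits = range_query_sorted(0.25, 0.2501)   # same records as the scan,
                                              # touching far fewer elements
    ```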

  5. Automatic analysis of ultrasonic data

    International Nuclear Information System (INIS)

    Horteur, P.; Colin, J.; Benoist, P.; Bonis, M.; Paradis, L.

    1986-10-01

    This paper describes an automatic, self-contained data processing system, transportable on site, able to produce images such as "A-scan" and "B-scan" displays and to present inspection results very quickly. It can be used for pressure vessel inspection.

  6. A Probabilistic Analysis of Data Popularity in ATLAS Data Caching

    International Nuclear Information System (INIS)

    Titov, M; Záruba, G; De, K; Klimentov, A

    2012-01-01

    One of the most important aspects of any distributed computing system is efficient data replication across storage and computing centers, which guarantees high data availability and low resource-utilization cost. In this paper we propose a data distribution scheme for PanDA, the production and distributed analysis system of the ATLAS experiment. Our proposed scheme is based on an investigation of data usage. The paper therefore focuses on the main concepts of data popularity in the PanDA system and their utilization. Data popularity is represented as a set of parameters that are used to predict the future state of data in terms of popularity levels.
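
    As a loose illustration of "popularity" as a predictive parameter, a minimal sketch (not the actual PanDA metric) is an exponentially decayed count of past accesses, so that recent use dominates the score:

        # Popularity as a half-life-weighted sum of past accesses (illustrative).
        import math

        def popularity(access_days_ago, half_life=30.0):
            lam = math.log(2) / half_life
            return sum(math.exp(-lam * d) for d in access_days_ago)

        print(round(popularity([1, 2, 3, 45, 90]), 3))   # recent accesses dominate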

  7. Open Data and Data Analysis Preservation Services for LHC Experiments

    CERN Document Server

    Cowton, J; Fokianos, P; Rueda, L; Herterich, P; Kunčar, J; Šimko, T; Smith, T

    2015-01-01

    In this paper we present newly launched services for open data and for long-term preservation and reuse of high-energy-physics data analyses, based on the digital library software Invenio. We track the "data continuum" practices through several progressive data analysis phases up to the final publication. The aim is to capture for subsequent generations all digital assets and associated knowledge inherent in the data analysis process, and to make a subset available rapidly to the public. The ultimate goal of the analysis preservation platform is to capture enough information about the processing steps to facilitate reproduction of an analysis even many years after its initial publication, making it possible to extend the impact of preserved analyses through future revalidation and recasting services. A related "open data" service was launched for the benefit of the general public.

  8. Data Analysis for Explosive Firesets

    Energy Technology Data Exchange (ETDEWEB)

    Barks, Thomas A. [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)

    2015-07-30

    I analyzed the data from various detonators at different initial voltages to find the RLC values (resistance, inductance, capacitance) of the fireset. The data I was given contained a current and voltage for each time value, taken at nanosecond intervals. From this, I was able to make plots of several variables to try to find which, if any, of the variables correlated with a burst or a go. These results will allow us to fully understand what is required to achieve a burst in the bridgewire, so that we know what is safe and what will never cause detonation. We may also be able to predict the outcome of using the same fireset with different detonators or with different sizes or materials of bridgewire.

  9. EBT data acquisition and analysis system

    International Nuclear Information System (INIS)

    Burris, R.D.; Greenwood, D.E.; Stanton, J.S.; Geoffroy, K.A.

    1980-10-01

    This document describes the design and implementation of a data acquisition and analysis system for the EBT fusion experiment. The system includes data acquisition on five computers, automatic transmission of that data to a large, central data base, and a powerful data retrieval system. The system is flexible and easy to use, and it provides a fully documented record of the experiments.

  10. A SWOT Analysis of Big Data

    Science.gov (United States)

    Ahmadi, Mohammad; Dileepan, Parthasarati; Wheatley, Kathleen K.

    2016-01-01

    This is the decade of data analytics and big data, but not everyone agrees with the definition of big data. Some researchers see it as the future of data analysis, while others consider it as hype and foresee its demise in the near future. No matter how it is defined, big data for the time being is having its glory moment. The most important…

  11. Data Analysis of Complex Systems

    Science.gov (United States)

    2011-06-01

    [Only fragmentary snippets survive from this abstract: adaptation mechanisms that use the mean as an expected value and express values relative to it; the human retina; motor control; Franklin's use of an artificial neural network in a manufacturing environment for the production of fluorescent …; and operator interventions marked in the data file when the status changes from manual to auto or vice versa.]

  12. Columbia River Component Data Gap Analysis

    Energy Technology Data Exchange (ETDEWEB)

    L. C. Hulstrom

    2007-10-23

    This Data Gap Analysis report documents the results of a study conducted by Washington Closure Hanford (WCH) to compile and review the currently available surface water and sediment data for the Columbia River near and downstream of the Hanford Site. The study was conducted to review the adequacy of the existing surface water and sediment data set from the Columbia River, with specific reference to the use of the data in future site characterization and screening-level risk assessments.

  13. Bayesian Nonparametric Longitudinal Data Analysis.

    Science.gov (United States)

    Quintana, Fernando A; Johnson, Wesley O; Waetjen, Elaine; Gold, Ellen

    2016-01-01

    Practical Bayesian nonparametric methods have been developed across a wide variety of contexts. Here, we develop a novel statistical model that generalizes standard mixed models for longitudinal data that include flexible mean functions as well as combined compound symmetry (CS) and autoregressive (AR) covariance structures. AR structure is often specified through the use of a Gaussian process (GP) with covariance functions that allow longitudinal data to be more correlated if they are observed closer in time than if they are observed farther apart. We allow for AR structure by considering a broader class of models that incorporates a Dirichlet Process Mixture (DPM) over the covariance parameters of the GP. We are able to take advantage of modern Bayesian statistical methods in making full predictive inferences about characteristics of longitudinal profiles and their differences across covariate combinations. We also take advantage of the generality of our model, which provides for estimation of a variety of covariance structures. We observe that models that fail to incorporate CS or AR structure can result in very poor estimation of a covariance or correlation matrix. In our illustration using hormone data observed on women through the menopausal transition, biology dictates the use of a generalized family of sigmoid functions as a model for time trends across subpopulation categories.
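
    A minimal sketch of the combined covariance idea, assuming a compound-symmetry variance plus an exponentially decaying (Ornstein-Uhlenbeck-style) GP term and measurement error; parameter names and values are illustrative, not the paper's:

        # Combined CS + AR-like covariance matrix over observation times.
        import numpy as np

        def cs_ar_cov(times, sigma2_cs=0.5, sigma2_gp=1.0, phi=0.8, sigma2_err=0.1):
            t = np.asarray(times, dtype=float)
            lag = np.abs(t[:, None] - t[None, :])
            cov = sigma2_cs + sigma2_gp * np.exp(-phi * lag)  # CS + exponential GP
            cov += sigma2_err * np.eye(t.size)                # measurement error
            return cov

        print(cs_ar_cov([0.0, 1.0, 2.0, 5.0]).round(3))

    Observations close in time share most of the GP term, while the CS component keeps a floor of within-subject correlation at any lag.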

  14. Methods for Mediation Analysis with Missing Data

    Science.gov (United States)

    Zhang, Zhiyong; Wang, Lijuan

    2013-01-01

    Despite wide applications of both mediation models and missing data techniques, formal discussion of mediation analysis with missing data is still rare. We introduce and compare four approaches to dealing with missing data in mediation analysis, including listwise deletion, pairwise deletion, multiple imputation (MI), and a two-stage maximum…
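
    For concreteness, a minimal sketch of the indirect effect a*b under listwise deletion, the simplest of the four approaches (synthetic data; MI or two-stage maximum likelihood would replace the dropna step):

        # Mediation a*b with listwise deletion (statsmodels formula API).
        import numpy as np
        import pandas as pd
        import statsmodels.formula.api as smf

        rng = np.random.default_rng(3)
        n = 300
        x = rng.normal(size=n)
        m = 0.5 * x + rng.normal(size=n)
        y = 0.4 * m + 0.2 * x + rng.normal(size=n)
        df = pd.DataFrame({"x": x, "m": m, "y": y})
        df.loc[rng.choice(n, 30, replace=False), "m"] = np.nan  # inject missingness

        d = df.dropna()                                      # listwise deletion
        a = smf.ols("m ~ x", data=d).fit().params["x"]       # path X -> M
        b = smf.ols("y ~ m + x", data=d).fit().params["m"]   # path M -> Y given X
        print("indirect effect a*b =", round(a * b, 3))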

  15. Teaching Data Analysis with Interactive Visual Narratives

    Science.gov (United States)

    Saundage, Dilal; Cybulski, Jacob L.; Keller, Susan; Dharmasena, Lasitha

    2016-01-01

    Data analysis is a major part of business analytics (BA), which refers to the skills, methods, and technologies that enable managers to make swift, quality decisions based on large amounts of data. BA has become a major component of Information Systems (IS) courses all over the world. The challenge for IS educators is to teach data analysis--the…

  16. Data near processing support for climate data analysis

    Science.gov (United States)

    Kindermann, Stephan; Ehbrecht, Carsten; Hempelmann, Nils

    2016-04-01

    Climate data repositories grow exponentially in size. Scalable data-near processing capabilities are required to meet future data analysis requirements and to replace current "download the data and process at home" workflows and approaches. On the one hand, these processing capabilities should be accessible via standardized interfaces (e.g. OGC WPS); on the other, a large variety of processing tools, toolboxes and deployment alternatives have to be supported and maintained at the data/processing center. We present a community approach: a modular and flexible system supporting the development, deployment and maintenance of OGC-WPS-based web processing services. This approach is organized in an open-source GitHub project (called "bird-house") supporting individual processing services ("birds", e.g. climate index calculations or model data ensemble calculations) that rely on common infrastructural components (e.g. installation and deployment recipes, management of analysis code dependencies). To support easy deployment at data centers as well as at home institutes (e.g. for testing and development), the system manages the often very complex package dependency chains of climate data analysis packages and supports Docker-based packaging and installation. We present a concrete deployment scenario at the German Climate Computing Center (DKRZ). DKRZ hosts, on the one hand, a multi-petabyte climate archive that is integrated into, e.g., the European ENES and worldwide ESGF data infrastructures, and on the other hand an HPC center supporting (model) data production and data analysis. The deployment scenario also includes OpenStack-based data cloud services to support data import and data distribution for bird-house-based WPS web processing services. Current challenges for inter-institutional deployments of web processing services supporting the European and international climate modeling community, as well as the climate impact community, are highlighted.
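
    Services of this kind are reached through the standard OGC WPS key-value interface; a minimal sketch of a capabilities request (the endpoint URL is a placeholder, not a real bird-house deployment):

        # Query an OGC WPS endpoint for its process list (standard KVP request).
        import requests

        params = {"service": "WPS", "request": "GetCapabilities", "version": "1.0.0"}
        resp = requests.get("https://example.org/wps", params=params, timeout=30)
        resp.raise_for_status()
        print(resp.text[:500])   # XML capabilities document listing the processes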

  17. Data Analysis and Data Mining: Current Issues in Biomedical Informatics

    Science.gov (United States)

    Bellazzi, Riccardo; Diomidous, Marianna; Sarkar, Indra Neil; Takabayashi, Katsuhiko; Ziegler, Andreas; McCray, Alexa T.

    2011-01-01

    Background Medicine and biomedical sciences have become data-intensive fields, which, at the same time, enable the application of data-driven approaches and require sophisticated data analysis and data mining methods. Biomedical informatics provides a proper interdisciplinary context to integrate data and knowledge when processing available information, with the aim of giving effective decision-making support in clinics and translational research. Objectives To reflect on different perspectives related to the role of data analysis and data mining in biomedical informatics. Methods On the occasion of the 50th year of Methods of Information in Medicine, a symposium was organized that reflected on the opportunities, challenges and priorities of organizing, representing and analysing data, information and knowledge in biomedicine and health care. The contributions of experts with a variety of backgrounds in the area of biomedical data analysis have been collected as one outcome of this symposium, in order to provide a broad, though coherent, overview of some of the most interesting aspects of the field. Results The paper presents sections on data accumulation and data-driven approaches in medical informatics, data and knowledge integration, statistical issues for the evaluation of data mining models, translational bioinformatics and bioinformatics aspects of genetic epidemiology. Conclusions Biomedical informatics represents a natural framework to properly and effectively apply data analysis and data mining methods in a decision-making context. In the future, it will be necessary to preserve the inclusive nature of the field and to foster an increasing sharing of data and methods between researchers. PMID:22146916

  18. GEA CRDA Range Data Analysis

    Science.gov (United States)

    1999-07-28

    [Only fragmentary table-of-contents and figure-list entries survive from this report: Example 3, SatMex, Solidaridad 2, May-June 1998; Example 4, PanAmSat, Galaxy IV, May-June 1998; Millstone measurement residuals for Telstar 401 (Days 181-263) and Solidaridad 1 (Days 141-153) with SatMex range data; Hermosillo B, Solidaridad 1 range residuals through Days 135-144 with bias removed; Iztapalapa D.]

  19. DataSHIELD : taking the analysis to the data, not the data to the analysis

    NARCIS (Netherlands)

    Gaye, Amadou; Marcon, Yannick; Isaeva, Julia; LaFlamme, Philippe; Turner, Andrew; Jones, Elinor M.; Minion, Joel; Boyd, Andrew W.; Newby, Christopher J.; Nuotio, Marja-Liisa; Wilson, Rebecca; Butters, Oliver; Murtagh, Barnaby; Demir, Ipek; Doiron, Dany; Giepmans, Lisette; Wallace, Susan E.; Budin-Ljosne, Isabelle; Schmidt, Carsten Oliver; Boffetta, Paolo; Boniol, Mathieu; Bota, Maria; Carter, Kim W.; deKlerk, Nick; Dibben, Chris; Francis, Richard W.; Hiekkalinna, Tero; Hveem, Kristian; Kvaloy, Kirsti; Millar, Sean; Perry, Ivan J.; Peters, Annette; Phillips, Catherine M.; Popham, Frank; Raab, Gillian; Reischl, Eva; Sheehan, Nuala; Waldenberger, Melanie; Perola, Markus; van den Heuvel, Edwin; Macleod, John; Knoppers, Bartha M.; Stolk, Ronald P.; Fortier, Isabel; Harris, Jennifer R.; Woffenbuttel, Bruce H. R.; Murtagh, Madeleine J.; Ferretti, Vincent; Burton, Paul R.

    2014-01-01

    Background: Research in modern biomedicine and social science requires sample sizes so large that they can often only be achieved through a pooled co-analysis of data from several studies. But the pooling of information from individuals in a central database that may be queried by researchers raises

  20. Data analysis using a data base driven graphics animation system

    International Nuclear Information System (INIS)

    Schwieder, D.H.; Stewart, H.D.; Curtis, J.N.

    1985-01-01

    A graphics animation system has been developed at the Idaho National Engineering Laboratory (INEL) to assist engineers in the analysis of large amounts of time series data. Most prior attempts at computer animation of data involve the development of large and expensive problem-specific systems. This paper discusses a generalized interactive computer animation system designed to be used in a wide variety of data analysis applications. By using relational data base storage of graphics and control information, considerable flexibility in design and development of animated displays is achieved

  1. Complex Visual Data Analysis, Uncertainty, and Representation

    National Research Council Canada - National Science Library

    Schunn, Christian D; Saner, Lelyn D; Kirschenbaum, Susan K; Trafton, J. G; Littleton, Eliza B

    2007-01-01

    ... (weather forecasting, submarine target motion analysis, and fMRI data analysis). Internal spatial representations are coded from spontaneous gestures made during cued-recall summaries of problem solving activities...

  2. Mobile networks for biometric data analysis

    CERN Document Server

    Madrid, Natividad; Seepold, Ralf; Orcioni, Simone

    2016-01-01

    This book showcases new and innovative approaches to biometric data capture and analysis, focusing especially on those that are characterized by non-intrusiveness, reliable prediction algorithms, and high user acceptance. It comprises the peer-reviewed papers from the international workshop on the subject that was held in Ancona, Italy, in October 2014 and featured sessions on ICT for health care, biometric data in automotive and home applications, embedded systems for biometric data analysis, biometric data analysis: EMG and ECG, and ICT for gait analysis. The background to the book is the challenge posed by the prevention and treatment of common, widespread chronic diseases in modern, aging societies. Capture of biometric data is a cornerstone for any analysis and treatment strategy. The latest advances in sensor technology allow accurate data measurement in a non-intrusive way, and in many cases it is necessary to provide online monitoring and real-time data capturing to support a patient’s prevention pl...

  3. Global Analysis of Neutrino Data

    CERN Document Server

    González-Garciá, M C

    2005-01-01

    In this talk I review the present status of neutrino masses and mixing and some of their implications for particle physics phenomenology. I first discuss the minimum extension of the Standard Model of particle physics required to accommodate neutrino masses and introduce the new parameters present in the model and in particular the possibility of leptonic mixing. I then describe the phenomenology of neutrino masses and mixing leading to flavour oscillations and present the existing evidence from solar, reactor, atmospheric and long-baseline neutrinos as well as the results from laboratory searches at short distances. I derive the allowed ranges for the mass and mixing parameters when the bulk of data is consistently analyzed in the framework of mixing between the three active neutrinos and obtain as a result the most up-to-date determination of the leptonic mixing matrix. Then I briefly summarize the status of some proposed phenomenological explanations to accommodate the LSND results: the role of sterile neu...
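
    For orientation, the mass and mixing parameters determined in such global fits enter the data through oscillation probabilities; in the two-flavor vacuum approximation (a textbook formula, not specific to this talk) the appearance probability is

        P_{\nu_\alpha \to \nu_\beta}
          = \sin^2(2\theta)\,
            \sin^2\!\left(\frac{1.27\,\Delta m^2\,[\mathrm{eV}^2]\;L\,[\mathrm{km}]}{E\,[\mathrm{GeV}]}\right)

    Solar, reactor, atmospheric and long-baseline experiments probe different L/E ranges and therefore constrain different combinations of \Delta m^2 and \theta.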

  4. Tornado detection data reduction and analysis

    Science.gov (United States)

    Davisson, L. D.

    1977-01-01

    Data processing and analysis was provided in support of tornado detection by analysis of radio frequency interference in various frequency bands. Sea state determination data from short pulse radar measurements were also processed and analyzed. A backscatter simulation was implemented to predict radar performance as a function of wind velocity. Computer programs were developed for the various data processing and analysis goals of the effort.

  5. Collecting operational event data for statistical analysis

    International Nuclear Information System (INIS)

    Atwood, C.L.

    1994-09-01

    This report gives guidance for collecting operational data to be used for statistical analysis, especially analysis of event counts. It discusses how to define the purpose of the study, the unit (system, component, etc.) to be studied, the events to be counted, and the demand or exposure time. Examples are given of classification systems for events in the data sources. A checklist summarizes the essential steps in data collection for statistical analysis.
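
    Event counts paired with an exposure time of this kind typically feed a Poisson rate estimate; a minimal sketch of the standard exact (chi-square based) confidence interval, with illustrative numbers:

        # Exact confidence interval for a Poisson event rate.
        from scipy.stats import chi2

        def poisson_rate_ci(n_events, exposure, conf=0.95):
            a = 1.0 - conf
            lo = chi2.ppf(a / 2, 2 * n_events) / (2 * exposure) if n_events else 0.0
            hi = chi2.ppf(1 - a / 2, 2 * (n_events + 1)) / (2 * exposure)
            return lo, hi

        print(poisson_rate_ci(n_events=7, exposure=1000.0))  # events per unit time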

  6. Common Data Format (CDF) and Coordinated Data Analysis Web (CDAWeb)

    Science.gov (United States)

    Candey, Robert M.

    2010-01-01

    The Coordinated Data Analysis Web (CDAWeb) data browsing system provides plotting, listing and open access via FTP, HTTP, and web services (REST, SOAP, OPeNDAP) for data from most NASA Heliophysics missions and is heavily used by the community. Combining data from many instruments and missions enables broad research analysis and correlation and coordination with other experiments and missions. Crucial to its effectiveness is the use of a standard self-describing data format, in this case the Common Data Format (CDF), also developed at the Space Physics Data Facility, and the use of metadata standards (easily edited with SKTeditor). CDAWeb is based on a set of IDL routines, CDAWlib. The CDF project also maintains software and services for translating between many standard formats (CDF, netCDF, HDF, FITS, XML).
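
    Reading a CDF file locally is straightforward with third-party readers; a minimal sketch using the cdflib package (one of several options; the file and variable names are placeholders):

        # Open a CDF file and read one variable with cdflib.
        import cdflib

        cdf = cdflib.CDF("example_mission_data.cdf")   # hypothetical file
        print(cdf.cdf_info())                          # metadata, incl. variable names
        epoch = cdf.varget("Epoch")                    # one variable as a numpy array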

  7. Data Analysis in Experimental Biomedical Research

    DEFF Research Database (Denmark)

    Markovich, Dmitriy

    This thesis covers two unrelated topics in experimental biomedical research: data analysis in thrombin generation experiments (a collaboration with Novo Nordisk A/S), and analysis of images and physiological signals in the context of neurovascular signalling and blood flow regulation in the brain… to critically assess and compare obtained results. We reverse engineered the data analysis performed by CAT, a de facto standard assay in the field. This revealed a number of possibilities to improve its methods of data analysis. We found that experimental calibration data is described well with textbook…

  8. Dynamic data analysis modeling data with differential equations

    CERN Document Server

    Ramsay, James

    2017-01-01

    This text focuses on the use of smoothing methods for developing and estimating differential equations following recent developments in functional data analysis and building on techniques described in Ramsay and Silverman (2005) Functional Data Analysis. The central concept of a dynamical system as a buffer that translates sudden changes in input into smooth controlled output responses has led to applications of previously analyzed data, opening up entirely new opportunities for dynamical systems. The technical level has been kept low so that those with little or no exposure to differential equations as modeling objects can be brought into this data analysis landscape. There are already many texts on the mathematical properties of ordinary differential equations, or dynamic models, and there is a large literature distributed over many fields on models for real world processes consisting of differential equations. However, a researcher interested in fitting such a model to data, or a statistician interested in...

  9. Data management, archiving, visualization and analysis of space physics data

    Science.gov (United States)

    Russell, C. T.

    1995-01-01

    A series of programs for the visualization and analysis of space physics data has been developed at UCLA. In the course of those developments, a number of lessons have been learned regarding data management and data archiving, as well as data analysis. The issues now facing those wishing to develop such software, as well as the lessons learned, are reviewed. Modern media have eased many of the earlier problems of the physical volume required to store data, the speed of access, and the permanence of the records. However, the ultimate longevity of these media is still a question of debate. Finally, while software development has become easier, cost is still a limiting factor in developing visualization and analysis software.

  10. Post-Flight Data Analysis Tool

    Science.gov (United States)

    George, Marina

    2018-01-01

    A software tool that facilitates the retrieval and analysis of post-flight data. This allows our team and other teams to effectively and efficiently analyze and evaluate post-flight data in order to certify commercial providers.

  11. Integrating Data Transformation in Principal Components Analysis

    KAUST Repository

    Maadooliat, Mehdi; Huang, Jianhua Z.; Hu, Jianhua

    2015-01-01

    Principal component analysis (PCA) is a popular dimension reduction method to reduce the complexity and obtain the informative aspects of high-dimensional datasets. When the data distribution is skewed, data transformation is commonly used prior
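
    A common two-step stand-in for the integrated approach the paper proposes is to transform each skewed column (e.g., Box-Cox) and then run PCA; a minimal sketch on synthetic data:

        # Box-Cox transform per column, then PCA (scipy + scikit-learn).
        import numpy as np
        from scipy.stats import boxcox
        from sklearn.decomposition import PCA

        rng = np.random.default_rng(0)
        X = rng.lognormal(mean=0.0, sigma=1.0, size=(200, 4))  # skewed, positive

        Xt = np.column_stack([boxcox(X[:, j])[0] for j in range(X.shape[1])])
        scores = PCA(n_components=2).fit_transform(Xt)
        print(scores[:3])

    The paper's contribution is to choose the transformation and the components jointly rather than in two separate stages, as done here.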

  12. Nuclear data needs for material analysis

    International Nuclear Information System (INIS)

    Molnar, Gabor L.

    2001-01-01

    Nuclear data for material analysis using neutron-based methods are examined. Besides a critical review of the available data, emphasis is given to emerging application areas and new experimental techniques. Neutron scattering and reaction data, as well as decay data for delayed and prompt gamma activation analysis, are all discussed in detail. Conclusions are drawn concerning the need for new measurement, calculation, evaluation and dissemination activities. (author)

  13. Analysis of biomarker data a practical guide

    CERN Document Server

    Looney, Stephen W

    2015-01-01

    A "how to" guide for applying statistical methods to biomarker data analysis Presenting a solid foundation for the statistical methods that are used to analyze biomarker data, Analysis of Biomarker Data: A Practical Guide features preferred techniques for biomarker validation. The authors provide descriptions of select elementary statistical methods that are traditionally used to analyze biomarker data with a focus on the proper application of each method, including necessary assumptions, software recommendations, and proper interpretation of computer output. In addition, the book discusses

  14. Data analysis of event tape and connection

    International Nuclear Information System (INIS)

    Gong Huili

    1995-01-01

    Data analysis on the VAX-11/780 computer is briefly described; the data come from recorded event tapes of the JUHU data acquisition system on the PDP-11/44 computer. The connection of the recorded event tapes to the XSYS data acquisition system on the VAX computer is also introduced.

  15. The Data Party: Involving Stakeholders in Meaningful Data Analysis

    Science.gov (United States)

    Franz, Nancy K.

    2013-01-01

    A hallmark of Extension includes the involvement of stakeholders in research and program needs assessment, design, implementation, evaluation, and reporting. A data party can be used to enhance this stakeholder involvement specifically in data analysis. This type of event can not only increase client participation in Extension programming and…

  16. Spatiotemporal Data Mining, Analysis, and Visualization of Human Activity Data

    Science.gov (United States)

    Li, Xun

    2012-01-01

    This dissertation addresses the research challenge of developing efficient new methods for discovering useful patterns and knowledge in large volumes of electronically collected spatiotemporal activity data. I propose to analyze three types of such spatiotemporal activity data in a methodological framework that integrates spatial analysis, data…

  17. Classification, (big) data analysis and statistical learning

    CERN Document Server

    Conversano, Claudio; Vichi, Maurizio

    2018-01-01

    This edited book focuses on the latest developments in classification, statistical learning, data analysis and related areas of data science, including statistical analysis of large datasets, big data analytics, time series clustering, integration of data from different sources, as well as social networks. It covers both methodological aspects as well as applications to a wide range of areas such as economics, marketing, education, social sciences, medicine, environmental sciences and the pharmaceutical industry. In addition, it describes the basic features of the software behind the data analysis results, and provides links to the corresponding codes and data sets where necessary. This book is intended for researchers and practitioners who are interested in the latest developments and applications in the field. The peer-reviewed contributions were presented at the 10th Scientific Meeting of the Classification and Data Analysis Group (CLADAG) of the Italian Statistical Society, held in Santa Margherita di Pul...

  18. A program for activation analysis data processing

    International Nuclear Information System (INIS)

    Janczyszyn, J.; Loska, L.; Taczanowski, S.

    1978-01-01

    An ALGOL program for activation analysis data handling is presented. The program may be used either for single-channel spectrometry data or for multichannel spectrometry. The calculation of instrumental error and of the standard deviation of the analysis is carried out. The outliers are tested, and the regression line diagram with the related observations is plotted by the program. (author)

  19. Textbooks for Responsible Data Analysis in Excel

    Science.gov (United States)

    Garrett, Nathan

    2015-01-01

    With 27 million users, Excel (Microsoft Corporation, Seattle, WA) is the most common business data analysis software. However, audits show that almost all complex spreadsheets have errors. The author examined textbooks to understand why responsible data analysis is taught. A purposeful sample of 10 textbooks was coded, and then compared against…

  20. Data analysis techniques for gravitational wave observations

    Indian Academy of Sciences (India)

    Astrophysical sources of gravitational waves fall broadly into three categories: (i) transient and bursts, (ii) periodic or continuous wave and (iii) stochastic. Each type of source requires a different type of data analysis strategy. In this talk various data analysis strategies will be reviewed. Optimal filtering is used for extracting ...

  1. Power analysis of trials with multilevel data

    CERN Document Server

    Moerbeek, Mirjam

    2015-01-01

    Power Analysis of Trials with Multilevel Data covers using power and sample size calculations to design trials that involve nested data structures. The book gives a thorough overview of power analysis that details terminology and notation, outlines key concepts of statistical power and power analysis, and explains why they are necessary in trial design. It guides you in performing power calculations with hierarchical data, which enables more effective trial design.The authors are leading experts in the field who recognize that power analysis has attracted attention from applied statisticians i
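
    The core arithmetic behind such calculations is the design effect, which inflates the sample size needed for independent observations by the within-cluster correlation; a minimal sketch with illustrative numbers (standard notation, not necessarily the book's):

        # Design effect for a two-level (clustered) trial.
        def design_effect(cluster_size, icc):
            return 1.0 + (cluster_size - 1) * icc

        n_flat = 128          # N needed if observations were independent
        m, icc = 20, 0.05     # cluster size and intraclass correlation
        n_needed = n_flat * design_effect(m, icc)
        print(f"design effect = {design_effect(m, icc):.2f}, inflated N = {n_needed:.0f}")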

  2. An Array of Qualitative Data Analysis Tools: A Call for Data Analysis Triangulation

    Science.gov (United States)

    Leech, Nancy L.; Onwuegbuzie, Anthony J.

    2007-01-01

    One of the most important steps in the qualitative research process is analysis of data. The purpose of this article is to provide elements for understanding multiple types of qualitative data analysis techniques available and the importance of utilizing more than one type of analysis, thus utilizing data analysis triangulation, in order to…

  3. Data analysis for the LISA Technology Package

    International Nuclear Information System (INIS)

    Hewitson, M; Danzmann, K; Diepholz, I; GarcIa, A; Armano, M; Fauste, J; Benedetti, M; Bogenstahl, J; Bortoluzzi, D; Bosetti, P; Cristofolini, I; Brandt, N; Cavalleri, A; Ciani, G; Dolesi, R; Ferraioli, L; Cruise, M; Fertin, D; GarcIa, C; Fichter, W

    2009-01-01

    The LISA Technology Package (LTP) on board the LISA Pathfinder mission aims to demonstrate some key concepts for LISA which cannot be tested on ground. The mission consists of a series of preplanned experimental runs. The data analysis for each experiment must be designed in advance of the mission. During the mission, the analysis must be carried out promptly so that the results can be fed forward into subsequent experiments. As such, a robust and flexible data analysis environment needs to be put in place. Since this software is used during mission operations and affects the mission timeline, it must be very robust and tested to a high degree. This paper presents the requirements, design and implementation of the data analysis environment (LTPDA) that will be used for analysing the data from LTP. The use of the analysis software to perform mock data challenges (MDC) is also discussed, and some highlights from the first MDC are presented.

  4. Data analysis for the LISA Technology Package

    Energy Technology Data Exchange (ETDEWEB)

    Hewitson, M; Danzmann, K; Diepholz, I; GarcIa, A [Albert-Einstein-Institut, Max-Planck-Institut fuer Gravitationsphysik und Universitaet Hannover, 30167 Hannover (Germany); Armano, M; Fauste, J [European Space Agency, ESAC, Villanueva de la Canada, 28692 Madrid (Spain); Benedetti, M [Dipartimento di Ingegneria dei Materiali e Tecnologie Industriali, Universita di Trento and INFN, Gruppo Collegato di Trento, Mesiano, Trento (Italy); Bogenstahl, J [Department of Physics and Astronomy, University of Glasgow, Glasgow (United Kingdom); Bortoluzzi, D; Bosetti, P; Cristofolini, I [Dipartimento di Ingegneria Meccanica e Strutturale, Universita di Trento and INFN, Gruppo Collegato di Trento, Mesiano, Trento (Italy); Brandt, N [Astrium GmbH, 88039 Friedrichshafen (Germany); Cavalleri, A; Ciani, G; Dolesi, R; Ferraioli, L [Dipartimento di Fisica, Universita di Trento and INFN, Gruppo Collegato di Trento, 38050 Povo, Trento (Italy); Cruise, M [Department of Physics and Astronomy, University of Birmingham, Birmingham (United Kingdom); Fertin, D; GarcIa, C [European Space Agency, ESTEC, 2200 AG Noordwijk (Netherlands); Fichter, W, E-mail: martin.hewitson@aei.mpg.d [Institut fuer Flugmechanik und Flugregelung, 70569 Stuttgart (Germany)

    2009-05-07

    The LISA Technology Package (LTP) on board the LISA Pathfinder mission aims to demonstrate some key concepts for LISA which cannot be tested on ground. The mission consists of a series of preplanned experimental runs. The data analysis for each experiment must be designed in advance of the mission. During the mission, the analysis must be carried out promptly so that the results can be fed forward into subsequent experiments. As such, a robust and flexible data analysis environment needs to be put in place. Since this software is used during mission operations and affects the mission timeline, it must be very robust and tested to a high degree. This paper presents the requirements, design and implementation of the data analysis environment (LTPDA) that will be used for analysing the data from LTP. The use of the analysis software to perform mock data challenges (MDC) is also discussed, and some highlights from the first MDC are presented.

  5. A practical guide to scientific data analysis

    CERN Document Server

    Livingstone, David J

    2009-01-01

    Inspired by the author's need for practical guidance in the processes of data analysis, A Practical Guide to Scientific Data Analysis has been written as a statistical companion for the working scientist.  This handbook of data analysis with worked examples focuses on the application of mathematical and statistical techniques and the interpretation of their results. Covering the most common statistical methods for examining and exploring relationships in data, the text includes extensive examples from a variety of scientific disciplines. The chapters are organised logically, from pl

  6. Topological data analysis for scientific visualization

    CERN Document Server

    Tierny, Julien

    2017-01-01

    Combining theoretical and practical aspects of topology, this book delivers a comprehensive and self-contained introduction to topological methods for the analysis and visualization of scientific data. Theoretical concepts are presented in a thorough but intuitive manner, with many high-quality color illustrations. Key algorithms for the computation and simplification of topological data representations are described in details, and their application is carefully illustrated in a chapter dedicated to concrete use cases. With its fine balance between theory and practice, "Topological Data Analysis for Scientific Visualization" constitutes an appealing introduction to the increasingly important topic of topological data analysis, for lecturers, students and researchers.

  7. Roadside video data analysis deep learning

    CERN Document Server

    Verma, Brijesh; Stockwell, David

    2017-01-01

    This book highlights the methods and applications for roadside video data analysis, with a particular focus on the use of deep learning to solve roadside video data segmentation and classification problems. It describes system architectures and methodologies that are specifically built upon learning concepts for roadside video data processing, and offers a detailed analysis of the segmentation, feature extraction and classification processes. Lastly, it demonstrates the applications of roadside video data analysis including scene labelling, roadside vegetation classification and vegetation biomass estimation in fire risk assessment.

  8. SOLE: enhanced FIA data analysis capabilities

    Science.gov (United States)

    Michael Spinney; Paul Van Deusen

    2009-01-01

    The Southern On Line Estimator (SOLE) is an Internet-based annual forest inventory and analysis (FIA) data analysis tool developed cooperatively by the National Council for Air and Stream Improvement and the Forest Service, U.S. Department of Agriculture's Forest Inventory and Analysis program at the Southern Research Station. Recent development of SOLE has...

  9. Gaussian process regression analysis for functional data

    CERN Document Server

    Shi, Jian Qing

    2011-01-01

    Gaussian Process Regression Analysis for Functional Data presents nonparametric statistical methods for functional regression analysis, specifically the methods based on a Gaussian process prior in a functional space. The authors focus on problems involving functional response variables and mixed covariates of functional and scalar variables.Covering the basics of Gaussian process regression, the first several chapters discuss functional data analysis, theoretical aspects based on the asymptotic properties of Gaussian process regression models, and new methodological developments for high dime
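
    A minimal sketch of GP regression on noisy functional samples, using scikit-learn's generic implementation rather than the authors' methods (the data are synthetic):

        # Gaussian process regression with an RBF + white-noise kernel.
        import numpy as np
        from sklearn.gaussian_process import GaussianProcessRegressor
        from sklearn.gaussian_process.kernels import RBF, WhiteKernel

        rng = np.random.default_rng(1)
        t = np.linspace(0, 1, 40)[:, None]
        y = np.sin(4 * np.pi * t).ravel() + rng.normal(0, 0.2, t.shape[0])

        gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.1) + WhiteKernel(0.04))
        gp.fit(t, y)
        mean, std = gp.predict(t, return_std=True)  # posterior mean and uncertainty
        print(mean[:3], std[:3])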

  10. Computer-assisted qualitative data analysis software.

    Science.gov (United States)

    Cope, Diane G

    2014-05-01

    Advances in technology have provided new approaches for data collection methods and analysis for researchers. Data collection is no longer limited to paper-and-pencil format, and numerous methods are now available through Internet and electronic resources. With these techniques, researchers are not burdened with entering data manually, and data analysis is facilitated by software programs. Quantitative research is supported by the use of computer software and provides ease in the management of large data sets and rapid analysis of numeric statistical methods. New technologies are emerging to support qualitative research with the availability of computer-assisted qualitative data analysis software (CAQDAS). CAQDAS will be presented with a discussion of advantages, limitations, controversial issues, and recommendations for this type of software use.

  11. DATA ANALYSIS BY SQL-MAPREDUCE PLATFORM

    Directory of Open Access Journals (Sweden)

    A. A. A. Dergachev

    2014-01-01

    The paper deals with problems related to the usage of a relational database management system (RDBMS), mainly in the analysis of large data content, including data analysis based on web services in the Internet. A solution of these problems can be represented as a web-oriented distributed data analysis system with a processor of service requests as its executive kernel. The functions of such a system are similar to those of a relational DBMS, only with the usage of web services. The processor of service requests is responsible for planning the calls to data analysis web services and for their execution. The efficiency of such a web-oriented system depends on the efficiency of the web service call plan and its program implementation, where the basic element is the storage facility for the analyzed data: a relational DBMS. The main attention is given to extending the functionality of a relational DBMS for the analysis of large data content, in particular, a prospective assessment of implementing data analysis web services on the basis of the SQL/MapReduce platform. With a view to obtaining this result, an analytical task was chosen as the application-oriented part, typical for data analysis in various social networks and web portals and based on analysis of users' attendance data. In the practical part of this research, the algorithm for planning web service calls was implemented for the solution of this application-oriented task. The efficiency of the SQL/MapReduce platform is confirmed by experimental results that show the possibility of effective application of data analysis web services.
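
    The attendance analysis maps naturally onto the MapReduce pattern; a pure-Python stand-in (not the SQL/MapReduce platform itself), where map emits (user, 1) per visit and reduce sums per user:

        # MapReduce-style visit counting per user.
        from collections import defaultdict

        visits = [("alice", "/home"), ("bob", "/news"), ("alice", "/news")]

        mapped = ((user, 1) for user, _page in visits)  # map phase
        counts = defaultdict(int)
        for user, one in mapped:                        # shuffle + reduce phase
            counts[user] += one
        print(dict(counts))                             # {'alice': 2, 'bob': 1}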

  12. Analysis of Home Health Sensor Data

    NARCIS (Netherlands)

    Kröse, B.; van Hoof, J.; Demiris, G.; Wouters, E.J.M.

    2014-01-01

    This chapter focuses on the analysis of data that is collected from sensors in the home environment. First we discuss the need for a good model that relates sensor data (or features derived from the data) to indicators of health and well-being. Then we present several methods for model building. We

  13. Integrative Analysis of Omics Big Data.

    Science.gov (United States)

    Yu, Xiang-Tian; Zeng, Tao

    2018-01-01

    The diversity and sheer volume of omics data are taking biology and biomedicine research and application into a big data era, much as happened in human society a decade ago. They open a new challenge, from horizontal data ensembles (e.g., similar types of data collected from different labs or companies) to vertical data ensembles (e.g., different types of data collected for a group of persons with matched information), which requires integrative analysis in biology and biomedicine and also calls for rapid development of data integration to address the great change from the previous population-guided investigations to the new individual-guided ones. Data integration is an effective concept for solving complex problems and understanding complicated systems. Several benchmark studies have revealed the heterogeneity and trade-offs that exist in the analysis of omics data. Integrative analysis can combine and investigate many datasets in a cost-effective, reproducible way. Current integration approaches for biological data have two modes: one is a "bottom-up integration" mode with follow-up manual integration, and the other is a "top-down integration" mode with follow-up in silico integration. This paper will first summarize the combinatory analysis approaches to give a candidate protocol for biological experiment design for effective integrative studies on genomics, and then survey the data fusion approaches to give helpful instruction on computational model development for detecting biological significance; these have also provided new data resources and analysis tools to support precision medicine, which depends on big biomedical data. Finally, problems and future directions are highlighted for integrative analysis of omics big data.

  14. Critical Data Analysis Precedes Soft Computing Of Medical Data

    DEFF Research Database (Denmark)

    Keyserlingk, Diedrich Graf von; Jantzen, Jan; Berks, G.

    2000-01-01

    extracted. The factors had different relationships (loadings) to the symptoms. Although the factors were gained only by computations, they seemed to express some modular features of the language disturbances. This phenomenon, that factors represent superior aspects of data, is well known in factor analysis… the deficits in communication. Sets of symptoms corresponding to the traditional symptoms in Broca and Wernicke aphasia may be represented in the factors, but the factor itself does not represent a syndrome. It is assumed that this kind of data analysis shows a new approach to the understanding of language…

  15. Preparing data for analysis using microsoft Excel.

    Science.gov (United States)

    Elliott, Alan C; Hynan, Linda S; Reisch, Joan S; Smith, Janet P

    2006-09-01

    A critical component essential to good research is the accurate and efficient collection and preparation of data for analysis. Most medical researchers have little or no training in data management, often causing not only excessive time spent cleaning data but also a risk that the data set contains collection or recording errors. The implementation of simple guidelines based on techniques used by professional data management teams will save researchers time and money and result in a data set better suited to answer research questions. Because Microsoft Excel is often used by researchers to collect data, specific techniques that can be implemented in Excel are presented.
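
    A minimal sketch of the kind of hygiene such guidelines recommend, pulling an Excel sheet into pandas and auditing it (the file, sheet and column names are placeholders):

        # Load an Excel sheet and run simple data-quality audits with pandas.
        import pandas as pd

        df = pd.read_excel("study_data.xlsx", sheet_name="raw")
        df.columns = df.columns.str.strip().str.lower()    # normalize headers
        print(df.isna().sum())                             # missing-value audit
        print(f"{df.duplicated().sum()} duplicated rows")  # duplicate-record audit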

  16. Advances in Risk Analysis with Big Data.

    Science.gov (United States)

    Choi, Tsan-Ming; Lambert, James H

    2017-08-01

    With cloud computing, Internet-of-things, wireless sensors, social media, fast storage and retrieval, etc., organizations and enterprises have access to unprecedented amounts and varieties of data. Current risk analysis methodology and applications are experiencing related advances and breakthroughs. For example, highway operations data are readily available, and making use of them reduces risks of traffic crashes and travel delays. Massive data of financial and enterprise systems support decision making under risk by individuals, industries, regulators, etc. In this introductory article, we first discuss the meaning of big data for risk analysis. We then examine recent advances in risk analysis with big data in several topic areas. For each area, we identify and introduce the relevant articles that are featured in the special issue. We conclude with a discussion on future research opportunities. © 2017 Society for Risk Analysis.

  17. Robust statistics and geochemical data analysis

    International Nuclear Information System (INIS)

    Di, Z.

    1987-01-01

    The advantages of robust procedures over ordinary least-squares procedures in geochemical data analysis are demonstrated using NURE data from the Hot Springs Quadrangle, South Dakota, USA. Robust principal components analysis with 5% multivariate trimming successfully guarded the analysis against perturbations by outliers and increased the number of interpretable factors. Regression with SINE estimates significantly increased the goodness-of-fit of the regression and improved the correspondence of delineated anomalies with known uranium prospects. Because of the ubiquitous presence of outliers in geochemical data, robust statistical procedures are suggested as routine replacements for ordinary least-squares procedures.
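
    In the same spirit as the "SINE estimates" mentioned above, a minimal sketch of M-estimation with Andrews' sine norm in statsmodels, on synthetic data with one gross outlier (this is not the NURE analysis itself):

        # Robust regression with Andrews' wave (sine) norm.
        import numpy as np
        import statsmodels.api as sm

        rng = np.random.default_rng(2)
        x = rng.uniform(0, 10, 50)
        y = 2.0 + 0.5 * x + rng.normal(0, 0.3, 50)
        y[0] += 25.0                                   # one gross outlier

        X = sm.add_constant(x)
        fit = sm.RLM(y, X, M=sm.robust.norms.AndrewWave()).fit()
        print(fit.params)                              # intercept/slope resist the outlier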

  18. Analysis of mass spectrometry data in proteomics

    DEFF Research Database (Denmark)

    Matthiesen, Rune; Jensen, Ole N

    2008-01-01

    The systematic study of proteins and protein networks, that is, proteomics, calls for qualitative and quantitative analysis of proteins and peptides. Mass spectrometry (MS) is a key analytical technology in current proteomics, and modern mass spectrometers generate large amounts of high-quality data that in turn allow protein identification, annotation of secondary modifications, and determination of the absolute or relative abundance of individual proteins. Advances in mass spectrometry-driven proteomics rely on robust bioinformatics tools that enable large-scale data analysis. This chapter describes some of the basic concepts and current approaches to the analysis of MS and MS/MS data in proteomics.

  19. Advanced Excel for scientific data analysis

    CERN Document Server

    De Levie, Robert

    2004-01-01

    Excel is by far the most widely distributed data analysis software, but few users are aware of its full powers. Advanced Excel For Scientific Data Analysis takes off from where most books dealing with scientific applications of Excel end. It focuses on three areas (least squares, Fourier transformation, and digital simulation) and illustrates these with extensive examples, often taken from the literature. It also includes and describes a number of sample macros and functions to facilitate common data analysis tasks. These macros and functions are provided in uncompiled, computer-readable, easily

  20. Quality Analysis of Open Street Map Data

    Science.gov (United States)

    Wang, M.; Li, Q.; Hu, Q.; Zhou, M.

    2013-05-01

    Crowd-sourced geographic data are open-source geographic data contributed by many non-professionals and provided to the public. Typical crowd-sourced geographic data include GPS track data like OpenStreetMap, collaborative map data like Wikimapia, social websites like Twitter and Facebook, POIs signed by Jiepang users, and so on. These data provide canonical geographic information for the public after treatment. Compared with conventional geographic data collection and update methods, crowd-sourced geographic data from non-professionals have the characteristics, or advantages, of large data volume, high currency, abundant information and low cost, and they have become a research hotspot of international geographic information science in recent years. Large-volume crowd-sourced geographic data with high currency provide a new solution for geospatial database updating, but the quality problem of data obtained from non-professionals must be solved. In this paper, a quality analysis model for OpenStreetMap crowd-sourced geographic data is proposed. Firstly, a quality analysis framework is designed based on an analysis of the characteristics of OSM data. Secondly, a quality assessment model for OSM data is presented using three quality elements: completeness, thematic accuracy and positional accuracy. Finally, taking the OSM data of Wuhan as an example, the paper analyses and assesses the quality of OSM data with the 2011 version of a navigation map for reference. The results show that the high-level roads and urban traffic network of the OSM data have high positional accuracy and completeness, so these OSM data can be used for updating an urban road network database.
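
    Positional accuracy of the kind assessed here reduces to distances between matched OSM and reference points; a minimal sketch using the haversine distance (coordinates are illustrative, not the Wuhan data):

        # Mean positional error between matched point pairs (haversine, metres).
        from math import radians, sin, cos, asin, sqrt

        def haversine_m(lat1, lon1, lat2, lon2):
            R = 6371000.0  # mean Earth radius in metres
            p1, p2 = radians(lat1), radians(lat2)
            dp, dl = radians(lat2 - lat1), radians(lon2 - lon1)
            a = sin(dp / 2) ** 2 + cos(p1) * cos(p2) * sin(dl / 2) ** 2
            return 2 * R * asin(sqrt(a))

        osm = [(30.5934, 114.3046), (30.5951, 114.3101)]
        ref = [(30.5933, 114.3049), (30.5953, 114.3098)]
        errs = [haversine_m(a, b, c, d) for (a, b), (c, d) in zip(osm, ref)]
        print(sum(errs) / len(errs), "m mean positional error")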

  1. QUAGOL: a guide for qualitative data analysis.

    Science.gov (United States)

    Dierckx de Casterlé, Bernadette; Gastmans, Chris; Bryon, Els; Denier, Yvonne

    2012-03-01

    Data analysis is a complex and contested part of the qualitative research process, which has received limited theoretical attention. Researchers are often in need of useful instructions or guidelines on how to analyze the mass of qualitative data, but face a lack of clear guidance for using particular analytic methods. The aim of this paper is to propose and discuss the Qualitative Analysis Guide of Leuven (QUAGOL), a guide that was developed in order to truly capture the rich insights of qualitative interview data. The article describes six major problems researchers often struggle with during the process of qualitative data analysis. Consequently, the QUAGOL is proposed as a guide to facilitate the process of analysis. Challenges that emerged and lessons learned from our own extensive experience with qualitative data analysis within the Grounded Theory Approach, as well as from those of other researchers (as described in the literature), are discussed, and recommendations are presented. Strengths and pitfalls of the proposed method are discussed in detail. The Qualitative Analysis Guide of Leuven (QUAGOL) offers a comprehensive method to guide the process of qualitative data analysis. The process consists of two parts, each consisting of five stages. The method is systematic but not rigid. It is characterized by iterative processes of digging deeper, constantly moving between the various stages of the process. As such, it aims to stimulate the researcher's intuition and creativity as much as possible. The QUAGOL guide is a theory- and practice-based guide that supports and facilitates the process of analysis of qualitative interview data. Although the method can facilitate the process of analysis, it cannot guarantee automatic quality. The skills of the researcher and the quality of the research team remain the most crucial components of a successful process of analysis. Additionally, the importance of constantly moving between the various stages

  2. Control, data acquisition, data analysis and remote participation in LHD

    International Nuclear Information System (INIS)

    Nagayama, Y.; Emoto, M.; Nakanishi, H.; Sudo, S.; Imazu, S.; Inagaki, S.; Iwata, C.; Kojima, M.; Nonomura, M.; Ohsuna, M.; Tsuda, K.; Yoshida, M.; Chikaraishi, H.; Funaba, H.; Horiuchi, R.; Ishiguro, S.; Ito, Y.; Kubo, S.; Mase, A.; Mito, T.

    2008-01-01

    This paper presents the control, data acquisition, data analysis and remote participation facilities of the Large Helical Device (LHD), which is designed to confine the plasma in steady state. In LHD the plasma duration exceeds 3000 s, achieved by controlling the plasma position, the density and the ICRF heating. The 'LABCOM' data acquisition system takes both short-pulse and steady-state data. A two-layer mass storage system with RAIDs and Blu-ray Disk jukeboxes in a storage area network has been developed to increase storage capacity. The steady-state data can be monitored with a Web browser in real time. A high-level data analysis system with Web interfaces is being developed in order to provide easier usage of LHD data and of large FORTRAN codes on a supercomputer. A virtual laboratory system for the Japanese fusion community has been developed with Multi-protocol Label Switching Virtual Private Network technology. Collaborators at remote sites can join the LHD experiment or use the NIFS supercomputer system as if they were working in the LHD control room.

  3. An Analysis of the Climate Data Initiative's Data Collection

    Science.gov (United States)

    Ramachandran, R.; Bugbee, K.

    2015-12-01

    The Climate Data Initiative (CDI) is a broad multi-agency effort of the U.S. government that seeks to leverage the extensive existing federal climate-relevant data to stimulate innovation and private-sector entrepreneurship to support national climate-change preparedness. The CDI project is a systematic effort to manually curate and share openly available climate data from various federal agencies. To date, the CDI has curated seven themes, or topics, relevant to climate change resiliency. These themes include Coastal Flooding, Food Resilience, Water, Ecosystem Vulnerability, Human Health, Energy Infrastructure, and Transportation. Each theme was curated by subject matter experts who selected datasets relevant to the topic at hand. An analysis of the entire Climate Data Initiative data collection and the data curated for each theme offers insights into which datasets are considered most relevant in addressing climate resiliency. Other aspects of the data collection will be examined including which datasets were the most visited or popular and which datasets were the most sought after for curation by the theme teams. Results from the analysis of the CDI collection will be presented in this talk.

  4. Trip generation and data analysis study.

    Science.gov (United States)

    2015-09-01

    Through the Trip Generation and Data Analysis Study, the District of Columbia Department of : Transportation (DDOT) is undertaking research to better understand multimodal urban trip generation : at mixed-use sites in the District. The study is helpi...

  5. Compass 2011 data analysis and reporting.

    Science.gov (United States)

    2013-05-01

    Past efforts include data analysis and reporting performance and outcomes for signs, pavement, shoulders, roadsides, drainage, traffic, and bridges. In : the 2005 Compass report, measures for bridge inspection and maintenance were added, and historic...

  6. Compass 2012 data analysis and reporting.

    Science.gov (United States)

    2014-05-01

    Past efforts include data analysis and reporting performance and outcomes for signs, pavement, shoulders, roadsides, drainage, traffic, and bridges. In : the 2005 Compass report, measures for bridge inspection and maintenance were added, and historic...

  7. Identifiable Data Files - Medicare Provider Analysis and ...

    Data.gov (United States)

    U.S. Department of Health & Human Services — The Medicare Provider Analysis and Review (MEDPAR) File contains data from claims for services provided to beneficiaries admitted to Medicare certified inpatient...

  8. Statistical methods for categorical data analysis

    CERN Document Server

    Powers, Daniel

    2008-01-01

    This book provides a comprehensive introduction to methods and models for categorical data analysis and their applications in social science research. Companion website also available, at https://webspace.utexas.edu/dpowers/www/

  9. Analysis of mixed data methods & applications

    CERN Document Server

    de Leon, Alexander R

    2013-01-01

    A comprehensive source on mixed data analysis, Analysis of Mixed Data: Methods & Applications summarizes the fundamental developments in the field. Case studies are used extensively throughout the book to illustrate interesting applications from economics, medicine and health, marketing, and genetics. Carefully edited for smooth readability and seamless transitions between chapters. All chapters follow a common structure, with an introduction and a concluding summary, and include illustrative examples from real-life case studies in developmental toxicolog

  10. TREX13 Data Analysis/Modeling

    Science.gov (United States)

    2018-03-29

    From: Dajun Tang, Principal Investigator. Subj: ONR Grant# N00014-14-1-0239 & N00014-16-1-2371, “TREX 13 Data Analysis/Modeling”. Encl: (1) Final ... Performance/Technical Report with SF298. The attached enclosures constitute the final technical report for ONR Grant# N00014-14-1-0239 & N00014-16-1-2371 ... “TREX 13 Data Analysis/Modeling”. cc: Grant & Contract Administrator, APL-UW Office of Sponsor Programs, UW ONR Seattle – Robert Rice and

  11. Planning representation for automated exploratory data analysis

    Science.gov (United States)

    St. Amant, Robert; Cohen, Paul R.

    1994-03-01

    Igor is a knowledge-based system for exploratory statistical analysis of complex systems and environments. Igor has two related goals: to help automate the search for interesting patterns in data sets, and to help develop models that capture significant relationships in the data. We outline a language for Igor, based on techniques of opportunistic planning, which balances control and opportunism. We describe the application of Igor to the analysis of the behavior of Phoenix, an artificial intelligence planning system.

  12. SIMONE: Tool for Data Analysis and Simulation

    International Nuclear Information System (INIS)

    Chudoba, V.; Hnatio, B.; Sharov, P.; Papka, Paul

    2013-06-01

    SIMONE is a software tool based on the ROOT Data Analysis Framework, developed in a collaboration between FLNR JINR and iThemba LABS. It is intended for physicists planning experiments and analysing experimental data. The goal of the SIMONE framework is to provide a flexible, user-friendly, efficient and well-documented system for simulating a wide range of nuclear physics experiments. The most significant conditions and physical processes can be taken into account during simulation of the experiment. Users can create their own experimental setups from predefined detector geometries. Simulated data are made available in the same format as for the real experiment, so that experimental and simulated data can be analysed identically. A significant time reduction is expected during experiment planning and data analysis. (authors)

  13. Analysis of Nigerian Hydrometeorological Data | Dike | Nigerian ...

    African Journals Online (AJOL)

    Missing records were determined by mass curve analysis for rainfall and by regression analysis for runoff, using runoff data at a neighbouring site. Tests of time homogeneity showed that the annual rainfall records at Port Harcourt, Enugu and Lokoja were stationary and random, the annual runoff records of River Niger at ...

  14. Mining survey data for SWOT analysis

    OpenAIRE

    Phadermrod, Boonyarat

    2016-01-01

    Strengths, Weaknesses, Opportunities and Threats (SWOT) analysis is one of the most important tools for strategic planning. The traditional method of conducting SWOT analysis does not prioritize factors and is prone to subjective views that may result in improper strategic actions. Accordingly, this research exploits Importance-Performance Analysis (IPA), a technique for measuring customers’ satisfaction based on survey data, to systematically generate prioritized SWOT factors based on custom...

  15. Integrated analysis of genetic data with R

    Directory of Open Access Journals (Sweden)

    Zhao Jing

    2006-01-01

    Genetic data are now widely available. There is, however, an apparent lack of concerted effort to produce software systems for statistical analysis of genetic data compared with other fields of statistics. It is often a tremendous task for end-users to tailor them for particular data, especially when genetic data are analysed in conjunction with a large number of covariates. Here, R (http://www.r-project.org), a free, flexible and platform-independent environment for statistical modelling and graphics, is explored as an integrated system for genetic data analysis. An overview of some packages currently available for analysis of genetic data is given. This is followed by examples of package development and practical applications. With clear advantages in data management, graphics, statistical analysis, programming, internet capability and use of available codes, it is a feasible, although challenging, task to develop it into an integrated platform for genetic analysis; this will require the joint efforts of many researchers.

  16. RECOG-ORNL, Pattern Recognition Data Analysis

    International Nuclear Information System (INIS)

    Begovich, C.L.; Larson, N.M.

    2000-01-01

    Description of program or function: RECOG-ORNL, a general-purpose pattern recognition code, is a modification of the RECOG program written at Lawrence Livermore National Laboratory. RECOG-ORNL contains techniques for preprocessing, analyzing, and displaying data, and for unsupervised and supervised learning. Data preprocessing routines transform the data into useful representations by autoscaling, selecting important variables, and/or adding products or transformations of the variables of the data set. Data analysis routines use correlations to evaluate the data and interrelationships among the data. Display routines plot the multidimensional patterns in two dimensions or plot histograms, patterns, or one variable versus another. Unsupervised learning techniques search for classes contained inherently in the data. Supervised learning techniques use known information about some of the data to generate predicted properties for an unknown set
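
    As an illustration of the preprocessing step described above, here is a minimal Python sketch of autoscaling and of appending product terms. This is a generic reconstruction of the technique, not RECOG-ORNL's own code; the helper names autoscale and add_products are hypothetical.

        import numpy as np

        def autoscale(X):
            """Autoscale each variable to zero mean and unit variance
            (a standard pattern-recognition preprocessing step)."""
            return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

        def add_products(X, pairs):
            """Append products of selected variable pairs as new features."""
            prods = np.column_stack([X[:, i] * X[:, j] for i, j in pairs])
            return np.hstack([X, prods])

        rng = np.random.default_rng(0)
        X = rng.normal(size=(50, 4))          # toy data set: 50 patterns, 4 variables
        Xs = autoscale(X)
        Xaug = add_products(Xs, [(0, 1), (2, 3)])
        print(Xaug.shape)                     # (50, 6)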

  17. Multivariate Analysis of Industrial Scale Fermentation Data

    DEFF Research Database (Denmark)

    Mears, Lisa; Nørregård, Rasmus; Stocks, Stuart M.

    2015-01-01

    Multivariate analysis allows process understanding to be gained from the vast and complex datasets recorded from fermentation processes; however, the application of such techniques to this field can be limited by the data pre-processing requirements and data handling. In this work many iterations...

  18. Functional data analysis of sleeping energy expenditure

    Science.gov (United States)

    Adequate sleep is crucial during childhood for metabolic health, and physical and cognitive development. Inadequate sleep can disrupt metabolic homeostasis and alter sleeping energy expenditure (SEE). Functional data analysis methods were applied to SEE data to elucidate the population structure of ...

  19. The Statistical Analysis of Failure Time Data

    CERN Document Server

    Kalbfleisch, John D

    2011-01-01

    Contains additional discussion and examples on left truncation as well as material on more general censoring and truncation patterns. Introduces the martingale and counting process formulations in a new chapter. Develops multivariate failure time data in a separate chapter and extends the material on Markov and semi-Markov formulations. Presents new examples and applications of data analysis.

  20. Bayesian networks for omics data analysis

    NARCIS (Netherlands)

    Gavai, A.K.

    2009-01-01

    This thesis focuses on two aspects of high throughput technologies, i.e. data storage and data analysis, in particular in transcriptomics and metabolomics. Both technologies are part of a research field that is generally called ‘omics’ (or ‘-omics’, with a leading hyphen), which refers to genomics,

  1. Secondary Data Analysis in Family Research

    Science.gov (United States)

    Hofferth, Sandra L.

    2005-01-01

    This article first provides an overview of the part that secondary data analysis plays in the field of family studies in the early 21st century. It addresses changes over time in the use of existing omnibus data sets and discusses their advantages and disadvantages. The second part of the article focuses on the elements that make a study a…

  2. A QCD analysis of ZEUS diffractive data

    Energy Technology Data Exchange (ETDEWEB)

    Chekanov, S.; Derrick, M.; Magill, S. [Argonne National Laboratory, Argonne, IL (US)] (and others)

    2009-11-15

    ZEUS inclusive diffractive cross-section measurements have been used in a DGLAP next-to-leading-order QCD analysis to extract the diffractive parton distribution functions. Data on diffractive dijet production in deep inelastic scattering have also been included to constrain the gluon density. Predictions based on the extracted parton densities are compared to diffractive charm and dijet photoproduction data. (orig.)

  3. A QCD analysis of ZEUS diffractive data

    International Nuclear Information System (INIS)

    Chekanov, S.; Derrick, M.; Magill, S.

    2009-11-01

    ZEUS inclusive diffractive cross-section measurements have been used in a DGLAP next-to-leading-order QCD analysis to extract the diffractive parton distribution functions. Data on diffractive dijet production in deep inelastic scattering have also been included to constrain the gluon density. Predictions based on the extracted parton densities are compared to diffractive charm and dijet photoproduction data. (orig.)

  4. Big data analysis for smart farming

    NARCIS (Netherlands)

    Kempenaar, C.; Lokhorst, C.; Bleumer, E.J.B.; Veerkamp, R.F.; Been, Th.; Evert, van F.K.; Boogaardt, M.J.; Ge, L.; Wolfert, J.; Verdouw, C.N.; Bekkum, van Michael; Feldbrugge, L.; Verhoosel, Jack P.C.; Waaij, B.D.; Persie, van M.; Noorbergen, H.

    2016-01-01

    In this report we describe results of a one-year TO2 institutes project on the development of big data technologies within the milk production chain. The goal of this project is to ‘create’ an integration platform for big data analysis for smart farming and to develop a showcase. This includes both

  5. Hierarchical modeling and analysis for spatial data

    CERN Document Server

    Banerjee, Sudipto; Gelfand, Alan E

    2003-01-01

    Among the many uses of hierarchical modeling, their application to the statistical analysis of spatial and spatio-temporal data from areas such as epidemiology and environmental science has proven particularly fruitful. Yet to date, the few books that address the subject have been either too narrowly focused on specific aspects of spatial analysis, or written at a level often inaccessible to those lacking a strong background in mathematical statistics. Hierarchical Modeling and Analysis for Spatial Data is the first accessible, self-contained treatment of hierarchical methods, modeling, and dat

  6. Data archiving and analysis for CWDD

    International Nuclear Information System (INIS)

    Coleman, T.A.; Novick, A.H.; Meystrik, C.C.; Marselle, J.R.

    1992-01-01

    A computer system has been developed to handle archiving and analysis of data acquired during operations of the Continuous Wave Deuterium Demonstrator (CWDD). Data files generated by the CWDD Instrumentation and Control system are transferred across a local area network to the CWDD Archive system where they are enlisted into the archive and stored on removable-media optical disk drives. A relational database management system maintains an on-line database catalog of all archived files. This database contains information about file contents and formats, and holds signal parameter configuration tables needed to extract and interpret data from the files. Software has been developed to assist the selection and retrieval of data on demand based upon references in the catalog. Data retrieved from the archive is transferred to commercial data visualization applications for viewing, plotting and analysis

  7. Discuss on luminescence dose data analysis technology

    International Nuclear Information System (INIS)

    Ma Xinhua; Xiao Wuyun; Ai Xianyun; Shi Zhilan; Liu Ying

    2009-01-01

    This article describes the development of luminescence dose data measurement and processing technology. A general design plan for luminescence dose data measurement and processing is put forward to meet diverse demands. The emphasis in this paper is on the dose data processing method, the luminescence curve analysis method, the use of networks, the mechanisms of communication among computers, and the database management system for individual dose data. The main methods and skills used in this technology, as well as their advantages, are also discussed. It also offers general design references for developing luminescence dose data processing software. (authors)

  8. Emotion Analysis on Social Big Data

    Institute of Scientific and Technical Information of China (English)

    REN Fuji; Kazuyuki Matsumoto

    2017-01-01

    In this paper, we describe a method of emotion analysis on social big data. Social big data means text data that is emerging on Internet social networking services. We collect multilingual web corpora and annotate these corpora with emotion tags for the purpose of emotion analysis. Because these data are constructed by manual annotation, their quality is high but their quantity is low. If we create an emotion analysis model based on this corpus with high quality and use the model for the analysis of social big data, we might be able to statistically analyze the emotional senses and behavior of people in Internet communications, which we could not know before. In this paper, we create an emotion analysis model that integrates the high-quality emotion corpus and the automatically constructed corpus that we created in our past studies, and then analyze a large-scale corpus consisting of Twitter tweets based on the model. From the results of time-series analysis on the large-scale corpus and of model evaluation, we show the effectiveness of our proposed method.
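
    A minimal Python sketch of the general corpus-based approach (not the authors' model): TF-IDF features plus a linear classifier trained on emotion-tagged text. The tiny inline corpus and tag set are invented purely for illustration.

        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.linear_model import LogisticRegression
        from sklearn.pipeline import make_pipeline

        # Hypothetical emotion-annotated corpus (stand-in for the web corpora)
        texts = ["I love this sunny day", "This delay makes me so angry",
                 "What a wonderful surprise", "I am furious about the service"]
        tags = ["joy", "anger", "joy", "anger"]      # manual annotations

        model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                              LogisticRegression())
        model.fit(texts, tags)
        print(model.predict(["so angry about this"]))  # expect 'anger'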

  9. The PUMA test program and data analysis

    International Nuclear Information System (INIS)

    Han, J.T.; Morrison, D.L.

    1997-01-01

    The PUMA test program is sponsored by the U.S. Nuclear Regulatory Commission to provide data that are relevant to various Boiling Water Reactor phenomena. The authors briefly describe the PUMA test program and facility, present the objective of the program, provide data analysis for a large-break loss-of-coolant accident test, and compare the data with a RELAP5/MOD 3.1.2 calculation

  10. Data structures and algorithm analysis in C++

    CERN Document Server

    Shaffer, Clifford A

    2011-01-01

    With its focus on creating efficient data structures and algorithms, this comprehensive text helps readers understand how to select or design the tools that will best solve specific problems. It uses Microsoft C++ as the programming language and is suitable for second-year data structure courses and computer science courses in algorithm analysis. Techniques for representing data are presented within the context of assessing costs and benefits, promoting an understanding of the principles of algorithm analysis and the effects of a chosen physical medium. The text also explores tradeoff issues, f

  11. Data structures and algorithm analysis in Java

    CERN Document Server

    Shaffer, Clifford A

    2011-01-01

    With its focus on creating efficient data structures and algorithms, this comprehensive text helps readers understand how to select or design the tools that will best solve specific problems. It uses Java as the programming language and is suitable for second-year data structure courses and computer science courses in algorithm analysis. Techniques for representing data are presented within the context of assessing costs and benefits, promoting an understanding of the principles of algorithm analysis and the effects of a chosen physical medium. The text also explores tradeoff issues, familiari

  12. Data Analysis of Cybercrimes in Businesses

    Directory of Open Access Journals (Sweden)

    Balan Shilpa

    2017-12-01

    In the current digital age, most people have become very dependent on technology for their daily work tasks. With the rise of technological advancements, cyber-attacks have also increased. Over the past few years, there have been several security breaches. When sensitive data are breached, both organisations and consumers are affected. In the present research, we analyse cyber security risks and their impact on organisations. To perform the analysis, big data technology such as R programming is used. For example, using a big data analysis, it was found that the majority of businesses detected at least one incident involving a local area network (LAN) breach.

  13. Geographical data structures supporting regional analysis

    International Nuclear Information System (INIS)

    Edwards, R.G.; Durfee, R.C.

    1978-01-01

    In recent years the computer has become a valuable aid in solving regional environmental problems. Over a hundred different geographic information systems have been developed to digitize, store, analyze, and display spatially distributed data. One important aspect of these systems is the data structure (e.g. grids, polygons, segments) used to model the environment being studied. This paper presents eight common geographic data structures and their use in studies of coal resources, power plant siting, population distributions, LANDSAT imagery analysis, and land-use analysis

  14. Earth Science Data Analysis in the Era of Big Data

    Science.gov (United States)

    Kuo, K.-S.; Clune, T. L.; Ramachandran, R.

    2014-01-01

    Anyone with even a cursory interest in information technology cannot help but recognize that "Big Data" is one of the most fashionable catchphrases of late. From accurate voice and facial recognition, language translation, and airfare prediction and comparison, to monitoring the real-time spread of flu, Big Data techniques have been applied to many seemingly intractable problems with spectacular successes. They appear to be a rewarding way to approach many currently unsolved problems. Few fields of research can claim a longer history with problems involving voluminous data than Earth science. The problems we are facing today with our Earth's future are more complex and carry potentially graver consequences than the examples given above. How has our climate changed? Besides natural variations, what is causing these changes? What are the processes involved and through what mechanisms are these connected? How will they impact life as we know it? In attempts to answer these questions, we have resorted to observations and numerical simulations with ever-finer resolutions, which continue to feed the "data deluge." Plausibly, many Earth scientists are wondering: How will Big Data technologies benefit Earth science research? As an example from the global water cycle, one subdomain among many in Earth science, how would these technologies accelerate the analysis of decades of global precipitation to ascertain the changes in its characteristics, to validate these changes in predictive climate models, and to infer the implications of these changes to ecosystems, economies, and public health? Earth science researchers need a viable way to harness the power of Big Data technologies to analyze large volumes and varieties of data with velocity and veracity. Beyond providing speedy data analysis capabilities, Big Data technologies can also play a crucial, albeit indirect, role in boosting scientific productivity by facilitating effective collaboration within an analysis environment

  15. Probabilistic Principal Component Analysis for Metabolomic Data.

    LENUS (Irish Health Repository)

    Nyamundanda, Gift

    2010-11-23

    Background: Data from metabolomic studies are typically complex and high-dimensional. Principal component analysis (PCA) is currently the most widely used statistical technique for analyzing metabolomic data. However, PCA is limited by the fact that it is not based on a statistical model. Results: Here, probabilistic principal component analysis (PPCA), which addresses some of the limitations of PCA, is reviewed and extended. A novel extension of PPCA, called probabilistic principal component and covariates analysis (PPCCA), is introduced which provides a flexible approach to jointly model metabolomic data and additional covariate information. The use of a mixture of PPCA models for discovering the number of inherent groups in metabolomic data is demonstrated. The jackknife technique is employed to construct confidence intervals for estimated model parameters throughout. The optimal number of principal components is determined through the use of the Bayesian Information Criterion model selection tool, which is modified to address the high dimensionality of the data. Conclusions: The methods presented are illustrated through an application to metabolomic data sets. Jointly modeling metabolomic data and covariates was successfully achieved and has the potential to provide deeper insight to the underlying data structure. Examination of confidence intervals for the model parameters, such as loadings, allows for principled and clear interpretation of the underlying data structure. A software package called MetabolAnalyze, freely available through the R statistical software, has been developed to facilitate implementation of the presented methods in the metabolomics field.
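
    A minimal Python sketch of model-based selection of the number of components under the probabilistic-PCA likelihood (a stand-in for the paper's modified BIC procedure, not the MetabolAnalyze R package itself): scikit-learn's PCA.score returns the average Tipping-Bishop log-likelihood per sample, so a plain BIC can be computed directly.

        import numpy as np
        from sklearn.decomposition import PCA

        rng = np.random.default_rng(1)
        X = rng.normal(size=(60, 10)) @ rng.normal(size=(10, 10))  # correlated toy data
        n, p = X.shape

        for q in range(1, 6):
            pca = PCA(n_components=q).fit(X)
            ll = pca.score(X) * n                        # total PPCA log-likelihood
            k = p * q - q * (q - 1) / 2 + p + 1          # loadings + mean + noise variance
            bic = -2 * ll + k * np.log(n)                # smaller is better
            print(q, round(bic, 1))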

  16. The ASDEX integrated data analysis system AIDA

    International Nuclear Information System (INIS)

    Grassie, K.; Gruber, O.; Kardaun, O.; Kaufmann, M.; Lackner, K.; Martin, P.; Mast, K.F.; McCarthy, P.J.; Mertens, V.; Pohl, D.; Rang, U.; Wunderlich, R.

    1989-11-01

    For about two years, the ASDEX integrated data analysis system (AIDA), which combines the database (DABA) and the statistical analysis system (SAS), has been successfully in operation. Besides a considerable, but meaningful, reduction of the 'raw' shot data, it offers the advantage of carefully selected and precisely defined datasets, which are easily accessible for informative tabular data overviews (DABA) and multi-shot analysis (SAS). Even rather complicated statistical analyses can be performed efficiently within this system. In this report, we summarise AIDA's main features and give some details on its set-up and on the physical models which have been used for the derivation of the processed data. We also give a short introduction on how to use DABA and SAS. (orig.)

  17. Statistical analysis of network data with R

    CERN Document Server

    Kolaczyk, Eric D

    2014-01-01

    Networks have permeated everyday life through realities like the Internet, social networks, and viral marketing. As such, network analysis is an important growth area in the quantitative sciences, with roots in social network analysis going back to the 1930s and graph theory going back centuries. Measurement and analysis are integral components of network research. As a result, statistical methods play a critical role in network analysis. This book is the first of its kind in network research. It can be used as a stand-alone resource in which multiple R packages are used to illustrate how to conduct a wide range of network analyses, from basic manipulation and visualization, to summary and characterization, to modeling of network data. The central package is igraph, which provides extensive capabilities for studying network graphs in R. This text builds on Eric D. Kolaczyk’s book Statistical Analysis of Network Data (Springer, 2009).

  18. Pengembangan Aplikasi Antarmuka Layanan Big Data Analysis

    Directory of Open Access Journals (Sweden)

    Gede Karya

    2017-11-01

    In the 2016 Higher Education Competitive Grant (Hibah Bersaing Dikti), we successfully developed models, infrastructure and application modules for Hadoop-based big data analysis. We also developed a virtual private network (VPN) that allows integration with, and access to, this infrastructure from outside the FTIS Computer Laboratory. The infrastructure and analysis modules are now to be offered as services to small and medium enterprises (SMEs) in Indonesia. This research aims to develop a big data analysis service interface application integrated with the Hadoop cluster. The research began with finding appropriate methods and techniques for scheduling jobs, invoking ready-made Java Map-Reduce (MR) application modules, tunneling input/output, and constructing the meta-data of service requests and service outputs. These methods and techniques were then developed into a web-based service application, as well as an executable module that runs in a Java/J2EE-based programming environment and can access the Hadoop cluster in the FTIS Computer Lab. The resulting application can be accessed by the public through the site http://bigdata.unpar.ac.id. Based on the test results, the application functions well in accordance with the specifications and can be used to perform big data analysis. Keywords: web based service, big data analysis, Hadoop, J2EE

  19. Substituting missing data in compositional analysis

    Energy Technology Data Exchange (ETDEWEB)

    Real, Carlos, E-mail: carlos.real@usc.es [Area de Ecologia, Departamento de Biologia Celular y Ecologia, Escuela Politecnica Superior, Universidad de Santiago de Compostela, 27002 Lugo (Spain); Angel Fernandez, J.; Aboal, Jesus R.; Carballeira, Alejo [Area de Ecologia, Departamento de Biologia Celular y Ecologia, Facultad de Biologia, Universidad de Santiago de Compostela, 15782 Santiago de Compostela (Spain)

    2011-10-15

    Multivariate analysis of environmental data sets requires the absence of missing values or their substitution by small values. However, if the data is transformed logarithmically prior to the analysis, this solution cannot be applied because the logarithm of a small value might become an outlier. Several methods for substituting the missing values can be found in the literature although none of them guarantees that no distortion of the structure of the data set is produced. We propose a method for the assessment of these distortions which can be used for deciding whether to retain or not the samples or variables containing missing values and for the investigation of the performance of different substitution techniques. The method analyzes the structure of the distances among samples using Mantel tests. We present an application of the method to PCDD/F data measured in samples of terrestrial moss as part of a biomonitoring study. - Highlights: > Missing values in multivariate data sets must be substituted prior to analysis. > The substituted values can modify the structure of the data set. > We developed a method to estimate the magnitude of the alterations. > The method is simple and based on the Mantel test. > The method allowed the identification of problematic variables in a sample data set. - A method is presented for the assessment of the possible distortions in multivariate analysis caused by the substitution of missing values.
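
    A minimal Python sketch of the Mantel-test idea used above for judging substitution-induced distortion (assumptions: Euclidean distances on log-transformed data, a simple permutation scheme, synthetic data; this is not the authors' code).

        import numpy as np
        from scipy.spatial.distance import pdist, squareform

        def mantel(X1, X2, n_perm=999, seed=0):
            """Permutation Mantel test: correlation between two distance matrices."""
            rng = np.random.default_rng(seed)
            d1, d2 = pdist(X1), pdist(X2)
            r_obs = np.corrcoef(d1, d2)[0, 1]
            D2 = squareform(d2)
            n, hits = X1.shape[0], 0
            for _ in range(n_perm):
                idx = rng.permutation(n)
                r = np.corrcoef(d1, squareform(D2[np.ix_(idx, idx)]))[0, 1]
                hits += r >= r_obs
            return r_obs, (hits + 1) / (n_perm + 1)

        X = np.random.default_rng(2).lognormal(size=(20, 8))   # toy concentration data
        X_sub = X.copy()
        X_sub[X_sub < 0.5] = 0.25          # substitute "missing" small values by a constant
        r, p = mantel(np.log(X), np.log(X_sub))
        print(f"Mantel r = {r:.3f}, p = {p:.3f}")

    A high correlation between the two distance matrices would suggest the substitution left the structure of the data set essentially intact.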

  20. Substituting missing data in compositional analysis

    International Nuclear Information System (INIS)

    Real, Carlos; Angel Fernandez, J.; Aboal, Jesus R.; Carballeira, Alejo

    2011-01-01

    Multivariate analysis of environmental data sets requires the absence of missing values or their substitution by small values. However, if the data is transformed logarithmically prior to the analysis, this solution cannot be applied because the logarithm of a small value might become an outlier. Several methods for substituting the missing values can be found in the literature although none of them guarantees that no distortion of the structure of the data set is produced. We propose a method for the assessment of these distortions which can be used for deciding whether to retain or not the samples or variables containing missing values and for the investigation of the performance of different substitution techniques. The method analyzes the structure of the distances among samples using Mantel tests. We present an application of the method to PCDD/F data measured in samples of terrestrial moss as part of a biomonitoring study. - Highlights: → Missing values in multivariate data sets must be substituted prior to analysis. → The substituted values can modify the structure of the data set. → We developed a method to estimate the magnitude of the alterations. → The method is simple and based on the Mantel test. → The method allowed the identification of problematic variables in a sample data set. - A method is presented for the assessment of the possible distortions in multivariate analysis caused by the substitution of missing values.

  1. Adaptive Analysis of Functional MRI Data

    International Nuclear Information System (INIS)

    Friman, Ola

    2003-01-01

    Functional Magnetic Resonance Imaging (fMRI) is a recently developed neuro-imaging technique with capacity to map neural activity with high spatial precision. To locate active brain areas, the method utilizes local blood oxygenation changes which are reflected as small intensity changes in a special type of MR images. The ability to non-invasively map brain functions provides new opportunities to unravel the mysteries and advance the understanding of the human brain, as well as to perform pre-surgical examinations in order to optimize surgical interventions. This dissertation introduces new approaches for the analysis of fMRI data. The detection of active brain areas is a challenging problem due to high noise levels and artifacts present in the data. A fundamental tool in the developed methods is Canonical Correlation Analysis (CCA). CCA is used in two novel ways. First as a method with the ability to fully exploit the spatio-temporal nature of fMRI data for detecting active brain areas. Established analysis approaches mainly focus on the temporal dimension of the data and they are for this reason commonly referred to as being mass-univariate. The new CCA detection method encompasses and generalizes the traditional mass-univariate methods and can in this terminology be viewed as a mass-multivariate approach. The concept of spatial basis functions is introduced as a spatial counterpart of the temporal basis functions already in use in fMRI analysis. The spatial basis functions implicitly perform an adaptive spatial filtering of the fMRI images, which significantly improves detection performance. It is also shown how prior information can be incorporated into the analysis by imposing constraints on the temporal and spatial models and a constrained version of CCA is devised to this end. A general Principal Component Analysis technique for generating and constraining temporal and spatial subspace models is proposed to be used in combination with the constrained CCA
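
    A minimal Python sketch of the core idea: detect activation by the canonical correlation between a small spatial neighbourhood of voxel time series and a set of temporal basis functions. The data are synthetic, and the dissertation's constrained CCA and spatial basis functions are not reproduced here.

        import numpy as np
        from sklearn.cross_decomposition import CCA

        rng = np.random.default_rng(8)
        T = 200
        t = np.arange(T)
        design = np.column_stack([np.sin(2 * np.pi * t / 40),
                                  np.cos(2 * np.pi * t / 40)])   # temporal basis functions

        # 3x3 neighbourhood of voxel time series; activation added to the centre voxel
        Y = rng.normal(0, 1, (T, 9))
        Y[:, 4] += 1.5 * design[:, 0]

        cca = CCA(n_components=1).fit(design, Y)
        u, v = cca.transform(design, Y)
        r = np.corrcoef(u[:, 0], v[:, 0])[0, 1]    # first canonical correlation
        print(f"canonical correlation: {r:.2f}")   # large r suggests activation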

  2. A Web Services Data Analysis Grid

    Energy Technology Data Exchange (ETDEWEB)

    William A Watson III; Ian Bird; Jie Chen; Bryan Hess; Andy Kowalski; Ying Chen

    2002-07-01

    The trend in large-scale scientific data analysis is to exploit compute, storage and other resources located at multiple sites, and to make those resources accessible to the scientist as if they were a single, coherent system. Web technologies driven by the huge and rapidly growing electronic commerce industry provide valuable components to speed the deployment of such sophisticated systems. Jefferson Lab, where several hundred terabytes of experimental data are acquired each year, is in the process of developing a web-based distributed system for data analysis and management. The essential aspects of this system are a distributed data grid (site independent access to experiment, simulation and model data) and a distributed batch system, augmented with various supervisory and management capabilities, and integrated using Java and XML-based web services.

  3. Statistical methods for astronomical data analysis

    CERN Document Server

    Chattopadhyay, Asis Kumar

    2014-01-01

    This book introduces “Astrostatistics” as a subject in its own right with rewarding examples, including work by the authors with galaxy and Gamma Ray Burst data to engage the reader. This includes a comprehensive blending of Astrophysics and Statistics. The first chapter’s coverage of preliminary concepts and terminologies for astronomical phenomenon will appeal to both Statistics and Astrophysics readers as helpful context. Statistics concepts covered in the book provide a methodological framework. A unique feature is the inclusion of different possible sources of astronomical data, as well as software packages for converting the raw data into appropriate forms for data analysis. Readers can then use the appropriate statistical packages for their particular data analysis needs. The ideas of statistical inference discussed in the book help readers determine how to apply statistical tests. The authors cover different applications of statistical techniques already developed or specifically introduced for ...

  4. Analysis of high-fold gamma data

    International Nuclear Information System (INIS)

    Radford, D. C.; Cromaz, M.; Beyer, C. J.

    1999-01-01

    Historically, γ-γ and γ-γ-γ coincidence spectra were utilized to build nuclear level schemes. With the development of large detector arrays, it has become possible to analyze higher-fold coincidence data sets. This paper briefly reports on software to analyze 4-fold coincidence data sets, which allows creation of 4-fold histograms (hypercubes) of at least 1024 channels per side (corresponding to a 43 gigachannel data space) that will fit onto a few gigabytes of disk space, and extraction of triple-gated spectra in a few seconds. Future detector arrays may have even much higher efficiencies, and detect as many as 15 or 20 γ rays simultaneously; such data will require very different algorithms for storage and analysis. Difficulties inherent in the analysis of such data are discussed, and two possible new solutions are presented, namely adaptive list-mode systems and 'list-list-mode' storage
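
    The quoted "43 gigachannel" figure is consistent with storing only the symmetry-unique channels of a 1024-channel-per-side 4-fold hypercube. Assuming fold-ordered (symmetric) storage, the count of unique channels is

        \binom{N+3}{4} = \frac{N(N+1)(N+2)(N+3)}{24},
        \qquad N = 1024:\quad
        \frac{1024 \cdot 1025 \cdot 1026 \cdot 1027}{24}
        \approx 4.6 \times 10^{10} \approx 43 \times 2^{30}\ \text{channels},

    i.e. roughly 43 (binary) gigachannels rather than the full 10^12-channel space of an unsymmetrized cube.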

  5. A Web Services Data Analysis Grid

    International Nuclear Information System (INIS)

    William A Watson III; Ian Bird; Jie Chen; Bryan Hess; Andy Kowalski; Ying Chen

    2002-01-01

    The trend in large-scale scientific data analysis is to exploit compute, storage and other resources located at multiple sites, and to make those resources accessible to the scientist as if they were a single, coherent system. Web technologies driven by the huge and rapidly growing electronic commerce industry provide valuable components to speed the deployment of such sophisticated systems. Jefferson Lab, where several hundred terabytes of experimental data are acquired each year, is in the process of developing a web-based distributed system for data analysis and management. The essential aspects of this system are a distributed data grid (site independent access to experiment, simulation and model data) and a distributed batch system, augmented with various supervisory and management capabilities, and integrated using Java and XML-based web services

  6. Analysis of the real EADGENE data set:

    DEFF Research Database (Denmark)

    Jaffrézic, Florence; de Koning, Dirk-Jan; Boettcher, Paul J

    2007-01-01

    A large variety of methods has been proposed in the literature for microarray data analysis. The aim of this paper was to present techniques used by the EADGENE (European Animal Disease Genomics Network of Excellence) WP1.4 participants for data quality control, normalisation and statistical ... methods for the detection of differentially expressed genes in order to provide some more general data analysis guidelines. All the workshop participants were given a real data set obtained in an EADGENE funded microarray study looking at the gene expression changes following artificial infection with two ... quarters. Very little transcriptional variation was observed for the bacteria S. aureus. Lists of differentially expressed genes found by the different research teams were, however, quite dependent on the method used, especially concerning the data quality control step. These analyses also emphasised...

  7. Geotechnical field data and analysis report

    International Nuclear Information System (INIS)

    1990-09-01

    The Geotechnical Field Data and Analysis Report documents the geomechanical data collected at the Waste Isolation Pilot Plant up to June 30, 1989, and describes the conditions of underground openings from July 1, 1988 to June 30, 1989. The data are required to understand performance during operations and do not include data from tests performed to support performance assessment. In summary, the underground openings have performed in a satisfactory manner during the reporting period. This analysis is based primarily on the evaluation of instrumentation data, in particular the comparison of measured convergence with predictions, and on observations of exposed rock surfaces. The main concerns during this period have been the deterioration found in Site Preliminary Design Validation Test Rooms 1 and 2 and some spalling found in Panel 1. 14 refs., 45 figs., 11 tabs

  8. Using influence diagrams for data worth analysis

    International Nuclear Information System (INIS)

    Sharif Heger, A.; White, Janis E.

    1997-01-01

    Decision-making under uncertainty describes most environmental remediation and waste management problems. Inherent limitations in knowledge concerning contaminants, environmental fate and transport, remedies, and risks force decision-makers to select a course of action based on uncertain and incomplete information. Because uncertainties can be reduced by collecting additional data, uncertainty and sensitivity analysis techniques have received considerable attention. When costs associated with reducing uncertainty are considered in a decision problem, the objective changes; rather than determine what data to collect to reduce overall uncertainty, the goal is to determine what data to collect to best differentiate between possible courses of action or decision alternatives. Environmental restoration and waste management requires cost-effective methods for characterization and monitoring, and these methods must also satisfy regulatory requirements. Characterization and monitoring activities imply that, sooner or later, a decision must be made about collecting new field data. Limited fiscal resources for data collection should be committed only to those data that have the most impact on the decision at the lowest possible cost. Applying influence diagrams in combination with data worth analysis produces a method which not only satisfies these requirements but also gives rise to an intuitive representation of complex structures not possible in the more traditional decision tree representation. This paper demonstrates the use of influence diagrams in data worth analysis by applying them to a monitor-and-treat problem often encountered in environmental decision problems
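
    A toy Python illustration of the data-worth idea (all probabilities and costs are invented): the expected value of perfect information (EVPI) bounds what new field data can be worth for a monitor-or-treat decision.

        # State of nature: site is contaminated with probability 0.3
        p_contaminated = 0.3
        costs = {                              # total cost of each action in each state
            "treat":   {"contaminated": 100, "clean": 100},
            "monitor": {"contaminated": 250, "clean": 20},
        }

        def expected_cost(action, p):
            return p * costs[action]["contaminated"] + (1 - p) * costs[action]["clean"]

        best_now = min(expected_cost(a, p_contaminated) for a in costs)
        # With perfect information we pick the cheapest action in each state:
        cost_perfect = (p_contaminated * min(c["contaminated"] for c in costs.values())
                        + (1 - p_contaminated) * min(c["clean"] for c in costs.values()))
        evpi = best_now - cost_perfect
        print(f"expected cost now: {best_now}, with perfect info: {cost_perfect}, EVPI: {evpi}")

    With these invented numbers the decision-maker should rationally pay at most 45 cost units for any data-collection campaign, however informative.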

  9. Data Analysis with Open Source Tools

    CERN Document Server

    Janert, Philipp

    2010-01-01

    Collecting data is relatively easy, but turning raw information into something useful requires that you know how to extract precisely what you need. With this insightful book, intermediate to experienced programmers interested in data analysis will learn techniques for working with data in a business environment. You'll learn how to look at data to discover what it contains, how to capture those ideas in conceptual models, and then feed your understanding back into the organization through business plans, metrics dashboards, and other applications. Along the way, you'll experiment with conce

  10. Synthesizing Earth's geochemical data for hydrogeochemical analysis

    Science.gov (United States)

    Brantley, S. L.; Kubicki, J.; Miller, D.; Richter, D.; Giles, L.; Mitra, P.

    2007-12-01

    For over 200 years, geochemical, microbiological, and chemical data have been collected to describe the evolution of the surface Earth. Many of these measurements are data showing variations in time or in space. Forward prediction of hydrologic response to changing tectonic, climatic, or anthropogenic forcings requires synthesis of these data and their use in hydrogeochemical models. Increasingly, scientists are attempting to synthesize such data in order to make predictions for new regions or for future time periods. However, making such complex geochemical data accessible requires development of sophisticated cyberinfrastructures that invite both uploading and usage of data. Two such cyberinfrastructure (CI) initiatives are currently under development: one to invite and promote the use of environmental kinetics data (laboratory time-course data) through ChemxSeer, and the other to invite and promote the use of spatially indexed geochemical data for the Earth's Critical Zone through CZEN.org. The vision of these CI initiatives is to provide cyber-enhanced portals that encourage domain scientists to upload their data before publication (in private cyberspace), and to make these data eventually publicly accessible (after an embargo period). If the CI can be made to provide services to the domain specialist - e.g. to provide data analysis services or data comparison services - we envision that scientists will upload data. In addition, the CI can promote the use and comparison of datasets across disciplines. For example, the CI can facilitate the use of spatially indexed geochemical data by scientists more accustomed to dealing with time-course data for hydrologic flow, and can provide user-friendly interfaces with CI established to facilitate the use of hydrologic data. Examples of the usage of synthesized data to predict soil development over the last 13 ky and its effects on active hydrological flow boundaries in surficial systems will be discussed for i) a N

  11. Detailed Analysis of ECMWF Surface Pressure Data

    Science.gov (United States)

    Fagiolini, E.; Schmidt, T.; Schwarz, G.; Zenner, L.

    2012-04-01

    Investigations of temporal variations within the gravity field of the Earth led us to the analysis of common surface pressure data products delivered by ECMWF. We looked into the characteristics of global as well as spatially and temporally confined phenomena visible in the data. In particular, we were interested in the overall data quality, the local and temporal signal-to-noise ratio of surface pressure data sets, and the identification of irregular data. To this end, we analyzed a time series of a full year of surface pressure operational analysis data and their nominal standard deviations. The use of pressure data on a Gaussian grid allowed us to remain close to the internal computations at ECMWF during data assimilation. Thus, we circumvented potential interpolation effects that would otherwise occur in cylindrical projections of conventional map products. Our results demonstrate the identification of a few distinct outliers, data quality effects over land or water and along coastlines, as well as neighborhood effects of samples within and outside of the tropics. Small-scale neighborhood effects depend on geographical direction, sampling distance, land or water, and local time. In addition, one notices large-scale seasonal effects that are latitude and longitude dependent. As a consequence, we obtain a cause-and-effect survey of pressure data peculiarities. One can then use background-corrected pressure data to analyze seasonal effects within given latitude belts. Here, time series of pressure data allow the tracking of high and low pressure areas together with the identification of their actual extent, velocity and lifetime. This information is vital to overall mass transport calculations and the determination of temporally varying gravity fields. However, one has to note that the satellite and ground-based instruments and the assimilation software being used for the pressure calculations will not remain the same over the years

  12. Licensing Support System: Preliminary data scope analysis

    International Nuclear Information System (INIS)

    1989-01-01

    The purpose of this analysis is to determine the content and scope of the Licensing Support System (LSS) data base. Both user needs and currently available data bases that, at least in part, address those needs have been analyzed. This analysis, together with the Preliminary Needs Analysis (DOE, 1988d), is a first effort under the LSS Design and Implementation Contract toward developing a sound requirements foundation for subsequent design work. These reports are preliminary. Further refinements must be made before requirements can be specified in sufficient detail to provide a basis for suitably specific system specifications. This document provides a baseline for what is known at this time. Additional analyses, currently being conducted, will provide more precise information on the content and scope of the LSS data base. 23 refs., 4 figs., 8 tabs

  13. A Statistical Toolkit for Data Analysis

    International Nuclear Information System (INIS)

    Donadio, S.; Guatelli, S.; Mascialino, B.; Pfeiffer, A.; Pia, M.G.; Ribon, A.; Viarengo, P.

    2006-01-01

    The present project aims to develop an open-source and object-oriented software Toolkit for statistical data analysis. Its statistical testing component contains a variety of Goodness-of-Fit tests, from Chi-squared to Kolmogorov-Smirnov, to lesser-known but generally much more powerful tests such as Anderson-Darling, Goodman, Fisz-Cramer-von Mises, Kuiper, Tiku. Thanks to the component-based design and the usage of the standard abstract interfaces for data analysis, this tool can be used by other data analysis systems or integrated in experimental software frameworks. This Toolkit has been released and is downloadable from the web. In this paper we describe the statistical details of the algorithms, the computational features of the Toolkit, and the code validation
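
    For reference, the same family of goodness-of-fit tests can be exercised from Python via scipy.stats; this sketch is not the Toolkit's own API, which is a separate object-oriented package. Note that cramervonmises requires SciPy 1.6 or later.

        import numpy as np
        from scipy import stats

        rng = np.random.default_rng(7)
        sample = rng.normal(loc=0.0, scale=1.0, size=200)

        print(stats.kstest(sample, "norm"))           # Kolmogorov-Smirnov
        print(stats.anderson(sample, dist="norm"))    # Anderson-Darling
        print(stats.cramervonmises(sample, "norm"))   # Cramer-von Mises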

  14. A data skimming service for locally resident analysis data

    International Nuclear Information System (INIS)

    Cranshaw, J; Gieraltowski, J; Malon, D; May, E; Gardner, R W; Mambelli, M

    2008-01-01

    A Data Skimming Service (DSS) is a site-level service for rapid event filtering and selection from locally resident datasets based on metadata queries to associated 'tag' databases. In US ATLAS, we expect most if not all of the AOD-based datasets to be replicated to each of the five Tier 2 regional facilities in the US Tier 1 'cloud' coordinated by Brookhaven National Laboratory. Entire datasets will consist of on the order of several terabytes of data, and providing easy, quick access to skimmed subsets of these data will be vital to physics working groups. Typically, physicists will be interested in portions of the complete datasets, selected according to event-level attributes (number of jets, missing Et, etc.) and content (specific analysis objects for subsequent processing). In this paper we describe methods used to classify data (metadata tag generation) and to store these results in a local database. Next we discuss a general framework which includes methods for accessing this information, defining skims, specifying event output content, accessing locally available storage through a variety of interfaces (SRM, dCache/dccp, gridftp), accessing remote storage elements as specified, and user job submission tools through local or grid schedulers. The advantages of the DSS are the ability to quickly 'browse' datasets and design skims, for example, pre-adjusting cuts to get to a desired skim level with minimal use of compute resources, and to encode these analysis operations in a database for re-analysis and archival purposes. Additionally the framework has provisions to operate autonomously in the event that external, central resources are not available, and to provide, as a reduced package, a minimal skimming service tailored to the needs of small Tier 3 centres or individual users
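
    A minimal Python sketch of the tag-database idea: a skim definition is just an event-level metadata query. The table layout and attribute names (njets, missing_et) are hypothetical; the real service fronts ATLAS tag databases and grid storage interfaces.

        import sqlite3

        con = sqlite3.connect(":memory:")
        con.execute("CREATE TABLE tags (run INT, event INT, njets INT, missing_et REAL)")
        con.executemany("INSERT INTO tags VALUES (?, ?, ?, ?)",
                        [(1, i, i % 5, 12.5 * i % 80) for i in range(1000)])

        # Event-level selection: the skim is defined entirely by metadata cuts.
        cur = con.execute(
            "SELECT run, event FROM tags WHERE njets >= 2 AND missing_et > 30.0")
        selected = cur.fetchall()
        print(len(selected), "events selected for the skim")

    Only the events passing the query would then be fetched from local storage, which is what lets cuts be pre-adjusted cheaply before committing compute resources.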

  15. Time series analysis of barometric pressure data

    International Nuclear Information System (INIS)

    La Rocca, Paola; Riggi, Francesco; Riggi, Daniele

    2010-01-01

    Time series of atmospheric pressure data, collected over a period of several years, were analysed to provide undergraduate students with educational examples of application of simple statistical methods of analysis. In addition to basic methods for the analysis of periodicities, a comparison of two forecast models, one based on autoregression algorithms, and the other making use of an artificial neural network, was made. Results show that the application of artificial neural networks may give slightly better results compared to traditional methods.
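
    A minimal Python sketch of the autoregressive forecasting idea on a synthetic pressure-like series (not the authors' implementation): fit AR(p) coefficients by ordinary least squares on lagged copies of the series.

        import numpy as np

        rng = np.random.default_rng(3)
        t = np.arange(2000)
        # Synthetic daily pressure: annual cycle plus noise (hPa)
        series = 1013 + 5 * np.sin(2 * np.pi * t / 365) + rng.normal(0, 1, t.size)

        p = 3  # AR order
        Y = series[p:]
        X = np.column_stack([series[p - k - 1:-k - 1] for k in range(p)])  # lags 1..p
        coef, *_ = np.linalg.lstsq(np.column_stack([X, np.ones_like(Y)]), Y, rcond=None)

        last = series[-p:][::-1]                 # most recent values first
        forecast = coef[:p] @ last + coef[p]     # one-step-ahead prediction
        print(f"one-step forecast: {forecast:.2f} hPa")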

  16. Functional data analysis of sleeping energy expenditure.

    Science.gov (United States)

    Lee, Jong Soo; Zakeri, Issa F; Butte, Nancy F

    2017-01-01

    Adequate sleep is crucial during childhood for metabolic health, and physical and cognitive development. Inadequate sleep can disrupt metabolic homeostasis and alter sleeping energy expenditure (SEE). Functional data analysis methods were applied to SEE data to elucidate the population structure of SEE and to discriminate SEE between obese and non-obese children. Minute-by-minute SEE in 109 children, ages 5-18, was measured in room respiration calorimeters. A smoothing spline method was applied to the calorimetric data to extract the true smoothing function for each subject. Functional principal component analysis was used to capture the important modes of variation of the functional data and to identify differences in SEE patterns. Combinations of functional principal component analysis and classifier algorithm were used to classify SEE. Smoothing effectively removed instrumentation noise inherent in the room calorimeter data, providing more accurate data for analysis of the dynamics of SEE. SEE exhibited declining but subtly undulating patterns throughout the night. Mean SEE was markedly higher in obese than non-obese children, as expected due to their greater body mass. SEE was higher among the obese than non-obese children (p0.1, after post hoc testing). Functional principal component scores for the first two components explained 77.8% of the variance in SEE and also differed between groups (p = 0.037). Logistic regression, support vector machine or random forest classification methods were able to distinguish weight-adjusted SEE between obese and non-obese participants with good classification rates (62-64%). Our results implicate other factors, yet to be uncovered, that affect the weight-adjusted SEE of obese and non-obese children. Functional data analysis revealed differences in the structure of SEE between obese and non-obese children that may contribute to disruption of metabolic homeostasis.
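
    A minimal Python sketch of the pipeline described above: smooth each curve, extract functional principal component scores, then classify. The synthetic curves are invented stand-ins for the calorimeter data, and PCA on the smoothed, discretised curves stands in for a full functional PCA.

        import numpy as np
        from scipy.interpolate import UnivariateSpline
        from sklearn.decomposition import PCA
        from sklearn.linear_model import LogisticRegression

        rng = np.random.default_rng(4)
        minutes = np.arange(480)                      # one night, minute by minute

        def curves(n, level):
            """Synthetic SEE curves: slow decline, subtle undulation, noise."""
            base = level - 0.0005 * minutes
            return base + 0.02 * np.sin(minutes / 60)[None, :] \
                   + rng.normal(0, 0.05, (n, minutes.size))

        X_raw = np.vstack([curves(30, 1.4), curves(30, 1.1)])
        y = np.array([1] * 30 + [0] * 30)             # 1 = obese, 0 = non-obese

        # Step 1: smoothing spline per subject to remove instrument noise.
        X_smooth = np.vstack(
            [UnivariateSpline(minutes, row, s=minutes.size * 0.05 ** 2)(minutes)
             for row in X_raw])

        # Step 2: functional PCA scores; Step 3: classify on the scores.
        scores = PCA(n_components=2).fit_transform(X_smooth)
        clf = LogisticRegression().fit(scores, y)
        print("training accuracy:", clf.score(scores, y))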

  17. Microrheology with optical tweezers: data analysis

    International Nuclear Information System (INIS)

    Tassieri, Manlio; Warren, Rebecca L; Cooper, Jonathan M; Evans, R M L; Bailey, Nicholas J

    2012-01-01

    We present a data analysis procedure that provides the solution to a long-standing issue in microrheology studies, i.e. the evaluation of the fluids' linear viscoelastic properties from the analysis of a finite set of experimental data, describing (for instance) the time-dependent mean-square displacement of suspended probe particles experiencing Brownian fluctuations. We report, for the first time in the literature, the linear viscoelastic response of an optically trapped bead suspended in a Newtonian fluid, over the entire range of experimentally accessible frequencies. The general validity of the proposed method makes it transferable to the majority of microrheology and rheology techniques. (paper)

  18. Advances in statistical models for data analysis

    CERN Document Server

    Minerva, Tommaso; Vichi, Maurizio

    2015-01-01

    This edited volume focuses on recent research results in classification, multivariate statistics and machine learning and highlights advances in statistical models for data analysis. The volume provides both methodological developments and contributions to a wide range of application areas such as economics, marketing, education, social sciences and environment. The papers in this volume were first presented at the 9th biannual meeting of the Classification and Data Analysis Group (CLADAG) of the Italian Statistical Society, held in September 2013 at the University of Modena and Reggio Emilia, Italy.

  19. Compositional Data Analysis Theory and Applications

    CERN Document Server

    Pawlowsky-Glahn, Vera

    2011-01-01

    This book presents the state-of-the-art in compositional data analysis and will feature a collection of papers covering theory, applications to various fields of science and software. Areas covered will range from geology, biology, environmental sciences, forensic sciences, medicine and hydrology. Key features: provides the state-of-the-art text in compositional data analysis; covers a variety of subject areas, from geology to medicine; written by leading researchers in the field; supported by a website featuring R code.

  20. Game data analysis tools and methods

    CERN Document Server

    Coupart, Thibault

    2013-01-01

    This book features an introduction to the basic theoretical tenets of data analysis from a game developer's point of view, as well as a practical guide to performing gameplay analysis on a real-world game.This book is ideal for video game developers who want to try and experiment with the game analytics approach for their own productions. It will provide a good overview of the themes you need to pay attention to, and will pave the way for success. Furthermore, the book also provides a wide range of concrete examples that will be useful for any game data analysts or scientists who want to impro

  1. Data analysis for physical scientists featuring Excel

    CERN Document Server

    Kirkup, Les

    2012-01-01

    The ability to summarise data, compare models and apply computer-based analysis tools are vital skills necessary for studying and working in the physical sciences. This textbook supports undergraduate students as they develop and enhance these skills. Introducing data analysis techniques, this textbook pays particular attention to the internationally recognised guidelines for calculating and expressing measurement uncertainty. This new edition has been revised to incorporate Excel® 2010. It also provides a practical approach to fitting models to data using non-linear least squares, a powerful technique which can be applied to many types of model. Worked examples using actual experimental data help students understand how the calculations apply to real situations. Over 200 in-text exercises and end-of-chapter problems give students the opportunity to use the techniques themselves and gain confidence in applying them. Answers to the exercises and problems are given at the end of the book.
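
    The book's workflow is Excel-based; purely for illustration, a Python analogue of fitting a non-linear model to data by least squares, with 1-sigma parameter uncertainties from the covariance matrix (model and data invented):

        import numpy as np
        from scipy.optimize import curve_fit

        def decay(t, A, tau, c):
            """Exponential decay model with an offset."""
            return A * np.exp(-t / tau) + c

        t = np.linspace(0, 10, 50)
        rng = np.random.default_rng(5)
        y = decay(t, 3.0, 2.0, 0.5) + rng.normal(0, 0.1, t.size)   # noisy "measurements"

        popt, pcov = curve_fit(decay, t, y, p0=(1, 1, 0))
        perr = np.sqrt(np.diag(pcov))          # 1-sigma parameter uncertainties
        for name, v, e in zip(("A", "tau", "c"), popt, perr):
            print(f"{name} = {v:.3f} +/- {e:.3f}")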

  2. GaggleBridge: collaborative data analysis.

    Science.gov (United States)

    Battke, Florian; Symons, Stephan; Herbig, Alexander; Nieselt, Kay

    2011-09-15

    Tools aiding in collaborative data analysis are becoming ever more important as researchers work together over long distances. We present an extension to the Gaggle framework, which has been widely adopted as a tool to enable data exchange between different analysis programs on one computer. Our program, GaggleBridge, transparently extends this functionality to allow data exchange between Gaggle users at different geographic locations using network communication. GaggleBridge can automatically set up SSH tunnels to traverse firewalls while adding some security features to the Gaggle communication. GaggleBridge is available as open-source software implemented in the Java language at http://it.inf.uni-tuebingen.de/gb. Contact: florian.battke@uni-tuebingen.de. Supplementary data are available at Bioinformatics online.

  3. Scientific data analysis on data-parallel platforms.

    Energy Technology Data Exchange (ETDEWEB)

    Ulmer, Craig D.; Bayer, Gregory W.; Choe, Yung Ryn; Roe, Diana C.

    2010-09-01

    As scientific computing users migrate to petaflop platforms that promise to generate multi-terabyte datasets, there is a growing need in the community to be able to embed sophisticated analysis algorithms in the computing platforms' storage systems. Data Warehouse Appliances (DWAs) are attractive for this work, due to their ability to store and process massive datasets efficiently. While DWAs have been utilized effectively in data-mining and informatics applications, they remain largely unproven in scientific workloads. In this paper we present our experiences in adapting two mesh analysis algorithms to function on five different DWA architectures: two Netezza database appliances, an XtremeData dbX database, a LexisNexis DAS, and multiple Hadoop MapReduce clusters. The main contribution of this work is insight into the differences between these DWAs from a user's perspective. In addition, we present performance measurements for ten DWA systems to help understand the impact of different architectural trade-offs in these systems.

  4. Integrating Data Transformation in Principal Components Analysis

    KAUST Repository

    Maadooliat, Mehdi

    2015-01-02

    Principal component analysis (PCA) is a popular dimension reduction method to reduce the complexity and obtain the informative aspects of high-dimensional datasets. When the data distribution is skewed, data transformation is commonly used prior to applying PCA. Such transformation is usually obtained from previous studies, prior knowledge, or trial-and-error. In this work, we develop a model-based method that integrates data transformation in PCA and finds an appropriate data transformation using the maximum profile likelihood. Extensions of the method to handle functional data and missing values are also developed. Several numerical algorithms are provided for efficient computation. The proposed method is illustrated using simulated and real-world data examples.
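
    As a rough illustration of the general idea (not the authors' profile-likelihood algorithm itself), one can choose a Box-Cox transformation by maximum likelihood and then apply PCA; scipy's boxcox already estimates the transformation parameter by maximizing the log-likelihood. The data here are synthetic.

      import numpy as np
      from scipy.stats import boxcox
      from sklearn.decomposition import PCA

      rng = np.random.default_rng(0)
      X = rng.lognormal(mean=0.0, sigma=1.0, size=(200, 5))   # skewed data

      # Transform each column; boxcox picks lambda by maximum likelihood.
      Xt = np.column_stack([boxcox(X[:, j])[0] for j in range(X.shape[1])])

      pca = PCA(n_components=2)
      scores = pca.fit_transform(Xt)
      print(pca.explained_variance_ratio_)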

  5. A Maturity Analysis of Big Data Technologies

    Directory of Open Access Journals (Sweden)

    Radu BONCEA

    2017-01-01

    In recent years, Big Data technologies have been developed at a faster pace due to the increase in demand from applications that generate and process vast amounts of data. The Cloud Computing and the Internet of Things are the main drivers for developing enterprise solutions that support Business Intelligence, which in turn creates new opportunities and new business models. An enterprise can now collect data about its internal processes, process this data to gain new insights and business value, and make better decisions. This is the reason why Big Data is now seen as a vital component in any enterprise architecture. In this article, the maturity of several Big Data technologies is put under analysis. For each technology, several aspects are considered, such as development status, market usage, licensing policies, availability of certifications, adoption, and support for cloud computing and the enterprise.

  6. The EADGENE Microarray Data Analysis Workshop

    DEFF Research Database (Denmark)

    de Koning, Dirk-Jan; Jaffrézic, Florence; Lund, Mogens Sandø

    2007-01-01

    Microarray analyses have become an important tool in animal genomics. While their use is becoming widespread, there is still a lot of ongoing research regarding the analysis of microarray data. In the context of a European Network of Excellence, 31 researchers representing 14 research groups from...... 10 countries performed and discussed the statistical analyses of real and simulated 2-colour microarray data that were distributed among participants. The real data consisted of 48 microarrays from a disease challenge experiment in dairy cattle, while the simulated data consisted of 10 microarrays...... statistical weights, to omitting a large number of spots or omitting entire slides. Surprisingly, these very different approaches gave quite similar results when applied to the simulated data, although not all participating groups analysed both real and simulated data. The workshop was very successful...

  7. Statistical analysis of next generation sequencing data

    CERN Document Server

    Nettleton, Dan

    2014-01-01

    Next Generation Sequencing (NGS) is the latest high throughput technology to revolutionize genomic research. NGS generates massive genomic datasets that play a key role in the big data phenomenon that surrounds us today. To extract signals from high-dimensional NGS data and make valid statistical inferences and predictions, novel data analytic and statistical techniques are needed. This book contains 20 chapters written by prominent statisticians working with NGS data. The topics range from basic preprocessing and analysis with NGS data to more complex genomic applications such as copy number variation and isoform expression detection. Research statisticians who want to learn about this growing and exciting area will find this book useful. In addition, many chapters from this book could be included in graduate-level classes in statistical bioinformatics for training future biostatisticians who will be expected to deal with genomic data in basic biomedical research, genomic clinical trials and personalized med...

  8. Spectral analysis of Floating Car Data

    OpenAIRE

    Gössel, F.; Michler, E.; Wrase, B.

    2003-01-01

    Floating Car Data (FCD) are one important data source in traffic telematic systems. The original variable in these systems is the vehicle velocity. The paper analyses the measured quantity "vehicle velocity" by methods of information technology. Consequences for the processing, transmission and storage of FCD under conditions of limited resources are discussed. The starting point of the investigation is the analysis of the spectral characteristics of velocity-time profiles. The spectra are determined by...
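
    A minimal sketch of this kind of spectral analysis: take the discrete Fourier transform of a velocity-time profile and locate the dominant fluctuation frequency (sampling rate and signal are invented for illustration).

      import numpy as np

      fs = 1.0                                  # samples per second (assumed)
      t = np.arange(0.0, 600.0, 1.0 / fs)       # a 10-minute velocity profile
      rng = np.random.default_rng(1)
      v = 50 + 10 * np.sin(2 * np.pi * 0.005 * t) + rng.normal(0, 2, t.size)

      V = np.fft.rfft(v - v.mean())             # one-sided fluctuation spectrum
      freqs = np.fft.rfftfreq(t.size, d=1.0 / fs)
      power = np.abs(V) ** 2 / t.size
      print(freqs[np.argmax(power)])            # dominant frequency, ~0.005 Hz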

  9. Social Network Analysis Utilizing Big Data Technology

    OpenAIRE

    Magnusson, Jonathan

    2012-01-01

    As of late there has been an immense increase of data within modern society. This is evident within the field of telecommunications. The amount of mobile data is growing fast. For a telecommunication operator, this provides means of getting more information about specific subscribers. The applications of this are many, such as segmentation for marketing purposes or detection of churners, people about to switch operators. Thus the analysis and information extraction is of great value. An approa...

  10. Structural-Vibration-Response Data Analysis

    Science.gov (United States)

    Smith, W. R.; Hechenlaible, R. N.; Perez, R. C.

    1983-01-01

    Computer program developed as structural-vibration-response data analysis tool for use in dynamic testing of Space Shuttle. Program provides fast and efficient time-domain least-squares curve-fitting procedure for reducing transient response data to obtain structural model frequencies and dampings from free-decay records. Procedure simultaneously identifies frequencies, damping values, and participation factors for noisy multiple-response records.
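
    The time-domain least-squares idea can be sketched with scipy: fit a damped sinusoid to a free-decay record to recover a modal frequency and damping ratio (a single-mode toy example, not the Shuttle program's multiple-response code).

      import numpy as np
      from scipy.optimize import curve_fit

      def free_decay(t, A, zeta, f, phi):
          # One structural mode: exponentially decaying sinusoid.
          return A * np.exp(-zeta * 2 * np.pi * f * t) * np.cos(2 * np.pi * f * t + phi)

      t = np.linspace(0.0, 5.0, 2000)
      y = free_decay(t, 1.0, 0.02, 3.0, 0.4)
      y += np.random.default_rng(2).normal(0, 0.02, t.size)   # sensor noise

      (A, zeta, f, phi), _ = curve_fit(free_decay, t, y, p0=[1, 0.01, 2.8, 0])
      print(f"f = {f:.3f} Hz, damping ratio = {zeta:.4f}")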

  11. Analysis of HARP TPC krypton data

    CERN Document Server

    Dydak, F

    2004-01-01

    This memo describes the procedure which was adopted to equalize the response of the 3972 pads of the HARP TPC, using radioactive 83mKr gas. The results obtained from the study of reconstructed krypton clusters in the calibration data taken in 2002 are reported. Two complementary methods were employed in the data analysis. Compatible results were obtained for channel-to-channel equalization constants. An estimate of the overall systematic uncertainty was derived.

  12. IUE Data Analysis Software for Personal Computers

    Science.gov (United States)

    Thompson, R.; Caplinger, J.; Taylor, L.; Lawton, P.

    1996-01-01

    This report summarizes the work performed for the program titled, "IUE Data Analysis Software for Personal Computers" awarded under Astrophysics Data Program NRA 92-OSSA-15. The work performed was completed over a 2-year period starting in April 1994. As a result of the project, 450 IDL routines and eight database tables are now available for distribution for Power Macintosh computers and Personal Computers running Windows 3.1.

  13. An Atomic Data and Analysis Structure

    International Nuclear Information System (INIS)

    Summers, Hugh P.

    2000-01-01

    The Atomic Data and Analysis Structure (ADAS) Project is a shared activity of a world-wide consortium of fusion and astrophysical laboratories directed at developing and maintaining a common approach to analysing and modelling the radiating properties of plasmas. The origin and objectives of ADAS and the organization of its codes and data collections are outlined. Current special projects in the ADAS Project work-plans are listed and an illustration is given of ADAS at work. (author)

  14. Vapor Pressure Data Analysis and Statistics

    Science.gov (United States)

    2016-12-01

    Report ECBC-TR-1422 (Ann Brozena, Research and Technology Directorate) describes least-squares analysis of experimental vapor pressure data: Yi, the natural logarithm of the i-th experimental vapor pressure value, is regressed on Xi for n data points, and the fitted A (or a) constant is directly related to vapor pressure, being greater for high-vapor-pressure materials.
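
    The regression described can be sketched in its simplest form, a least-squares fit of ln p against 1/T (a Clausius-Clapeyron-type model; the report's actual functional form is not recoverable from the fragmentary abstract, and the data below are invented).

      import numpy as np

      T = np.array([283.15, 293.15, 303.15, 313.15])   # temperature, K
      p = np.array([1.23, 2.34, 4.24, 7.38])           # vapor pressure, kPa

      # Y_i = ln p_i regressed on X_i = 1/T_i:  ln p = a - b/T
      slope, intercept = np.polyfit(1.0 / T, np.log(p), 1)
      a, b = intercept, -slope
      print(f"ln p = {a:.2f} - {b:.0f}/T")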

  15. The Analysis of Polyploid Genetic Data.

    Science.gov (United States)

    Meirmans, Patrick G; Liu, Shenglin; van Tienderen, Peter H

    2018-03-16

    Though polyploidy is an important aspect of the evolutionary genetics of both plants and animals, the development of population genetic theory of polyploids has seriously lagged behind that of diploids. This is unfortunate since the analysis of polyploid genetic data, and the interpretation of the results, requires even more scrutiny than with diploid data. This is because of several polyploidy-specific complications in segregation and genotyping such as tetrasomy, double reduction, and missing dosage information. Here, we review the theoretical and statistical aspects of the population genetics of polyploids. We discuss several widely used types of inferences, including genetic diversity, Hardy-Weinberg equilibrium, population differentiation, genetic distance, and detecting population structure. For each, we point out how the statistical approach, expected result, and interpretation differ between different ploidy levels. We also discuss for each type of inference what biases may arise from the polyploid-specific complications and how these biases can be overcome. From our overview, it is clear that the statistical toolbox that is available for the analysis of genetic data is flexible and still expanding. Modern sequencing techniques will soon be able to overcome some of the current limitations to the analysis of polyploid data, though the techniques are lagging behind those available for diploids. Furthermore, the availability of more data may aggravate the biases that can arise, and increase the risk of false inferences. Therefore, simulations such as we used throughout this review are an important tool to verify the results of analyses of polyploid genetic data.

  16. Big Data Analysis of Human Genome Variations

    KAUST Repository

    Gojobori, Takashi

    2016-01-25

    Since the human genome draft sequence was made public for the first time in 2000, genomic analyses have been intensively extended to the population level. The following three international projects are good examples of large-scale studies of human genome variations: 1) HapMap Data (1,417 individuals) (http://hapmap.ncbi.nlm.nih.gov/downloads/genotypes/2010-08_phaseII+III/forward/), 2) HGDP (Human Genome Diversity Project) Data (940 individuals) (http://www.hagsc.org/hgdp/files.html), 3) 1000 Genomes Data (2,504 individuals) (http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/). If we can integrate all three data sets into a single volume of data, we should be able to conduct a more detailed analysis of human genome variations for a total of 4,861 individuals (= 1,417 + 940 + 2,504). In fact, we successfully integrated these three data sets by use of information on the reference human genome sequence, and we conducted the big data analysis. In particular, we constructed a phylogenetic tree of about 5,000 human individuals at the genome level. As a result, we were able to identify clusters of ethnic groups, with detectable admixture, that were not possible to resolve by an analysis of each of the three data sets alone. Here, we report the outcome of this big data analysis and discuss the evolutionary significance of human genomic variations. Note that the present study was conducted in collaboration with Katsuhiko Mineta and Kosuke Goto at KAUST.

  17. ACCURACY ANALYSIS OF KINECT DEPTH DATA

    Directory of Open Access Journals (Sweden)

    K. Khoshelham

    2012-09-01

    This paper presents an investigation of the geometric quality of depth data obtained by the Kinect sensor. Based on the mathematical model of depth measurement by the sensor a theoretical error analysis is presented, which provides an insight into the factors influencing the accuracy of the data. Experimental results show that the random error of depth measurement increases with increasing distance to the sensor, and ranges from a few millimetres up to about 4 cm at the maximum range of the sensor. The accuracy of the data is also found to be influenced by the low resolution of the depth measurements.
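
    For triangulation-based depth sensors of this kind, the random depth error is expected to grow roughly quadratically with range. The sketch below encodes such a model with an illustrative constant chosen to match the reported figures (a few millimetres near the sensor, about 4 cm at a 5 m maximum range); it is not the paper's calibrated model.

      def depth_std(z, k=1.6e-3):
          # sigma_Z = k * Z^2: quadratic error growth with range Z (metres);
          # the constant k is illustrative, tuned to give ~4 cm at Z = 5 m.
          return k * z ** 2

      for z in (1.0, 2.5, 5.0):
          print(f"Z = {z:.1f} m  ->  sigma = {depth_std(z) * 100:.1f} cm")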

  18. Atmospheric Data Package for the Composite Analysis

    Energy Technology Data Exchange (ETDEWEB)

    Napier, Bruce A.; Ramsdell, James V.

    2005-09-01

    The purpose of this data package is to summarize our conceptual understanding of atmospheric transport and deposition, describe how this understanding will be simplified for numerical simulation as part of the Composite Analysis (i.e., implementation model), and finally to provide the input parameters needed for the simulations.

  19. Analysis of facility-monitoring data

    Energy Technology Data Exchange (ETDEWEB)

    Howell, J.A.

    1996-09-01

    This paper discusses techniques for analysis of data collected from nuclear-safeguards facility-monitoring systems. These methods can process information gathered from sensors and make interpretations that are in the best interests of the facility or agency, thereby enhancing safeguards while shortening inspection time.

  20. Term Structure Analysis with Big Data

    DEFF Research Database (Denmark)

    Andreasen, Martin Møller; Christensen, Jens H.E.; Rudebusch, Glenn D.

    Analysis of the term structure of interest rates almost always takes a two-step approach. First, actual bond prices are summarized by interpolated synthetic zero-coupon yields, and second, a small set of these yields are used as the source data for further empirical examination. In contrast, we...

  1. Trends in physics data analysis algorithms

    International Nuclear Information System (INIS)

    Denby, B.

    2004-01-01

    The paper provides a new look at algorithmic trends in modern physics experiments. Based on recently presented material, it attempts to draw conclusions in order to form a coherent historical picture of the past, present, and possible future of the field of data analysis techniques in physics. The importance of cross-disciplinary approaches is stressed.

  2. Representative Sampling for reliable data analysis

    DEFF Research Database (Denmark)

    Petersen, Lars; Esbensen, Kim Harry

    2005-01-01

    regime in order to secure the necessary reliability of: samples (which must be representative, from the primary sampling onwards), analysis (which will not mean anything outside the miniscule analytical volume without representativity ruling all mass reductions involved, also in the laboratory) and data...

  3. Fundamentals of quantitative PET data analysis

    NARCIS (Netherlands)

    Willemsen, ATM; van den Hoff, J

    2002-01-01

    Drug analysis and development with PET should fully exhaust the ability of this tomographic technique to quantify regional tracer concentrations in vivo. Data evaluation based on visual inspection or assessment of regional image contrast is not sufficient for this purpose since much of the

  4. Analysis of MUF data using ARIMA models

    International Nuclear Information System (INIS)

    Downing, D.J.; Pike, D.H.; Morrison, G.W.

    1978-01-01

    An introduction to Box-Jenkins time series analysis is presented. It is shown how the models presented by Box-Jenkins can be applied to material unaccounted for (MUF) data to detect losses. For the constant-loss case, an optimal estimate of the loss is found and its probability of detection is determined.
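
    A hedged sketch of the Box-Jenkins approach using present-day tooling (statsmodels, which postdates the paper): fit an ARIMA model to a MUF series and flag balance periods whose residuals suggest a loss. The data and the simple two-sigma rule are illustrative, not the paper's optimal estimator.

      import numpy as np
      from statsmodels.tsa.arima.model import ARIMA

      rng = np.random.default_rng(3)
      muf = rng.normal(0.0, 1.0, 60)     # MUF per balance period (toy units)
      muf[30:] += 0.8                    # constant loss starting at period 30

      fit = ARIMA(muf, order=(1, 0, 0)).fit()
      resid = fit.resid
      flagged = np.where(np.abs(resid) > 2 * resid.std())[0]
      print(fit.params, flagged)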

  5. Data Analysis for the LISA Pathfinder Mission

    Science.gov (United States)

    Thorpe, James Ira

    2009-01-01

    The LTP (LISA Technology Package) is the core part of the Laser Interferometer Space Antenna (LISA) Pathfinder mission. The main goal of the mission is to study the sources of any disturbances that perturb the motion of the freely-falling test masses from their geodesic trajectories, as well as to test various technologies needed for LISA. The LTP experiment is designed as a sequence of experimental runs in which the performance of the instrument is studied and characterized under different operating conditions. In order to best optimize subsequent experimental runs, each run must be promptly analysed to ensure that the following ones make best use of the available knowledge of the instrument. In order to do this, all analyses must be designed and tested in advance of the mission and have sufficient built-in flexibility to account for unexpected results or behaviour. To support this activity, a robust and flexible data analysis software package is also required. This poster presents two of the main components that make up the data analysis effort: the data analysis software and the mock-data challenges used to validate analysis procedures and experiment designs.

  6. Analysis of metabolomics data from twin families

    NARCIS (Netherlands)

    Draisma, Hermanus Henricus Maria

    2011-01-01

    Metabolomics is the comprehensive analysis of small molecules involved in metabolism, on the basis of samples that have been obtained from organisms in a given physiological state. Data obtained from measurements of trait levels in twin families can be used to elucidate the importance of genetic and

  7. XAFS Spectroscopy : Fundamental Principles and Data Analysis

    NARCIS (Netherlands)

    Koningsberger, D.C.; Mojet, B.L.; Dorssen, G.E. van; Ramaker, D.E.

    2000-01-01

    The physical principles of XAFS spectroscopy are given at a sufficiently basic level to enable scientists working in the field of catalysis to critically evaluate articles dealing with XAFS studies on catalytic materials. The described data-analysis methods provide the basic tools for studying the

  8. Reporting Data with "Over-the-Counter" Data Analysis Supports Increases Educators' Analysis Accuracy

    Science.gov (United States)

    Rankin, Jenny Grant

    2013-01-01

    There is extensive research on the benefits of making data-informed decisions to improve learning, but these benefits rely on the data being effectively interpreted. Despite educators' above-average intellect and education levels, there is evidence many educators routinely misinterpret student data. Data analysis problems persist even at districts…

  9. Data Extraction Based on Page Structure Analysis

    Directory of Open Access Journals (Sweden)

    Ren Yichao

    2017-01-01

    The information we need is often dispersed and organized in differing structures. In addition, because of the existence of unstructured data such as natural language and images, extracting content from local pages is extremely difficult. In light of the problems above, this article applies a method combining a page structure analysis algorithm with a page data extraction algorithm to accomplish the gathering of network data. In this way, the poor performance of traditional complex extraction models on large-scale data is addressed and page data extraction efficiency is improved. The article also compares the DOM-based page structure method with HTML regularities of distribution across pages and content of different types, in search of a more efficient extraction method.

  10. Data analysis and interpretation for environmental surveillance

    International Nuclear Information System (INIS)

    1992-06-01

    The Data Analysis and Interpretation for Environmental Surveillance Conference was held in Lexington, Kentucky, February 5--7, 1990. The conference was sponsored by what is now the Office of Environmental Compliance and Documentation, Oak Ridge National Laboratory. Participants included technical professionals from all Martin Marietta Energy Systems facilities, Westinghouse Materials Company of Ohio, Pacific Northwest Laboratory, and several technical support contractors. Presentations at the conference spanned the full spectrum of issues that affect the analysis and interpretation of environmental data. Topics included tracking systems for samples and schedules associated with ongoing programs; coalescing data from a variety of sources and pedigrees into integrated data bases; methods for evaluating the quality of environmental data through empirical estimates of parameters such as charge balance, pH, and specific conductance; statistical applications to the interpretation of environmental information; and uses of environmental information in risk and dose assessments. Hearing about and discussing this wide variety of topics provided an opportunity to capture the subtlety of each discipline and to appreciate the continuity that is required among the disciplines in order to perform high-quality environmental information analysis.
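
    One of the empirical quality checks mentioned, the charge balance of a water analysis, is straightforward to compute; a sketch with illustrative ion concentrations follows.

      # Charge-balance error (%) = 100 * (cations - anions) / (cations + anions),
      # with all concentrations expressed in milliequivalents per litre.
      cations_meq = {"Ca": 2.4, "Mg": 1.1, "Na": 0.9, "K": 0.1}
      anions_meq = {"HCO3": 2.9, "SO4": 1.0, "Cl": 0.5}

      c = sum(cations_meq.values())
      a = sum(anions_meq.values())
      cbe = 100.0 * (c - a) / (c + a)
      print(f"charge-balance error = {cbe:+.1f}%")   # |CBE| > ~5% flags a problem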

  11. Data Analysis Strategies in Medical Imaging.

    Science.gov (United States)

    Parmar, Chintan; Barry, Joseph D; Hosny, Ahmed; Quackenbush, John; Aerts, Hugo Jwl

    2018-03-26

    Radiographic imaging continues to be one of the most effective and clinically useful tools within oncology. Sophistication of artificial intelligence (AI) has allowed for detailed quantification of radiographic characteristics of tissues using predefined engineered algorithms or deep learning methods. Precedents in radiology as well as a wealth of research studies hint at the clinical relevance of these characteristics. However, there are critical challenges associated with the analysis of medical imaging data. While some of these challenges are specific to the imaging field, many others like reproducibility and batch effects are generic and have already been addressed in other quantitative fields such as genomics. Here, we identify these pitfalls and provide recommendations for analysis strategies of medical imaging data including data normalization, development of robust models, and rigorous statistical analyses. Adhering to these recommendations will not only improve analysis quality, but will also enhance precision medicine by allowing better integration of imaging data with other biomedical data sources. Copyright ©2018, American Association for Cancer Research.

  12. User analysis of LHCb data with Ganga

    CERN Document Server

    Maier, A; Cowan, G; Egede, U; Elmsheuser, J; Gaidioz, B; Harrison, K; Lee, H-C; Liko, D; Moscicki, J; Muraru, A; Pajchel, K; Reece, W; Samset, B; Slater, M; Soroko, A; van der Ster, D; Williams, M; Tan, C L

    2010-01-01

    GANGA (http://cern.ch/ganga) is a job-management tool that offers a simple, efficient and consistent user analysis tool in a variety of heterogeneous environments: from local clusters to global Grid systems. Experiment specific plug-ins allow GANGA to be customised for each experiment. For LHCb users GANGA is the officially supported and advertised tool for job submission to the Grid. The LHCb specific plug-ins allow support for end-to-end analysis helping the user to perform his complete analysis with the help of GANGA. This starts with the support for data selection, where a user can select data sets from the LHCb Bookkeeping system. Next comes the set up for large analysis jobs: with tailored plug-ins for the LHCb core software, jobs can be managed by the splitting of these analysis jobs with the subsequent merging of the resulting files. Furthermore, GANGA offers support for Toy Monte-Carlos to help the user tune their analysis. In addition to describing the GANGA architecture, typical usage patterns with...

  13. User analysis of LHCb data with Ganga

    International Nuclear Information System (INIS)

    Maier, Andrew; Gaidioz, Benjamin; Moscicki, Jakub; Muraru, Adrian; Ster, Daniel van der; Brochu, Frederic; Cowan, Greg; Egede, Ulrik; Reece, Will; Williams, Mike; Elmsheuser, Johannes; Harrison, Karl; Slater, Mark; Tan, Chun Lik; Lee, Hurng-Chun; Liko, Dietrich; Pajchel, Katarina; Samset, Bjoern; Soroko, Alexander

    2010-01-01

    GANGA (http://cern.ch/ganga) is a job-management tool that offers a simple, efficient and consistent user analysis tool in a variety of heterogeneous environments: from local clusters to global Grid systems. Experiment specific plug-ins allow GANGA to be customised for each experiment. For LHCb users GANGA is the officially supported and advertised tool for job submission to the Grid. The LHCb specific plug-ins allow support for end-to-end analysis helping the user to perform his complete analysis with the help of GANGA. This starts with the support for data selection, where a user can select data sets from the LHCb Bookkeeping system. Next comes the set up for large analysis jobs: with tailored plug-ins for the LHCb core software, jobs can be managed by the splitting of these analysis jobs with the subsequent merging of the resulting files. Furthermore, GANGA offers support for Toy Monte-Carlos to help the user tune their analysis. In addition to describing the GANGA architecture, typical usage patterns within LHCb and experience with the updated LHCb DIRAC workload management system are presented.

  14. Truck Roll Stability Data Collection and Analysis

    Energy Technology Data Exchange (ETDEWEB)

    Stevens, SS

    2001-07-02

    The principal objective of this project was to collect and analyze vehicle and highway data that are relevant to the problem of truck rollover crashes, and in particular to the subset of rollover crashes that are caused by the driver error of entering a curve at a speed too great to allow safe completion of the turn. The data are of two sorts: vehicle dynamic performance data, and highway geometry data as revealed by vehicle behavior in normal driving. Vehicle dynamic performance data are relevant because the roll stability of a tractor trailer depends both on inherent physical characteristics of the vehicle and on the weight and distribution of the particular cargo that is being carried. Highway geometric data are relevant because the set of crashes of primary interest to this study are caused by lateral acceleration demand in a curve that exceeds the instantaneous roll stability of the vehicle. An analysis of data quality requires an evaluation of the equipment used to collect the data because the reliability and accuracy of both the equipment and the data could profoundly affect the safety of the driver and other highway users. Therefore, a concomitant objective was an evaluation of the performance of the set of data-collection equipment on the truck and trailer. The objective concerning evaluation of the equipment was accomplished, but the results were not entirely positive. Significant engineering apparently remains to be done before a reliable system can be fielded. Problems were identified with the trailer to tractor fiber optic connector used for this test. In an over-the-road environment, the communication between the trailer instrumentation and the tractor must be dependable. In addition, the computer in the truck must be able to withstand the rigors of the road. The major objective, data collection and analysis, was also accomplished. Using data collected by instruments on the truck, a "bad-curve" database can be generated. Using

  15. Statistics and analysis of scientific data

    CERN Document Server

    Bonamente, Massimiliano

    2013-01-01

    Statistics and Analysis of Scientific Data covers the foundations of probability theory and statistics, and a number of numerical and analytical methods that are essential for the present-day analyst of scientific data. Topics covered include probability theory, distribution functions of statistics, fits to two-dimensional datasets and parameter estimation, Monte Carlo methods and Markov chains. Equal attention is paid to the theory and its practical application, and results from classic experiments in various fields are used to illustrate the importance of statistics in the analysis of scientific data. The main pedagogical method is a theory-then-application approach, where emphasis is placed first on a sound understanding of the underlying theory of a topic, which becomes the basis for an efficient and proactive use of the material for practical applications. The level is appropriate for undergraduates and beginning graduate students, and as a reference for the experienced researcher. Basic calculus is us...

  16. Parallel interactive data analysis with PROOF

    International Nuclear Information System (INIS)

    Ballintijn, Maarten; Biskup, Marek; Brun, Rene; Canal, Philippe; Feichtinger, Derek; Ganis, Gerardo; Kickinger, Guenter; Peters, Andreas; Rademakers, Fons

    2006-01-01

    The Parallel ROOT Facility, PROOF, enables the analysis of much larger data sets on a shorter time scale. It exploits the inherent parallelism in data of uncorrelated events via a multi-tier architecture that optimizes I/O and CPU utilization in heterogeneous clusters with distributed storage. The system provides transparent and interactive access to gigabytes today. Being part of the ROOT framework PROOF inherits the benefits of a performant object storage system and a wealth of statistical and visualization tools. This paper describes the data analysis model of ROOT and the latest developments on closer integration of PROOF into that model and the ROOT user environment, e.g. support for PROOF-based browsing of trees stored remotely, and the popular TTree::Draw() interface. We also outline the ongoing developments aimed to improve the flexibility and user-friendliness of the system
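
    The TTree::Draw() interface mentioned above can be exercised from Python through PyROOT; this local, single-machine sketch uses an invented file and tree name, and with a PROOF session the same analysis can be distributed.

      import ROOT

      f = ROOT.TFile.Open("events.root")    # hypothetical input file
      tree = f.Get("Events")                # hypothetical tree name
      tree.Draw("pt", "pt > 20")            # histogram pt for selected events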

  17. Qualitative data analysis a methods sourcebook

    CERN Document Server

    Miles, Matthew B; Saldana, Johnny

    2014-01-01

    The Third Edition of Miles & Huberman's classic research methods text is updated and streamlined by Johnny Saldaña, author of The Coding Manual for Qualitative Researchers. Several of the data display strategies from previous editions are now presented in re-envisioned and reorganized formats to enhance reader accessibility and comprehension. The Third Edition's presentation of the fundamentals of research design and data management is followed by five distinct methods of analysis: exploring, describing, ordering, explaining, and predicting. Miles and Huberman's original research studies are profiled and accompanied with new examples from Saldaña's recent qualitative work. The book's most celebrated chapter, "Drawing and Verifying Conclusions," is retained and revised, and the chapter on report writing has been greatly expanded, and is now called "Writing About Qualitative Research." Comprehensive and authoritative, Qualitative Data Analysis has been elegantly revised for a new generation of qualitative r...

  18. Integrative data analysis of male reproductive disorders

    DEFF Research Database (Denmark)

    Edsgard, Stefan Daniel

    of such data in conjunction with data from publicly available repositories. This thesis presents an introduction to disease genetics and molecular systems biology, followed by four studies that each provide detailed clues to the etiology of male reproductive disorders. Finally, a fifth study illustrates......-wide association data with respect to copy number variation and show that the aggregated effect of rare variants can influence the risk for testicular cancer. Paper V provides an example of the application of RNA-Seq for expression analysis of a species with an unsequenced genome. We analysed the plant...... of this thesis is the identification of the molecular basis of male reproductive disorders, with a special focus on testicular cancer. To this end, clinical samples were characterized by microarraybased transcription and genomic variation assays and molecular entities were identified by computational analysis...

  19. Statistics and analysis of scientific data

    CERN Document Server

    Bonamente, Massimiliano

    2017-01-01

    The revised second edition of this textbook provides the reader with a solid foundation in probability theory and statistics as applied to the physical sciences, engineering and related fields. It covers a broad range of numerical and analytical methods that are essential for the correct analysis of scientific data, including probability theory, distribution functions of statistics, fits to two-dimensional data and parameter estimation, Monte Carlo methods and Markov chains. Features new to this edition include: • a discussion of statistical techniques employed in business science, such as multiple regression analysis of multivariate datasets. • a new chapter on the various measures of the mean including logarithmic averages. • new chapters on systematic errors and intrinsic scatter, and on the fitting of data with bivariate errors. • a new case study and additional worked examples. • mathematical derivations and theoretical background material have been appropriately marked, to improve the readabili...

  20. The Measurand Framework: Scaling Exploratory Data Analysis

    Science.gov (United States)

    Schneider, D.; MacLean, L. S.; Kappler, K. N.; Bleier, T.

    2017-12-01

    Since 2005 QuakeFinder (QF) has acquired a unique dataset with outstanding spatial and temporal sampling of earth's time varying magnetic field along several active fault systems. This QF network consists of 124 stations in California and 45 stations along fault zones in Greece, Taiwan, Peru, Chile and Indonesia. Each station is equipped with three feedback induction magnetometers, two ion sensors, a 4 Hz geophone, a temperature sensor, and a humidity sensor. Data are continuously recorded at 50 Hz with GPS timing and transmitted daily to the QF data center in California for analysis. QF is attempting to detect and characterize anomalous EM activity occurring ahead of earthquakes. In order to analyze this sizable dataset, QF has developed an analytical framework to support processing the time series input data and hypothesis testing to evaluate the statistical significance of potential precursory signals. The framework was developed with a need to support legacy, in-house processing but with an eye towards big-data processing with Apache Spark and other modern big data technologies. In this presentation, we describe our framework, which supports rapid experimentation and iteration of candidate signal processing techniques via modular data transformation stages, tracking of provenance, and automatic re-computation of downstream data when upstream data is updated. Furthermore, we discuss how the processing modules can be ported to big data platforms like Apache Spark and demonstrate a migration path from local, in-house processing to cloud-friendly processing.
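
    The recompute-on-upstream-change behaviour described can be sketched with a tiny dependency graph; this is a generic illustration of the pattern, not QuakeFinder's Measurand code.

      class Stage:
          # A transformation stage that caches its output and recomputes when
          # any upstream stage has produced a newer version of its data.
          def __init__(self, func, *inputs):
              self.func, self.inputs = func, inputs
              self.version, self.cache, self.seen = 0, None, None

          def value(self):
              upstream = tuple(i.value() for i in self.inputs)
              versions = tuple(i.version for i in self.inputs)
              if self.seen != versions or self.cache is None:
                  self.cache = self.func(*upstream)
                  self.seen = versions
                  self.version += 1
              return self.cache

      class Source(Stage):
          def __init__(self, data):
              super().__init__(None)
              self.data, self.version = data, 1

          def update(self, data):               # new raw data arrives
              self.data = data
              self.version += 1

          def value(self):
              return self.data

      raw = Source([1.0, 2.0, 3.0])
      detrended = Stage(lambda x: [v - sum(x) / len(x) for v in x], raw)
      peak = Stage(max, detrended)
      print(peak.value())                       # computed once and cached
      raw.update([1.0, 2.0, 9.0])
      print(peak.value())                       # upstream change -> recompute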

  1. Data analysis and source modelling for LISA

    International Nuclear Information System (INIS)

    Shang, Yu

    2014-01-01

    Gravitational waves (GWs) are one of the most important predictions of general relativity. Besides indirect evidence for the existence of GWs, there are already several ground-based detectors (such as LIGO and GEO) and a planned future space mission (LISA) which aim to detect GWs directly. A GW carries a large amount of information about its source; extracting this information can reveal the physical properties of the source and even open a new window for understanding the Universe. Hence, GW data analysis is a challenging task in the search for GWs. In this thesis, I present two works on data analysis for LISA. In the first work, we introduce an extended multimodal genetic algorithm which utilizes the properties of the signal and the detector response function to analyze the data from the third round of the Mock LISA Data Challenge. We have found all five sources present in the data and recovered the coalescence time, chirp mass, mass ratio and sky location with reasonable accuracy. As for the orbital angular momentum and the two spins of the black holes, we have found a large number of widely separated modes in the parameter space with similar maximum likelihood values. The performance of this method is comparable, if not superior, to existing algorithms. In the second work, we introduce a new phenomenological waveform model for the extreme-mass-ratio inspiral (EMRI) system. This waveform consists of a set of harmonics with constant amplitude and slowly evolving phase, which we decompose in a Taylor series. We use these phenomenological templates to detect the signal in the simulated data and then, assuming a particular EMRI model, estimate the physical parameters of the binary with high precision. The results show that our phenomenological waveform is well suited to the analysis of EMRI signals.
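
    The phenomenological template described, a sum of harmonics with constant amplitudes and slowly evolving phases expanded as a Taylor series, can be sketched directly; the coefficients below are invented for illustration.

      import numpy as np

      def emri_template(t, amps, f0s, fdots, phis):
          # h(t) = sum_k A_k cos(phi_k + 2*pi*(f_k t + 0.5 * fdot_k t^2)):
          # each harmonic keeps a constant amplitude and a phase expanded
          # to second order in time.
          h = np.zeros_like(t)
          for A, f0, fd, phi in zip(amps, f0s, fdots, phis):
              h += A * np.cos(phi + 2 * np.pi * (f0 * t + 0.5 * fd * t ** 2))
          return h

      t = np.linspace(0.0, 1e6, 100000)         # ~12 days of samples
      h = emri_template(t, amps=[1.0, 0.5], f0s=[2e-3, 4e-3],
                        fdots=[1e-11, 2e-11], phis=[0.0, 1.0])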

  2. Conducting Qualitative Data Analysis: Qualitative Data Analysis as a Metaphoric Process

    Science.gov (United States)

    Chenail, Ronald J.

    2012-01-01

    In the second of a series of "how-to" essays on conducting qualitative data analysis, Ron Chenail argues the process can best be understood as a metaphoric process. From this orientation he suggests researchers follow Kenneth Burke's notion of metaphor and see qualitative data analysis as the analyst systematically considering the "this-ness" of…

  3. Language workbench user interfaces for data analysis

    Science.gov (United States)

    Benson, Victoria M.

    2015-01-01

    Biological data analysis is frequently performed with command line software. While this practice provides considerable flexibility for computationally savvy individuals, such as investigators trained in bioinformatics, this also creates a barrier to the widespread use of data analysis software by investigators trained as biologists and/or clinicians. Workflow systems such as Galaxy and Taverna have been developed to try and provide generic user interfaces that can wrap command line analysis software. These solutions are useful for problems that can be solved with workflows, and that do not require specialized user interfaces. However, some types of analyses can benefit from custom user interfaces. For instance, developing biomarker models from high-throughput data is a type of analysis that can be expressed more succinctly with specialized user interfaces. Here, we show how Language Workbench (LW) technology can be used to model the biomarker development and validation process. We developed a language that models the concepts of Dataset, Endpoint, Feature Selection Method and Classifier. These high-level language concepts map directly to abstractions that analysts who develop biomarker models are familiar with. We found that user interfaces developed in the Meta-Programming System (MPS) LW provide convenient means to configure a biomarker development project, to train models and view the validation statistics. We discuss several advantages of developing user interfaces for data analysis with a LW, including increased interface consistency, portability and extension by language composition. The language developed during this experiment is distributed as an MPS plugin (available at http://campagnelab.org/software/bdval-for-mps/). PMID:25755929

  4. Language workbench user interfaces for data analysis

    Directory of Open Access Journals (Sweden)

    Victoria M. Benson

    2015-02-01

    Biological data analysis is frequently performed with command line software. While this practice provides considerable flexibility for computationally savvy individuals, such as investigators trained in bioinformatics, this also creates a barrier to the widespread use of data analysis software by investigators trained as biologists and/or clinicians. Workflow systems such as Galaxy and Taverna have been developed to try and provide generic user interfaces that can wrap command line analysis software. These solutions are useful for problems that can be solved with workflows, and that do not require specialized user interfaces. However, some types of analyses can benefit from custom user interfaces. For instance, developing biomarker models from high-throughput data is a type of analysis that can be expressed more succinctly with specialized user interfaces. Here, we show how Language Workbench (LW) technology can be used to model the biomarker development and validation process. We developed a language that models the concepts of Dataset, Endpoint, Feature Selection Method and Classifier. These high-level language concepts map directly to abstractions that analysts who develop biomarker models are familiar with. We found that user interfaces developed in the Meta-Programming System (MPS) LW provide convenient means to configure a biomarker development project, to train models and view the validation statistics. We discuss several advantages of developing user interfaces for data analysis with a LW, including increased interface consistency, portability and extension by language composition. The language developed during this experiment is distributed as an MPS plugin (available at http://campagnelab.org/software/bdval-for-mps/).

  5. Evaluation and analysis of nuclear resonance data

    International Nuclear Information System (INIS)

    Frohner, F.H.

    2000-01-01

    The probabilistic foundations of data evaluation are reviewed, with special emphasis on parameter estimation based on Bayes' theorem and a quadratic loss function, and on modern methods for the assignment of prior probabilities. The data reduction process leading from raw experimental data to evaluated computer files of nuclear reaction cross sections is outlined, with a discussion of systematic and statistical errors and their propagation and of the generalized least squares formalism including prior information and nonlinear theoretical models. It is explained how common errors induce correlations between data, what consequences they have for uncertainty propagation and sensitivity studies, and how evaluators can construct covariance matrices from the usual error information provided by experimentalists. New techniques for evaluation of inconsistent data are also presented. The general principles are then applied specifically to the analysis and evaluation of neutron resonance data in terms of theoretical models - R-matrix theory (and especially its practically used multi-level Breit-Wigner and Reich-Moore variants) in the resolved region, and resonance-averaged R-matrix theory (Hauser-Feshbach theory with width-fluctuation corrections) in the unresolved region. Complications arise because the measured transmission data, capture and fission yields, self-indication ratios and other observables are not yet the wanted cross sections. These are obtained only by means of parametrisation. The intervening effects - Doppler and resolution broadening, self-shielding, multiple scattering, backgrounds, sample impurities, energy-dependent detector efficiencies, inaccurate reference data etc. - are therefore also discussed. (author)

  6. Accelerating Large Data Analysis By Exploiting Regularities

    Science.gov (United States)

    Moran, Patrick J.; Ellsworth, David

    2003-01-01

    We present techniques for discovering and exploiting regularity in large curvilinear data sets. The data can be based on a single mesh or a mesh composed of multiple submeshes (also known as zones). Multi-zone data are typical of Computational Fluid Dynamics (CFD) simulations. Regularities include axis-aligned rectilinear and cylindrical meshes as well as cases where one zone is equivalent to a rigid-body transformation of another. Our algorithms can also discover rigid-body motion of meshes in time-series data. Next, we describe a data model where we can utilize the results from the discovery process in order to accelerate large data visualizations. Where possible, we replace general curvilinear zones with rectilinear or cylindrical zones. In rigid-body motion cases we replace a time-series of meshes with a transformed mesh object where a reference mesh is dynamically transformed based on a given time value in order to satisfy geometry requests, on demand. The data model enables us to make these substitutions and dynamic transformations transparently with respect to the visualization algorithms. We present results with large data sets where we combine our mesh replacement and transformation techniques with out-of-core paging in order to achieve significant speed-ups in analysis.
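
    Testing whether one zone is a rigid-body transformation of another reduces to the classic orthogonal Procrustes problem, solvable in closed form with the Kabsch algorithm; the sketch below is that standard algorithm, not the paper's discovery code.

      import numpy as np

      def rigid_transform(P, Q):
          # Best-fit rotation R and translation t with Q ~ P @ R.T + t (Kabsch).
          Pc, Qc = P - P.mean(0), Q - Q.mean(0)
          U, _, Vt = np.linalg.svd(Pc.T @ Qc)
          d = np.sign(np.linalg.det(Vt.T @ U.T))     # guard against reflection
          R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
          t = Q.mean(0) - P.mean(0) @ R.T
          return R, t

      P = np.random.default_rng(4).random((100, 3))  # node coordinates, zone 1
      R_true = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 1]], float)
      Q = P @ R_true.T + np.array([5.0, 0.0, 2.0])   # zone 2: rotated + shifted
      R, t = rigid_transform(P, Q)
      print(np.abs(Q - (P @ R.T + t)).max())         # ~0 => rigid-body match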

  7. Combining triggers in HEP data analysis

    International Nuclear Information System (INIS)

    Lendermann, Victor; Herbst, Michael; Krueger, Katja; Schultz-Coulon, Hans-Christian; Stamen, Rainer; Haller, Johannes

    2009-01-01

    Modern high-energy physics experiments collect data using dedicated complex multi-level trigger systems which perform an online selection of potentially interesting events. In general, this selection suffers from inefficiencies. A further loss of statistics occurs when the rate of accepted events is artificially scaled down in order to meet bandwidth constraints. An offline analysis of the recorded data must correct for the resulting losses in order to determine the original statistics of the analysed data sample. This is particularly challenging when data samples recorded by several triggers are combined. In this paper we present methods for the calculation of the offline corrections and study their statistical performance. Implications on building and operating trigger systems are discussed. (orig.)
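
    A common simple correction, though not necessarily the scheme derived in this paper, weights each recorded event by the inverse of the probability that at least one of its satisfied triggers accepted it; the prescale values and trigger names here are invented.

      # A trigger with prescale N accepts only every N-th event that satisfies
      # its condition, so accepted events must be reweighted in the analysis.
      prescales = {"trigA": 1, "trigB": 20}

      def event_weight(satisfied):
          # satisfied: triggers whose condition the event fulfilled.
          # w = 1 / (1 - prod_i (1 - 1/N_i)) over the satisfied triggers.
          p_none = 1.0
          for trig in satisfied:
              p_none *= 1.0 - 1.0 / prescales[trig]
          return 1.0 / (1.0 - p_none)

      print(event_weight(["trigA"]))            # 1.0: unprescaled trigger
      print(event_weight(["trigB"]))            # 20.0
      print(event_weight(["trigA", "trigB"]))   # 1.0: no double counting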

  8. Combining triggers in HEP data analysis

    Energy Technology Data Exchange (ETDEWEB)

    Lendermann, Victor; Herbst, Michael; Krueger, Katja; Schultz-Coulon, Hans-Christian; Stamen, Rainer [Heidelberg Univ. (Germany). Kirchhoff-Institut fuer Physik; Haller, Johannes [Hamburg Univ. (Germany). Institut fuer Experimentalphysik

    2009-01-15

    Modern high-energy physics experiments collect data using dedicated complex multi-level trigger systems which perform an online selection of potentially interesting events. In general, this selection suffers from inefficiencies. A further loss of statistics occurs when the rate of accepted events is artificially scaled down in order to meet bandwidth constraints. An offline analysis of the recorded data must correct for the resulting losses in order to determine the original statistics of the analysed data sample. This is particularly challenging when data samples recorded by several triggers are combined. In this paper we present methods for the calculation of the offline corrections and study their statistical performance. Implications on building and operating trigger systems are discussed. (orig.)

  9. Large Scale EOF Analysis of Climate Data

    Science.gov (United States)

    Prabhat, M.; Gittens, A.; Kashinath, K.; Cavanaugh, N. R.; Mahoney, M.

    2016-12-01

    We present a distributed approach towards extracting EOFs from 3D climate data. We implement the method in Apache Spark, and process multi-TB sized datasets on O(1000-10,000) cores. We apply this method to latitude-weighted ocean temperature data from CFSR, a 2.2 terabyte-sized data set comprising ocean and subsurface reanalysis measurements collected at 41 levels in the ocean, at 6 hour intervals over 31 years. We extract the first 100 EOFs of this full data set and compare to the EOFs computed simply on the surface temperature field. Our analyses provide evidence of Kelvin and Rossby waves and components of large-scale modes of oscillation including the ENSO and PDO that are not visible in the usual SST EOFs. Further, they provide information on the most influential parts of the ocean, such as the thermocline, that exist below the surface. Work is ongoing to understand the factors determining the depth-varying spatial patterns observed in the EOFs. We will experiment with weighting schemes to appropriately account for the differing depths of the observations. We also plan to apply the same distributed approach to the analysis of 3D atmospheric climate data sets, including multiple variables. Because the atmosphere changes on a quicker time-scale than the ocean, we expect that the results will demonstrate an even greater advantage to computing 3D EOFs in lieu of 2D EOFs.
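
    On a single node the same computation reduces to an SVD of the latitude-weighted anomaly matrix; the serial numpy sketch below (on a toy random field) shows what the Spark implementation distributes.

      import numpy as np

      rng = np.random.default_rng(5)
      temp = rng.normal(size=(120, 18, 36))      # (time, lat, lon) toy field
      lats = np.linspace(-85, 85, 18)

      w = np.sqrt(np.cos(np.radians(lats)))[None, :, None]   # area weights
      anom = (temp - temp.mean(axis=0)) * w                  # weighted anomalies

      X = anom.reshape(temp.shape[0], -1)        # time x space matrix
      U, S, Vt = np.linalg.svd(X, full_matrices=False)
      eofs = Vt[:10].reshape(-1, 18, 36)         # leading spatial EOF patterns
      pcs = U[:, :10] * S[:10]                   # corresponding PC time series
      print(S[:10] ** 2 / (S ** 2).sum())        # explained variance fractions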

  10. Point Information Gain and Multidimensional Data Analysis

    Directory of Open Access Journals (Sweden)

    Renata Rychtáriková

    2016-10-01

    We generalize the point information gain (PIG) and derived quantities, i.e., point information gain entropy (PIE) and point information gain entropy density (PIED), for the case of the Rényi entropy and simulate the behavior of PIG for typical distributions. We also use these methods for the analysis of multidimensional datasets. We demonstrate the main properties of PIE/PIED spectra for the real data with the examples of several images and discuss further possible utilizations in other fields of data processing.
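
    In one common formulation, the point information gain of a value is the change in Rényi entropy when a single occurrence of that value is removed from the data; whether this matches the paper's exact sign convention and estimator is an assumption of this sketch.

      import numpy as np

      def renyi_entropy(counts, alpha):
          # H_alpha = log(sum_i p_i^alpha) / (1 - alpha), for alpha != 1.
          p = counts / counts.sum()
          p = p[p > 0]
          return np.log(np.sum(p ** alpha)) / (1.0 - alpha)

      def point_information_gain(counts, value, alpha=2.0):
          # Entropy change from removing one occurrence of `value`.
          reduced = counts.copy()
          reduced[value] -= 1
          return renyi_entropy(reduced, alpha) - renyi_entropy(counts, alpha)

      counts = np.array([50, 30, 15, 5])     # histogram of a discrete signal
      for v in range(len(counts)):
          print(v, point_information_gain(counts, v))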

  11. New space sensor and mesoscale data analysis

    Science.gov (United States)

    Hickey, John S.

    1987-01-01

    The developed Earth Science and Application Division (ESAD) system/software provides the research scientist with the following capabilities: an extensive data base management capability to convert various experiment data types into a standard format; an interactive analysis and display package (AVE80); an interactive imaging/color graphics capability utilizing the Apple III and IBM PC workstations integrated into the ESAD computer system; and local and remote smart-terminal capability which provides color video, graphics, and Laserjet output. Recommendations for updating and enhancing the performance of the ESAD computer system are listed.

  12. Reaction kinetic analysis of reactor surveillance data

    Energy Technology Data Exchange (ETDEWEB)

    Yoshiie, T., E-mail: yoshiie@rri.kyoto-u.ac.jp [Research Reactor Institute, Kyoto University, Kumatori-cho, Sennan-gun, Osaka-fu 590-0494 (Japan); Kinomura, A. [Research Reactor Institute, Kyoto University, Kumatori-cho, Sennan-gun, Osaka-fu 590-0494 (Japan); Nagai, Y. [The Oarai Center, Institute for Materials Research, Tohoku University, Oarai, Ibaraki 311-1313 (Japan)

    2017-02-15

    In the reactor pressure vessel surveillance data of a European-type pressurized water reactor (low-Cu steel), it was found that the concentration of matrix defects was very high, and a large number of precipitates existed. In this study, defect structure evolution obtained from surveillance data was simulated by reaction kinetic analysis using 15 rate equations. The saturation of precipitation and the growth of loops were simulated, but it was not possible to explain the increase in DBTT on the basis of the defect structures. The sub-grain boundary segregation of solutes was discussed for the origin of the DBTT increase.
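
    Such a reaction kinetic analysis integrates coupled rate equations for the defect species; the toy two-species sketch below (the surveillance analysis used 15 equations, and the rate constants here are invented) shows the pattern with scipy.

      import numpy as np
      from scipy.integrate import solve_ivp

      g, k_rec, k_sink = 1e-6, 1e2, 1e-1    # illustrative rate constants

      def rates(t, y):
          # Vacancies and interstitials: produced at rate g, lost by mutual
          # recombination and by absorption at fixed sinks.
          cv, ci = y
          return [g - k_rec * cv * ci - k_sink * cv,
                  g - k_rec * cv * ci - k_sink * ci]

      sol = solve_ivp(rates, (0.0, 1e4), [0.0, 0.0], method="LSODA")
      print(sol.y[:, -1])                   # late-time defect concentrations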

  13. Insider threat to secure facilities: data analysis

    International Nuclear Information System (INIS)

    1980-01-01

    Three data sets drawn from industries that have experienced internal security breaches are analyzed. The industries and the insider security breaches are considered analogous in one or more respects to insider threats potentially confronting managers in the nuclear industry. The three data sets are: bank fraud and embezzlement (BF and E), computer-related crime, and drug theft from drug manufacturers and distributors. A careful analysis by both descriptive and formal statistical techniques permits certain general conclusions on the internal threat to secure industries to be drawn. These conclusions are discussed and related to the potential insider threat in the nuclear industry. 49 tabs

  14. Population genetic analysis of ascertained SNP data

    Directory of Open Access Journals (Sweden)

    Nielsen Rasmus

    2004-03-01

    The large single nucleotide polymorphism (SNP) typing projects have provided an invaluable data resource for human population geneticists. Almost all of the available SNP loci, however, have been identified through a SNP discovery protocol that will influence the allelic distributions in the sampled loci. Standard methods for population genetic analysis based on the available SNP data will, therefore, be biased. This paper discusses the effect of this ascertainment bias on allelic distributions and on methods for quantifying linkage disequilibrium and estimating demographic parameters. Several recently developed methods for correcting for the ascertainment bias will also be discussed.
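
    The bias is easy to demonstrate by simulation: condition loci on being polymorphic in a small discovery panel and compare the resulting allele-frequency spectrum with the unascertained one. This toy sketch illustrates the effect only; it is not the paper's correction method.

      import numpy as np

      rng = np.random.default_rng(6)
      n_loci, panel, sample = 100000, 4, 40

      freqs = rng.beta(0.2, 0.2, n_loci)         # U-shaped frequency prior
      disc = rng.binomial(panel, freqs)          # derived alleles in the panel
      kept = (disc > 0) & (disc < panel)         # polymorphic in the panel

      counts = rng.binomial(sample, freqs)       # counts in the study sample
      print("mean frequency, all loci:    ", (counts / sample).mean())
      print("mean frequency, ascertained: ", (counts[kept] / sample).mean())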

  15. Metric learning for DNA microarray data analysis

    International Nuclear Information System (INIS)

    Takeuchi, Ichiro; Nakagawa, Masao; Seto, Masao

    2009-01-01

    In many microarray studies, gene set selection is an important preliminary step for subsequent main task such as tumor classification, cancer subtype identification, etc. In this paper, we investigate the possibility of using metric learning as an alternative to gene set selection. We develop a simple metric learning algorithm aiming to use it for microarray data analysis. Exploiting a property of the algorithm, we introduce a novel approach for extending the metric learning to be adaptive. We apply the algorithm to previously studied microarray data on malignant lymphoma subtype identification.
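
    One simple flavor of metric learning, not necessarily the authors' algorithm, learns nonnegative per-gene weights for a diagonal metric by shrinking same-class pair distances and stretching different-class ones; everything below is an illustrative toy.

      import numpy as np

      def learn_diag_metric(X, y, lr=0.05, epochs=100):
          # d(a, b) = sum_j w_j (a_j - b_j)^2 with w_j >= 0; the weights are
          # rescaled each step to keep a fixed overall scale.
          n, d = X.shape
          w = np.ones(d)
          for _ in range(epochs):
              grad = np.zeros(d)
              for i in range(n):
                  for j in range(i + 1, n):
                      sq = (X[i] - X[j]) ** 2
                      grad += sq if y[i] == y[j] else -sq
              w = np.clip(w - lr * grad / (n * n), 0.0, None)
              w *= d / max(w.sum(), 1e-12)
          return w

      rng = np.random.default_rng(8)
      X = np.vstack([rng.normal([0, 0], 1, (20, 2)),    # class 0
                     rng.normal([3, 0], 1, (20, 2))])   # class 1
      y = np.array([0] * 20 + [1] * 20)
      print(learn_diag_metric(X, y))   # weight on the informative gene dominates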

  16. Simulation data analysis by virtual reality system

    International Nuclear Information System (INIS)

    Ohtani, Hiroaki; Mizuguchi, Naoki; Shoji, Mamoru; Ishiguro, Seiji; Ohno, Nobuaki

    2010-01-01

    We introduce new software for analysis of time-varying simulation data and new approach for contribution of simulation to experiment by virtual reality (VR) technology. In the new software, the objects of time-varying field are visualized in VR space and the particle trajectories in the time-varying electromagnetic field are also traced. In the new approach, both simulation results and experimental device data are simultaneously visualized in VR space. These developments enhance the study of the phenomena in plasma physics and fusion plasmas. (author)

  17. Model Selection in Data Analysis Competitions

    DEFF Research Database (Denmark)

    Wind, David Kofoed; Winther, Ole

    2014-01-01

    The use of data analysis competitions for selecting the most appropriate model for a problem is a recent innovation in the field of predictive machine learning. Two of the most well-known examples of this trend was the Netflix Competition and recently the competitions hosted on the online platform...... performers from Kaggle and use previous personal experiences from competing in Kaggle competitions. The stated hypotheses about feature engineering, ensembling, overfitting, model complexity and evaluation metrics give indications and guidelines on how to select a proper model for performing well...... Kaggle. In this paper, we will state and try to verify a set of qualitative hypotheses about predictive modelling, both in general and in the scope of data analysis competitions. To verify our hypotheses we will look at previous competitions and their outcomes, use qualitative interviews with top...

  18. Airborne Data Analysis/Monitor System

    Science.gov (United States)

    Stephison, D. B.

    1981-01-01

    An Airborne Data Analysis/Monitor System (ADAMS), a ROLM 1666 computer-based system installed onboard test airplanes used during experimental testing, is evaluated. In addition to the 1666 computer, the ADAMS hardware includes a DDC System 90 fixed-head disk and a Miltape DD400 floppy disk. Boeing designed a DMA interface to the data acquisition system and an intelligent terminal to reduce system overhead and simplify operator commands. The ADAMS software includes RMX/RTOS; both ROLM FORTRAN and assembly language are used. The ADAMS provides real-time displays that enable onboard test engineers to make rapid decisions about test conduct, thus reducing the cost and time required to certify new model airplanes and improving the quality of data derived from the tests, leading to more rapid development of improvements resulting in quieter, safer, and more efficient airplanes. The availability of airborne data processing removes most of the weather and geographical restrictions imposed by telemetered flight test data systems. A data base is maintained to describe the airplane, the data acquisition system, the type of testing, and the conditions under which the test is performed.

  19. Artificial Intelligence techniques for big data analysis

    OpenAIRE

    Aditya Khatri

    2017-01-01

    During my stay in Salamanca (Spain), I was fortunate enough to participate in the BISITE Research Group of the University of Salamanca. The University of Salamanca is the oldest university in Spain, and in 2018 it celebrates its 8th centenary. As a computer science researcher, I participated in one of the many active international projects of the research group, especially in big data analysis using Artificial Intelligence (AI) techniques. AI is one of BISITE's main lines of rese...

  20. The CMS Data Analysis School Experience

    Energy Technology Data Exchange (ETDEWEB)

    De Filippis, N. [INFN, Bari; Bauerdick, L. [Fermilab; Chen, J. [Taiwan, Natl. Taiwan U.; Gallo, E. [DESY; Klima, B. [Fermilab; Malik, S. [Puerto Rico U., Mayaguez; Mulders, M. [CERN; Palla, F. [INFN, Pisa; Rolandi, G. [Pisa, Scuola Normale Superiore

    2017-11-21

    The CMS Data Analysis School is an official event organized by the CMS Collaboration to teach students and post-docs how to perform a physics analysis. The school is coordinated by the CMS schools committee and was first implemented at the LHC Physics Center at Fermilab in 2010. As part of the training, there are a number of “short” exercises on physics object reconstruction and identification, Monte Carlo simulation, and statistical analysis, which are followed by “long” exercises based on physics analyses. Some of the long exercises go beyond the current state of the art of the corresponding CMS analyses. This paper describes the goals of the school, the preparations for a school, the structure of the training, and student satisfaction with the experience as measured by surveys.

  1. Integral data analysis for resonance parameters determination

    International Nuclear Information System (INIS)

    Larson, N.M.; Leal, L.C.; Derrien, H.

    1997-09-01

    Neutron time-of-flight experiments have long been used to determine resonance parameters. Those resonance parameters have then been used in calculations of integral quantities such as Maxwellian averages or resonance integrals, and the results of those calculations in turn have been used as a criterion for the acceptability of the resonance analysis. However, the calculations were inadequate because covariances on the parameter values were not included. In this report an effort to correct that deficiency is documented: the R-matrix analysis code SAMMY has been modified (1) to include integral quantities of importance directly within the resonance parameter analysis and (2) to determine the best fit to both differential (microscopic) and integral (macroscopic) data simultaneously. This modification was implemented because it is expected to have an impact on the intermediate-energy range that is important for criticality safety applications.

  2. Bayesian data analysis tools for atomic physics

    Science.gov (United States)

    Trassinelli, Martino

    2017-10-01

    We present an introduction to some concepts of Bayesian data analysis in the context of atomic physics. Starting from the basic rules of probability, we present Bayes' theorem and its applications. In particular, we discuss how to calculate simple and joint probability distributions and the Bayesian evidence, a model-dependent quantity that allows one to assign probabilities to different hypotheses from the analysis of the same data set. To give some practical examples, these methods are applied to two concrete cases. In the first example, the presence or absence of a satellite line in an atomic spectrum is investigated. In the second example, we determine the most probable model among a set of possible profiles from the analysis of a statistically poor spectrum. We also show how to calculate the probability distribution of the main spectral component without having to determine the spectrum model uniquely. For these two studies, we use the program Nested_fit to calculate the different probability distributions and other related quantities. Nested_fit is a Fortran90/Python code developed over recent years for the analysis of atomic spectra. As indicated by the name, it is based on the nested sampling algorithm, which is presented in detail together with the program itself.
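
    A toy version of the first example (not Nested_fit itself, whose nested-sampling machinery is far more general): the Bayesian evidence for "satellite line present" is obtained by marginalizing the likelihood over a flat amplitude prior on a grid, and compared with the "no line" hypothesis.

        import numpy as np

        # Hypothetical spectrum: a possible Gaussian satellite line plus noise.
        x = np.linspace(-5, 5, 41)
        sigma = 1.0
        rng = np.random.default_rng(1)
        data = 0.8 * np.exp(-0.5 * x ** 2) + rng.normal(0.0, sigma, x.size)

        def log_likelihood(amplitude):
            model = amplitude * np.exp(-0.5 * x ** 2)
            return -0.5 * np.sum((data - model) ** 2) / sigma ** 2

        # M0: no line (amplitude fixed at zero).
        log_z0 = log_likelihood(0.0)

        # M1: line present, flat prior on the amplitude in [0, 2]; the evidence
        # is the prior-weighted average of the likelihood, integrated on a grid.
        amps = np.linspace(0.0, 2.0, 201)
        log_l = np.array([log_likelihood(a) for a in amps])
        m = log_l.max()
        log_z1 = m + np.log(np.trapz(np.exp(log_l - m), amps) / (amps[-1] - amps[0]))

        print("log Bayes factor M1 vs M0:", log_z1 - log_z0)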

  3. Direct analysis of quantal radiation response data

    International Nuclear Information System (INIS)

    Thames, H.D. Jr.; Rozell, M.E.; Tucker, S.L.; Ang, K.K.; Travis, E.L.; Fisher, D.R.

    1986-01-01

    A direct analysis is proposed for quantal (all-or-nothing) responses to fractionated radiation and endpoint-dilution assays of cell survival. As opposed to two-step methods such as the reciprocal-dose technique, in which ED50 values are first estimated for different fractionation schemes and then fit (as reciprocals) against dose per fraction, all raw data are included in a single maximum-likelihood treatment. The method accommodates variations such as short-interval fractionation regimens designed to determine tissue repair kinetics, tissue response to continuous exposures, and data obtained using endpoint-dilution assays of cell survival after fractionated doses. Monte Carlo techniques were used to compare the direct and reciprocal-dose methods for analysis of small-scale and large-scale studies of response to fractionated doses. Both methods tended toward biased estimates in the analysis of small-scale (3 fraction numbers) studies. The α/β ratios showed less scatter when estimated by the direct method. The 95% confidence intervals determined by the direct method were more appropriate than those determined by reciprocal-dose analysis, for which 18% (small-scale study) or 8% (large-scale study) of the confidence intervals did not include the 'true' value of α/β. (author)
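
    A sketch of the direct approach under simplifying assumptions (a logistic rather than the paper's exact link, and invented illustrative numbers): all fractionation schemes enter one binomial likelihood that is linear in total dose D and in D times dose per fraction, so the α/β ratio is read off the fitted coefficients.

        import numpy as np
        from scipy.optimize import minimize
        from scipy.special import expit

        # Hypothetical quantal data: n fractions, dose/fraction d (Gy),
        # m animals tested and r responders per scheme.
        n = np.array([1, 2, 4, 8, 16])
        d = np.array([12.0, 7.0, 4.5, 2.8, 1.8])
        m = np.array([20, 20, 20, 20, 20])
        r = np.array([18, 15, 12, 9, 7])

        def neg_log_lik(params):
            b0, b1, b2 = params
            D = n * d                             # total dose
            p = expit(b0 + b1 * D + b2 * D * d)   # effect ~ alpha*D + beta*D*d
            p = np.clip(p, 1e-9, 1 - 1e-9)
            return -np.sum(r * np.log(p) + (m - r) * np.log(1.0 - p))

        fit = minimize(neg_log_lik, x0=[-1.0, 0.05, 0.005], method="Nelder-Mead")
        b0, b1, b2 = fit.x
        print("alpha/beta estimate (Gy):", b1 / b2)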

  4. [Technologies for Complex Intelligent Clinical Data Analysis].

    Science.gov (United States)

    Baranov, A A; Namazova-Baranova, L S; Smirnov, I V; Devyatkin, D A; Shelmanov, A O; Vishneva, E A; Antonova, E V; Smirnov, V I

    2016-01-01

    The paper presents a system for the intelligent analysis of clinical information. The authors describe methods implemented in the system for clinical information retrieval, intelligent diagnostics of chronic diseases, ranking the importance of patient features, and detection of hidden dependencies between features. Results of the experimental evaluation of these methods are also presented. Healthcare facilities generate a large flow of both structured and unstructured data which contain important information about patients. Test results are usually retained as structured data, but some data are retained in the form of natural-language texts (medical history, the results of physical examination, and the results of other examinations, such as ultrasound, ECG or X-ray studies). Many tasks arising in clinical practice can be automated by applying methods for the intelligent analysis of the accumulated structured and unstructured data, leading to improvement of healthcare quality. The aim of the work is the creation of a complex system for intelligent data analysis in a multi-disciplinary pediatric center. The authors propose methods for information extraction from clinical texts in Russian. The methods are carried out on the basis of deep linguistic analysis. They retrieve terms for diseases, symptoms, areas of the body and drugs. The methods can recognize additional attributes such as "negation" (indicates that the disease is absent), "no patient" (indicates that the disease refers to the patient's family member, but not to the patient), "severity of illness", "disease course", and "body region to which the disease refers". The authors use a set of handcrafted templates and various techniques based on machine learning to retrieve information using a medical thesaurus. The extracted information is used to solve the problem of automatic diagnosis of chronic diseases. A machine learning method for the classification of patients with similar nosology and a method for determining the most informative patient features are also described.

  5. Entropy analysis of floating car data systems

    Directory of Open Access Journals (Sweden)

    F. Gössel

    2004-01-01

    The knowledge of the actual traffic state is a basic prerequisite of modern traffic telematic systems. Floating Car Data (FCD) systems are becoming more and more important for the provision of up-to-date and reliable traffic data. In these systems the vehicle velocity is the original variable for the evaluation of the current traffic condition. As real FCD systems operate under conditions of limited transmission and processing capacity, the analysis of the original variable, vehicle speed, is of special interest. Entropy considerations are especially useful for the deduction of fundamental restrictions and limitations. The paper analyses velocity-time profiles by means of information entropy. It emphasizes the quantification of the information content of velocity-time profiles and the discussion of entropy dynamics in velocity-time profiles. Investigations are based on empirical data derived during field trials. The analysis of entropy dynamics is carried out in two different ways: on one hand, velocity differences within a certain interval of time are used; on the other hand, the transinformation between velocities at certain time distances is evaluated. One important result is an optimal sample rate for the detection of velocity data in FCD systems. The influence of spatial segmentation and of different states of traffic is discussed.
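
    A sketch of the transinformation calculation on a synthetic velocity-time profile (the field-trial data are not reproduced here): the mutual information between v(t) and v(t+τ) is estimated from histograms, and the lag at which it decays suggests how sparsely an FCD system may sample without losing information.

        import numpy as np

        def transinformation(v, lag, bins=32):
            # I(X;Y) = H(X) + H(Y) - H(X,Y) for X = v(t), Y = v(t+lag), in bits.
            x, y = v[:-lag], v[lag:]
            joint, _, _ = np.histogram2d(x, y, bins=bins)
            pxy = joint / joint.sum()
            px, py = pxy.sum(axis=1), pxy.sum(axis=0)
            def h(p):
                p = p[p > 0]
                return -np.sum(p * np.log2(p))
            return h(px) + h(py) - h(pxy.ravel())

        # Synthetic velocity profile (m/s), 1 Hz sampling over one hour.
        rng = np.random.default_rng(0)
        v = 15.0 + np.cumsum(rng.normal(0.0, 0.3, 3600))
        for lag in (1, 10, 60, 300):
            print(f"lag {lag:4d} s: {transinformation(v, lag):.2f} bit")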

  6. Advances in Mössbauer data analysis

    Science.gov (United States)

    de Souza, Paulo A.

    1998-08-01

    The whole Mössbauer community has generated a huge amount of data in several fields of human knowledge since the first publication of Rudolf Mössbauer. Interlaboratory measurements of the same substance may result in minor differences in the Mössbauer parameters (MP) of isomer shift, quadrupole splitting and internal magnetic field. Therefore, a conventional data bank of published MP is of limited help in the identification of substances. An exact data-bank search cannot differentiate between values of Mössbauer parameters that lie within the experimental errors (e.g., IS = 0.22 mm/s and IS = 0.23 mm/s), although physically both values may be considered the same. An artificial neural network (ANN) is able to identify a substance and its crystalline structure from measured MP, and slight variations in the parameters do not represent an obstacle for the ANN identification. A barrier to the popularization of Mössbauer spectroscopy as an analytical technique is the absence of fully automated equipment, since the analysis of a Mössbauer spectrum is normally time-consuming and requires a specialist. In this work, the fitting process of a Mössbauer spectrum was completely automated through the use of genetic algorithms and fuzzy logic. Both software and hardware systems were implemented, resulting in a fully automated Mössbauer data-analysis system. The developed system will be presented.
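
    A sketch of tolerance-aware identification, with illustrative reference values and assumed experimental errors (a far simpler stand-in for the neural network described above): a measured parameter set matches a phase when every parameter lies within its error band.

        import numpy as np

        # Illustrative reference parameters: (isomer shift mm/s, quadrupole
        # splitting mm/s, hyperfine field T); values and errors are assumptions.
        reference = {
            "hematite":    (0.37, -0.20, 51.8),
            "magnetite A": (0.26,  0.00, 49.0),
            "magnetite B": (0.67,  0.00, 46.0),
        }
        errors = np.array([0.03, 0.05, 0.5])

        def identify(measured):
            measured = np.asarray(measured)
            best, best_score = "unidentified", 1.0
            for phase, params in reference.items():
                # Normalized worst-case deviation; <= 1 means all parameters
                # agree within the assumed experimental errors.
                score = np.max(np.abs((measured - np.asarray(params)) / errors))
                if score <= best_score:
                    best, best_score = phase, score
            return best

        print(identify((0.36, -0.18, 51.5)))   # -> hematite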

  7. Information Retrieval Using Hadoop Big Data Analysis

    Science.gov (United States)

    Motwani, Deepak; Madan, Madan Lal

    This paper concerns big data analysis, the cognitive operation of probing huge amounts of information in an attempt to uncover unseen patterns. Through big data analytics, organizations in the public and private sectors have made a strategic determination to turn big data into competitive benefit. The primary occupation of extracting value from big data gives rise to a process applied to pull information from multiple different sources; this process is known as extract, transform and load (ETL). The approach of this paper is to extract information from log files and research papers, which reduces the effort needed for pattern finding and summarization of documents from several positions. The work helps in better understanding basic Hadoop concepts and improves the user experience for research. In this paper, we propose an approach for analysing log files using Hadoop to find concise information, which is useful and time-saving. Our proposed approach will be applied to different research papers in a specific domain to obtain summarized content for further improvement and the creation of new content.
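
    A pure-Python sketch of the map/reduce logic applied to log lines (hypothetical log format; on a real cluster the same two phases would run as Hadoop jobs over files in HDFS):

        from collections import defaultdict

        logs = [
            "2017-03-01 10:02:11 ERROR disk full on node7",
            "2017-03-01 10:02:15 INFO  job 42 started",
            "2017-03-01 10:03:02 ERROR disk full on node7",
        ]

        def map_phase(line):
            yield (line.split()[2], 1)      # emit (log level, 1) per line

        counts = defaultdict(int)
        for line in logs:                   # reduce phase: sum values per key
            for key, value in map_phase(line):
                counts[key] += value
        print(dict(counts))                 # {'ERROR': 2, 'INFO': 1}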

  8. ACES MWL data analysis center at SYRTE

    Science.gov (United States)

    Meynadier, F.; Delva, P.; le Poncin-Lafitte, C.; Guerlin, C.; Laurent, P.; Wolf, P.

    2017-12-01

    The ACES-PHARAO mission aims at operating a cold-atom caesium clock on board the International Space Station and performing two-way time transfer with ground terminals, in order to allow highly accurate and stable comparisons of its internal timescale with those of various metrology institutes. Scientific goals in fundamental physics include tests of the gravitational redshift with unprecedented accuracy and a search for a violation of Lorentz local invariance. As launch comes closer, we are getting ready to process the data expected from the ACES Microwave Link (MWL) once it is on board the International Space Station. Several hurdles have been cleared in our software in the past months, as we managed to implement algorithms that reach the target accuracy for ground/space desynchronisation measurement. I will present the current status of the data analysis preparation, as well as the activities that will take place at SYRTE in order to set up its data processing center.

  9. Mars Science Laboratory Heatshield Flight Data Analysis

    Science.gov (United States)

    Mahzari, Milad; White, Todd

    2017-01-01

    NASA's Mars Science Laboratory (MSL), which landed the Curiosity rover on the surface of Mars on August 5th, 2012, was the largest and heaviest Mars entry vehicle, representing a significant advancement in planetary entry, descent and landing capability. Hypersonic flight performance data were collected using MSL's on-board sensors, called the Mars Entry, Descent and Landing Instrumentation (MEDLI). This talk will give an overview of the MSL entry and a description of the MEDLI sensors. Observations from the flight data will be examined, followed by a discussion of analysis efforts to reconstruct surface heating from the heatshield's in-depth temperature measurements. Finally, a brief overview of the MEDLI2 instrumentation, which will fly on NASA's Mars2020 mission, will be presented with a discussion of how lessons learned from the MEDLI data affected the design of the MEDLI2 instrumentation.

  10. Integrative analysis of metabolomics and transcriptomics data

    DEFF Research Database (Denmark)

    Brink-Jensen, Kasper; Bak, Søren; Jørgensen, Kirsten

    2013-01-01

    The abundance of high-dimensional measurements in the form of gene expression and mass spectroscopy calls for models to elucidate the underlying biological system. For widely studied organisms like yeast, it is possible to incorporate prior knowledge from a variety of databases, an approach used ... (LC-MS) measurements from the same samples, to identify genes controlling the production of metabolites. Due to the high dimensionality of both LC-MS and DNA microarray data, dimension reduction and variable selection are key elements of the analysis. Our proposed approach starts by identifying the basis functions ("building blocks") that constitute the output from a mass spectrometry experiment. Subsequently, the weights of these basis functions are related to the observations from the corresponding gene expression data in order to identify which genes are associated with specific patterns seen in the metabolite data...

  11. Statistical Analysis of Big Data on Pharmacogenomics

    Science.gov (United States)

    Fan, Jianqing; Liu, Han

    2013-01-01

    This paper discusses statistical methods for estimating complex correlation structure from large pharmacogenomic datasets. We selectively review several prominent statistical methods: estimating a large covariance matrix for understanding correlation structure, the inverse covariance matrix for network modeling, large-scale simultaneous tests for selecting significantly differentially expressed genes, proteins and genetic markers for complex diseases, and high-dimensional variable selection for identifying molecules important for understanding molecular mechanisms in pharmacogenomics. Their applications to gene network estimation and biomarker selection are used to illustrate the methodological power. Several new challenges of big data analysis, including complex data distribution, missing data, measurement error, spurious correlation, endogeneity, and the need for robust statistical methods, are also discussed. PMID:23602905
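
    A sketch of one ingredient, large-scale simultaneous testing, on simulated expression data (the Benjamini-Hochberg step-up procedure shown here is a standard choice, not necessarily the authors'):

        import numpy as np
        from scipy import stats

        rng = np.random.default_rng(3)
        control = rng.normal(0.0, 1.0, (1000, 10))   # 1000 genes x 10 samples
        treated = rng.normal(0.0, 1.0, (1000, 10))
        treated[:50] += 1.5                           # 50 truly shifted genes

        _, pvals = stats.ttest_ind(control, treated, axis=1)

        def benjamini_hochberg(p, q=0.05):
            # Step-up FDR control: find the largest k with p_(k) <= q*k/m.
            m = len(p)
            order = np.argsort(p)
            passed = p[order] <= q * np.arange(1, m + 1) / m
            k = passed.nonzero()[0].max() + 1 if passed.any() else 0
            selected = np.zeros(m, dtype=bool)
            selected[order[:k]] = True
            return selected

        print("genes declared significant:", benjamini_hochberg(pvals).sum())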

  12. Component fragilities - data collection, analysis and interpretation

    International Nuclear Information System (INIS)

    Bandyopadhyay, K.K.; Hofmayer, C.H.

    1986-01-01

    As part of the component fragility research program sponsored by the US Nuclear Regulatory Commission, BNL is involved in establishing seismic fragility levels for various nuclear power plant equipment with emphasis on electrical equipment, by identifying, collecting and analyzing existing test data from various sources. BNL has reviewed approximately seventy test reports to collect fragility or high level test data for switchgears, motor control centers and similar electrical cabinets, valve actuators and numerous electrical and control devices of various manufacturers and models. Through a cooperative agreement, BNL has also obtained test data from EPRI/ANCO. An analysis of the collected data reveals that fragility levels can best be described by a group of curves corresponding to various failure modes. The lower bound curve indicates the initiation of malfunctioning or structural damage, whereas the upper bound curve corresponds to overall failure of the equipment based on known failure modes occurring separately or interactively. For some components, the upper and lower bound fragility levels are observed to vary appreciably depending upon the manufacturers and models. An extensive amount of additional fragility or high level test data exists. If completely collected and properly analyzed, the entire data bank is expected to greatly reduce the need for additional testing to establish fragility levels for most equipment

  13. Validation of Fourier analysis of videokeratographic data.

    Science.gov (United States)

    Sideroudi, Haris; Labiris, Georgios; Ditzel, Fienke; Tsaragli, Efi; Georgatzoglou, Kimonas; Siganos, Haralampos; Kozobolis, Vassilios

    2017-06-15

    The aim was to assess the repeatability of Fourier transform analysis of videokeratographic data using the Pentacam in normal (CG), keratoconic (KC) and post-CXL (CXL) corneas. This was a prospective, clinic-based, observational study. One randomly selected eye from each study participant was included in the analysis: 62 normal eyes (CG group) and 33 keratoconus eyes (KC group), while 34 eyes which had already received CXL treatment formed the CXL group. Fourier analysis of keratometric data was obtained using the Pentacam by two different operators within each of two sessions. Precision, repeatability and the intraclass correlation coefficient (ICC) were calculated to evaluate intrasession and intersession repeatability for the following parameters: spherical component (SphRmin, SphEcc), maximum decentration (Max Dec), regular astigmatism, and irregularity (Irr). Bland-Altman analysis was used to assess interobserver repeatability. All parameters were found to be repeatable, reliable and reproducible in all groups. The best intrasession and intersession repeatability and reliability were detected for the SphRmin, SphEcc and Max Dec parameters for both operators, using the ICC (intrasession: ICC > 98%, intersession: ICC > 94.7%) and the within-subject standard deviation. The best precision and lowest range of agreement were found for the SphRmin parameter (CG: 0.05, KC: 0.16, and CXL: 0.2) in all groups, while the lowest repeatability, reliability and reproducibility were detected for the Irr parameter. The Pentacam system provides accurate measurements of Fourier transform keratometric data. A single Pentacam scan will be sufficient for most clinical applications.
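
    A sketch of the underlying decomposition on synthetic ring data (parameter names mirror the abstract; the Pentacam's exact conventions are not reproduced): harmonic 0 of the keratometric power sampled around a mire ring gives the spherical component, harmonic 1 the decentration, harmonic 2 the regular astigmatism, and higher harmonics the irregularity.

        import numpy as np

        # Synthetic keratometric power (dioptres) around one ring:
        # sphere + decentration + regular astigmatism + noise.
        theta = np.linspace(0.0, 2.0 * np.pi, 256, endpoint=False)
        rng = np.random.default_rng(0)
        power = (43.0 + 0.4 * np.cos(theta)
                 + 1.2 * np.cos(2.0 * (theta - 0.3))
                 + 0.05 * rng.normal(size=theta.size))

        c = np.fft.rfft(power) / theta.size
        spherical = c[0].real                      # order 0
        decentration = 2.0 * np.abs(c[1])          # order 1
        regular_astig = 2.0 * np.abs(c[2])         # order 2
        irregularity = 2.0 * np.abs(c[3:]).sum()   # orders >= 3

        print(spherical, decentration, regular_astig, irregularity)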

  14. IAGA Geomagnetic Data Analysis format - Analysis_IAGA

    Science.gov (United States)

    Toader, Victorin-Emilian; Marmureanu, Alexandru

    2013-04-01

    Geomagnetic research involves continuous monitoring of the Earth's magnetic field and software for processing large amounts of data. The Analysis_IAGA program reads and analyses files in the IAGA2002 format used within the INTERMAGNET observatory network. The data are made available free of charge for scientific use by INTERMAGNET (http://www.intermagnet.org/Data_e.php) and NOAA - National Geophysical Data Center (ftp://ftp.ngdc.noaa.gov/wdc/geomagnetism/data/observatories/definitive). The users of this software are those who study geomagnetism or use these data along with other atmospheric or seismic factors. Analysis_IAGA allows the visualization of files for the same station, with the feature of merging data for analyzing longer time intervals. Each file contains data collected within a 24-hour time interval with a sampling rate of 60 seconds or 1 second. Loading a large number of files may be done by dividing the sampling frequency. The program can also combine data files gathered from multiple stations, as long as the sampling rates and time intervals are the same. Different channels may be selected, visualized and filtered individually. Channel properties can be saved and edited in a file. Data can be processed (spectral power, P/F, estimated frequency, Bz/Bx, Bz/By, convolutions and correlations on pairs of axes, discrete differentiation) and visualized along with the original signals on the same panel. With the help of cursors/magnifiers, time differences can be calculated. Each channel can be analyzed separately. Signals can be filtered using bandpass, lowpass and highpass filters (Butterworth, Chebyshev, Inverse Chebyshev, Elliptic, Bessel, Median, ZeroPath). Separate graphics visualize the spectral power, the frequency-spectrum histogram, the evolution of the estimated frequency, P/H, and the spectral power. Adaptive JTFA spectrograms can be selected: CSD (Cone-Shaped Distribution), CWD (Choi-Williams Distribution), Gabor, STFT (short-time Fourier transform), WVD (Wigner-Ville Distribution).
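
    A minimal reader for IAGA2002 text files, under the assumption that header and comment lines end with '|' and that each data row carries a date, a time, a day-of-year and four field components (missing-value codes such as 99999 are not handled in this sketch):

        def read_iaga2002(path):
            # Returns a list of (timestamp string, [four components]) tuples.
            records = []
            with open(path) as f:
                for line in f:
                    line = line.rstrip()
                    if not line or line.endswith("|"):   # header/comment lines
                        continue
                    parts = line.split()
                    date, time_ = parts[0], parts[1]
                    components = [float(v) for v in parts[3:7]]
                    records.append((date + "T" + time_, components))
            return records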

  15. Scientific data base management, time series analysis, and data display

    International Nuclear Information System (INIS)

    Malthan, J.A.; Burgess, D.N.

    1976-01-01

    Since the conclusion of treaties banning the testing of nuclear devices in the atmosphere, data necessary for the continued development of nuclear weapons have in some cases been obtained in field tests in which high-yield chemical explosives were used in lieu of nuclear devices. In 1972 it was decided that a central file of raw data from such tests was necessary. The steps involved in assembling, organizing, and maintaining these data are described under the headings data archive, data directories, data identification system, data management system, and data processing. An example case illustrating the four types of processing requests is shown. Types of data display are summarized. 7 figures, 4 tables

  16. Analysis of gravity data using trend surfaces

    Science.gov (United States)

    Asimopolos, Natalia-Silvia; Asimopolos, Laurentiu

    2013-04-01

    In this paper we have developed algorithms and related software programs for calculating trend surfaces of higher order. These trend-analysis methods, like moving averages, act as filters for geophysical surface data. In particular, we present a few case studies for gravity data and gravity maps. Analysis with polynomial trend surfaces contributes to the recognition, isolation and measurement of trends that can be represented by surfaces or hyper-surfaces (in several dimensions), thus achieving a separation into regional variations and local variations. This separation is achieved by fitting the trend function at different orders. Trend surfaces obtained by regression analysis satisfy the least-squares criterion. The difference between the trend surface and the observed value at a certain point is the residual value. By the least-squares criterion, the sum of squares of these residuals should be minimal. The trend surface is considered the regional or large-scale component, and the residual value is regarded as the local or small-scale component. Removing the regional trend has the effect of highlighting the local components represented by the residual values. Surface and hyper-surface analysis principles apply to trend surfaces in any number of dimensions. For hyper-surfaces we can work with polynomial functions of four or more variables (three space variables plus further variables for the parameters of interest), which have great importance in some applications. In the paper we present the mathematical development of generalized trend surfaces and case studies on gravimetric data. Trend surfaces have the great advantage that the effect of regional anomalies can be expressed as analytic functions. These trend surfaces allow subsequent mathematical processing and interesting generalizations, with the great advantage of working with polynomial functions compared with the original discrete data. For gravity data we estimate the depth of
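
    A least-squares trend-surface sketch on synthetic gravity values (the paper's own programs are not reproduced): the design matrix holds all monomials x^i y^j with i + j up to the chosen order, and the residuals from the fitted surface expose the local anomalies.

        import numpy as np

        def fit_trend_surface(x, y, z, order=2):
            # All monomials x^i * y^j with i + j <= order form the design
            # matrix; the least-squares fit is the regional trend.
            terms = [(i, j) for i in range(order + 1) for j in range(order + 1 - i)]
            A = np.column_stack([x ** i * y ** j for i, j in terms])
            coef, *_ = np.linalg.lstsq(A, z, rcond=None)
            trend = A @ coef
            return coef, trend, z - trend    # residuals = local component

        # Synthetic data: regional gradient plus one local anomaly.
        rng = np.random.default_rng(2)
        x, y = rng.uniform(0, 10, 200), rng.uniform(0, 10, 200)
        z = 5.0 + 0.8 * x - 0.3 * y + np.exp(-((x - 5) ** 2 + (y - 5) ** 2))
        coef, trend, residual = fit_trend_surface(x, y, z, order=1)
        print("largest local anomaly:", residual.max())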

  17. Data analysis and visualization in MCNP trademark

    International Nuclear Information System (INIS)

    Waters, L.S.

    1994-01-01

    There are many situations where the user may wish to go beyond current MCNP capabilities. For example, data produced by the code may need formatting for input into an external graphics package. Limitations on disk space may hinder writing out large PTRAK files. Specialized data analysis routines may be needed to model complex experimental results. One may wish to produce particle histories in a format not currently available in the code. To address these and other similar concerns a new capability in MCNP is being tested. A number of real, integer, logical and character variables describing the current and past characteristics of a particle are made available online to the user in three subroutines. The type of data passed can be controlled by cards in the INP file. The subroutines otherwise are empty, and the user may code in any desired analysis. A new MCNP executable is produced by compiling these subroutines and linking to a library which contains the object files for the rest of the code

  18. Flash Infrared Thermography Contrast Data Analysis Technique

    Science.gov (United States)

    Koshti, Ajay

    2014-01-01

    This paper provides information on an IR Contrast technique that involves extracting normalized contrast versus time evolutions from the flash thermography inspection infrared video data. The analysis calculates thermal measurement features from the contrast evolution. In addition, simulation of the contrast evolution is achieved through calibration on measured contrast evolutions from many flat-bottom holes in the subject material. The measurement features and the contrast simulation are used to evaluate flash thermography data in order to characterize delamination-like anomalies. The thermal measurement features relate to the anomaly characteristics. The contrast evolution simulation is matched to the measured contrast evolution over an anomaly to provide an assessment of the anomaly depth and width which correspond to the depth and diameter of the equivalent flat-bottom hole (EFBH) similar to that used as input to the simulation. A similar analysis, in terms of diameter and depth of an equivalent uniform gap (EUG) providing a best match with the measured contrast evolution, is also provided. An edge detection technique called the half-max is used to measure width and length of the anomaly. Results of the half-max width and the EFBH/EUG diameter are compared to evaluate the anomaly. The information provided here is geared towards explaining the IR Contrast technique. Results from a limited amount of validation data on reinforced carbon-carbon (RCC) hardware are included in this paper.
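
    A sketch of the half-max width measurement on a one-dimensional synthetic contrast profile (the flight-hardware data and the full IR Contrast tooling are not reproduced): the width is the distance between the two outermost points where the profile still reaches half its peak value.

        import numpy as np

        def half_max_width(x, profile):
            # Indices where the contrast is at least half the peak; the width
            # spans the first and last such positions.
            peak = profile.max()
            idx = np.nonzero(profile >= 0.5 * peak)[0]
            return x[idx[-1]] - x[idx[0]]

        x = np.linspace(-10.0, 10.0, 201)    # position across the anomaly (mm)
        profile = np.exp(-x ** 2 / 8.0)      # synthetic normalized contrast
        print("half-max width:", half_max_width(x, profile))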

  19. Detector Simulation: Data Treatment and Analysis Methods

    CERN Document Server

    Apostolakis, J

    2011-01-01

    Detector Simulation in 'Data Treatment and Analysis Methods', part of 'Landolt-Börnstein - Group I Elementary Particles, Nuclei and Atoms: Numerical Data and Functional Relationships in Science and Technology, Volume 21B1: Detectors for Particles and Radiation. Part 1: Principles and Methods'. This document is part of Part 1 'Principles and Methods' of Subvolume B 'Detectors for Particles and Radiation' of Volume 21 'Elementary Particles' of Landolt-Börnstein - Group I 'Elementary Particles, Nuclei and Atoms'. It contains the Section '4.1 Detector Simulation' of Chapter '4 Data Treatment and Analysis Methods' with the content:
    4.1 Detector Simulation
    4.1.1 Overview of simulation
    4.1.1.1 Uses of detector simulation
    4.1.2 Stages and types of simulation
    4.1.2.1 Tools for event generation and detector simulation
    4.1.2.2 Level of simulation and computation time
    4.1.2.3 Radiation effects and background studies
    4.1.3 Components of detector simulation
    4.1.3.1 Geometry modeling
    4.1.3.2 External fields
    4.1.3.3 Intro...

  20. General analysis of HERA II data

    International Nuclear Information System (INIS)

    Schoening, A

    2008-01-01

    A model-independent search for deviations from the Standard Model prediction is performed in e±p collisions. Data collected in the years 2003-2007, corresponding to an integrated luminosity of about 340 pb⁻¹, are analyzed. All event topologies involving isolated electrons, photons, muons, neutrinos and jets with high transverse momenta are investigated in a single analysis. Events are assigned to exclusive classes according to their final state. A statistical algorithm is applied to search for deviations from the Standard Model in the distributions of the scalar sum of transverse momenta or the invariant mass of final-state particles and to quantify their significance. Good agreement with the Standard Model prediction is observed in most of the event classes. No significant deviation is observed in the phase space and in the event topologies covered by this analysis.

  1. Multivariate analysis of data in sensory science

    CERN Document Server

    Naes, T; Risvik, E

    1996-01-01

    The state of the art of multivariate analysis in sensory science is described in this volume. Both methods for aggregated and individual sensory profiles are discussed. Processes and results are presented in such a way that they can be understood not only by statisticians but also by experienced sensory panel leaders and users of sensory analysis. The techniques presented are focused on examples and interpretation rather than on technical aspects, with an emphasis on new and important methods which are possibly not so well known to scientists in the field. Important features of the book are discussions of the relationships among the methods, with a strong accent on the connection between problems and methods. All procedures presented are described in relation to sensory data and not as completely general statistical techniques. Sensory scientists, applied statisticians, chemometricians, those working in consumer science, food scientists and agronomists will find this book of value.

  2. XML-based analysis interface for particle physics data analysis

    International Nuclear Information System (INIS)

    Hu Jifeng; Lu Xiaorui; Zhang Yangheng

    2011-01-01

    This letter describes an XML-based interface and its framework for particle physics data analysis. The interface uses a concise XML syntax to describe the basic tasks of data analysis, event selection, kinematic fitting, particle identification, etc., and a basic processing logic: the next step goes on if and only if the current step succeeds. The framework can perform an analysis without compilation by loading the XML interface file, setting parameters at run time and running dynamically. An analysis coded in XML instead of C++ is easy to understand and use, effectively reduces the workload, and enables users to carry out their analyses quickly. The framework has been developed on the BESIII offline software system (BOSS) with object-oriented C++ programming. The functions required by the regular tasks and the basic processing logic are implemented either with standard modules or by inheriting from modules in BOSS. The interface and its framework have been tested in physics analyses. (authors)
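
    A sketch of the short-circuit logic in Python with a made-up task description (the element and attribute names are hypothetical, not the actual schema of the BOSS interface): each step runs only if the previous one succeeded.

        import xml.etree.ElementTree as ET

        PIPELINE = """
        <analysis>
          <step name="event-selection" min_tracks="4"/>
          <step name="kinematic-fit"   max_chi2="10"/>
          <step name="particle-id"     hypothesis="pion"/>
        </analysis>
        """

        def run_step(step, event):
            # Toy handlers standing in for real analysis modules.
            name = step.get("name")
            if name == "event-selection":
                return event["n_tracks"] >= int(step.get("min_tracks"))
            if name == "kinematic-fit":
                return event["chi2"] <= float(step.get("max_chi2"))
            if name == "particle-id":
                return step.get("hypothesis") in event["pid"]
            return False

        def accept(event):
            for step in ET.fromstring(PIPELINE):
                if not run_step(step, event):   # next step iff this succeeds
                    return False
            return True

        print(accept({"n_tracks": 5, "chi2": 3.2, "pid": {"pion"}}))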

  3. Experimental data base for containment thermalhydraulic analysis

    International Nuclear Information System (INIS)

    Cheng, X.; Bazin, P.; Cornet, P.; Hittner, D.; Jackson, J.D.; Lopez Jimenez, J.; Naviglio, A.; Oriolo, F.; Petzold, H.

    2001-01-01

    This paper describes the joint research project DABASCO, which is supported by the European Community under a cost-shared contract and in which nine European institutions participate. The main objective of the project is to provide a generic experimental data base for the development of physical models and correlations for containment thermalhydraulic analysis. The project consists of seven separate-effects experimental programs which address new, innovative conceptual features, e.g. passive decay heat removal and spray systems. The results of the various stages of the test programs will be assessed by industrial partners in relation to their applicability to reactor conditions.

  4. The analysis of powder diffraction data

    International Nuclear Information System (INIS)

    David, W.I.F.; Harrison, W.T.A.

    1986-01-01

    The paper reviews neutron powder diffraction data analysis, with emphasis on the structural aspects of powder diffraction and the future possibilities afforded by the latest generation of very high resolution neutron and x-ray powder diffractometers. Traditional x-ray powder diffraction techniques are outlined. Structural studies by powder diffraction are discussed with respect to the Rietveld method, and a case study in the Rietveld refinement method and developments of the Rietveld method are described. Finally studies using high resolution powder diffraction at the Spallation Neutron Source, ISIS at the Rutherford Appleton Laboratory are summarized. (U.K.)

  5. Maximum entropy analysis of EGRET data

    DEFF Research Database (Denmark)

    Pohl, M.; Strong, A.W.

    1997-01-01

    EGRET data are usually analysed on the basis of the maximum-likelihood method \cite{ma96} in a search for point sources in excess of a model for the background radiation (e.g. \cite{hu97}). This method depends strongly on the quality of the background model, and thus may have high systematic uncertainties in regions of strong and uncertain background like the Galactic Center region. Here we show images of such regions obtained by the quantified maximum-entropy method. We also discuss a possible further use of MEM in the analysis of problematic regions of the sky.

  6. Hadron polarizability data analysis: GoAT

    Energy Technology Data Exchange (ETDEWEB)

    Stegen, H., E-mail: hkstegen@mta.ca; Hornidge, D. [Mount Allison University, Sackville (Canada); Collicott, C. [Dalhousie University, Halifax (Canada); Martel, P. [Mount Allison University, Sackville (Canada); Johannes Gutenberg University, Mainz (Germany); Ott, P. [Johannes Gutenberg University, Mainz (Germany)

    2015-12-31

    The A2 Collaboration at the Institute for Nuclear Physics in Mainz, Germany, is working towards determining the polarizabilities of hadrons from nonperturbative quantum chromodynamics through Compton scattering experiments at low energies. The asymmetry observables are directly related to the scalar and spin polarizabilities of the hadrons. Online analysis software, which gives real-time feedback on asymmetries, efficiencies, energies, and angle distributions, has been developed. The new software is a big improvement over the existing online code and will greatly improve the quality of the acquired data.

  7. Performing data analysis using IBM SPSS

    CERN Document Server

    Meyers, Lawrence S; Guarino, A J

    2013-01-01

    This book is designed to be a user's guide for students and other interested readers to perform statistical data analysis with IBM SPSS, which is a major statistical software package used extensively in academic, government, and business settings. This book addresses the needs, level of sophistication, and interest in introductory statistical methodology on the part of undergraduate and graduate students in social and behavioral science, business, health-related, and education programs.  Each chapter covers a particular statistical procedure and has the following format: an example pr

  8. Statistical Analysis of Data for Timber Strengths

    DEFF Research Database (Denmark)

    Sørensen, John Dalsgaard

    2003-01-01

    Statistical analyses are performed for material strength parameters from a large number of specimens of structural timber. Non-parametric statistical analysis and fits have been investigated for the following distribution types: Normal, Lognormal, 2-parameter Weibull and 3-parameter Weibull. ... fits to the data available, especially if tail fits are used, whereas the Log Normal distribution generally gives a poor fit and larger coefficients of variation, especially if tail fits are used. The implications on the reliability level of typical structural elements and on partial safety factors for timber are investigated.

  9. Functional Data Analysis Applied in Chemometrics

    DEFF Research Database (Denmark)

    Muller, Martha

    ... nutritional status and metabolic phenotype. We want to understand how metabolomic spectra can be analysed using functional data analysis to detect the influence of different factors on specific metabolites. These factors can include, for example, gender, diet culture or dietary intervention. In Paper I we apply ... representation of each spectrum. Subset selection of wavelet coefficients generates the input to mixed models. Mixed-model methodology enables us to take the study design into account while modelling covariates. Bootstrap-based inference preserves the correlation structure between curves and enables the estimation ...

  10. Hadron polarizability data analysis: GoAT

    Science.gov (United States)

    Stegen, H.; Collicott, C.; Hornidge, D.; Martel, P.; Ott, P.

    2015-12-01

    The A2 Collaboration at the Institute for Nuclear Physics in Mainz, Germany, is working towards determining the polarizabilities of hadrons from nonperturbative quantum chromodynamics through Compton scattering experiments at low energies. The asymmetry observables are directly related to the scalar and spin polarizabilities of the hadrons. Online analysis software, which gives real-time feedback on asymmetries, efficiencies, energies, and angle distributions, has been developed. The new software is a big improvement over the existing online code and will greatly improve the quality of the acquired data.

  11. Technical document characterization by data analysis

    International Nuclear Information System (INIS)

    Mauget, A.

    1993-05-01

    Nuclear power plants possess documents analyzing all the plant systems, which represents a vast quantity of paper. Analysis of textual data makes it possible to classify a document by grouping the texts containing the same words. These methods are applied to system manuals as feasibility studies. The system manual is analyzed by LEXTER and the terms it selects are examined. We first classify according to style (sentences containing general words, technical sentences, etc.), and then according to terms. However, it will not be possible to proceed in this fashion for the 100 existing system manuals, because of insufficient storage capacity. Another solution is being developed. (author)
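
    A sketch of grouping texts that share vocabulary, using scikit-learn on invented manual sentences (LEXTER itself performs much richer terminological extraction):

        from sklearn.cluster import KMeans
        from sklearn.feature_extraction.text import TfidfVectorizer

        sentences = [
            "open the isolation valve before starting the pump",
            "the pump feeds the primary circuit through the valve",
            "signal cables connect the sensor to the control cabinet",
            "the control cabinet processes the sensor signal",
        ]
        X = TfidfVectorizer().fit_transform(sentences)
        labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
        print(labels)    # sentences sharing terms fall in the same cluster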

  12. Optimization and Data Analysis in Biomedical Informatics

    CERN Document Server

    Pardalos, Panos M; Xanthopoulos, Petros

    2012-01-01

    This volume covers some of the topics related to the rapidly growing field of biomedical informatics. On June 11-12, 2010, a workshop entitled 'Optimization and Data Analysis in Biomedical Informatics' was organized at The Fields Institute. Following this event, invited contributions were gathered based on the talks presented at the workshop, and additional invited chapters were chosen from the world's leading experts. In this publication, the authors share their expertise in the form of state-of-the-art research and review chapters, bringing together researchers from different disciplines.

  13. Hand geometry field application data analysis

    International Nuclear Information System (INIS)

    Ruehle, M.; Ahrens, J.

    1997-03-01

    Over the last fifteen years, Sandia National Laboratories Security Systems and Technology Center, Department 5800, has been involved in several laboratory tests of various biometric identification devices. These laboratory tests were conducted to verify the manufacturer's performance claims, to determine strengths and weaknesses of particular devices, and to evaluate which devices meet the US Department of Energy's unique needs for high-security devices. However, during a recent field installation of one of these devices, significantly different performance was observed than had been predicted by these laboratory tests. This report documents the data analysis performed in the search for an explanation of these differences

  14. A supercomputer for parallel data analysis

    International Nuclear Information System (INIS)

    Kolpakov, I.F.; Senner, A.E.; Smirnov, V.A.

    1987-01-01

    The project of a powerful multiprocessor system is proposed. The main purpose of the project is to develop a low-cost computer system with a processing rate of a few tens of millions of operations per second. The system solves many data analysis problems arising from high-energy physics spectrometers. It includes about 70 powerful MOTOROLA-68020-based slave microprocessor boards linked through VME crates to a host MicroVAX computer. Each microprocessor board performs the same algorithm requiring a large amount of computing time. The host computer distributes data over the microprocessor boards, then collects and combines the obtained results. The architecture of the system easily allows it to be used in real-time mode.

  15. Introduction to scientific computing and data analysis

    CERN Document Server

    Holmes, Mark H

    2016-01-01

    This textbook provides an introduction to numerical computing and its applications in science and engineering. The topics covered include those usually found in an introductory course, as well as those that arise in data analysis. This includes optimization and regression-based methods using a singular value decomposition. The emphasis is on problem solving, and there are numerous exercises throughout the text concerning applications in engineering and science. The essential role of the mathematical theory underlying the methods is also considered, both for understanding how a method works and how the error in the computation depends on the method being used. The MATLAB codes used to produce most of the figures and data tables in the text are available on the author's website and SpringerLink.

  16. Systems Analysis for Interpretation of Phosphoproteomics Data

    DEFF Research Database (Denmark)

    Munk, Stephanie; Refsgaard, Jan C; Olsen, Jesper V

    2016-01-01

    Global phosphoproteomics investigations yield overwhelming datasets with up to tens of thousands of quantified phosphosites. The main challenge after acquiring such large-scale data is to extract the biological meaning and relate this to the experimental question at hand. Systems-level analysis provides the best means for extracting functional insights from such types of datasets, and this has primed a rapid development of bioinformatics tools and resources over the last decade. Many of these tools are specialized databases that can be mined for annotation and pathway enrichment, whereas others provide a platform to generate functional protein networks and explore the relations between proteins of interest. The use of these tools requires careful consideration with regard to the input data, and the interpretation demands a critical approach. This chapter provides a summary of the most...

  17. Astrophysical data analysis with information field theory

    International Nuclear Information System (INIS)

    Enßlin, Torsten

    2014-01-01

    Non-parametric imaging and data analysis in astrophysics and cosmology can be addressed by information field theory (IFT), a means of Bayesian, data-based inference on spatially distributed signal fields. IFT is a statistical field theory, which permits the construction of optimal signal recovery algorithms. It exploits spatial correlations of the signal fields even for nonlinear and non-Gaussian signal inference problems. The alleviation of a perception threshold for recovering signals of unknown correlation structure by using IFT will be discussed in particular, as well as a novel improvement on instrumental self-calibration schemes. IFT can be applied to many areas. Here, applications in cosmology (cosmic microwave background, large-scale structure) and astrophysics (galactic magnetism, radio interferometry) are presented.

  18. Performance measurement with fuzzy data envelopment analysis

    CERN Document Server

    Tavana, Madjid

    2014-01-01

    The intensity of global competition and ever-increasing economic uncertainties has led organizations to search for more efficient and effective ways to manage their business operations.  Data envelopment analysis (DEA) has been widely used as a conceptually simple yet powerful tool for evaluating organizational productivity and performance. Fuzzy DEA (FDEA) is a promising extension of the conventional DEA proposed for dealing with imprecise and ambiguous data in performance measurement problems. This book is the first volume in the literature to present the state-of-the-art developments and applications of FDEA. It is designed for students, educators, researchers, consultants and practicing managers in business, industry, and government with a basic understanding of the DEA and fuzzy logic concepts.

  19. Surface Management System Departure Event Data Analysis

    Science.gov (United States)

    Monroe, Gilena A.

    2010-01-01

    This paper presents a data analysis of the Surface Management System (SMS) performance for departure events, including push-back and runway departure events. The paper focuses on the detection performance, i.e. the ability to detect departure events, as well as the prediction performance of SMS. The results show a modest overall detection performance for push-back events and a significantly high overall detection performance for runway departure events. The overall detection performance of SMS for push-back events is approximately 55%. The overall detection performance of SMS for runway departure events nears 100%. This paper also presents the overall SMS prediction performance for runway departure events, as well as the timeliness of the Aircraft Situation Display for Industry data source for SMS predictions.

  20. The data analysis facilities that astronomers want

    International Nuclear Information System (INIS)

    Disney, M.

    1985-01-01

    This paper discusses the need for and importance of data analysis facilities and what astronomers ideally want. A brief survey is presented of what is available now, and some of the main deficiencies and problems with today's systems are discussed. The main sources of astronomical data are presented, including optical photographic, optical TV/CCD, VLA, optical spectroscopy, imaging X-ray satellite, and satellite planetary camera. Landmark discoveries are listed in a table, some of which include: our galaxy as an island, distance to stars, the H-R diagram (stellar structure), the size of our galaxy, and missing mass in clusters. The main problems at present are discussed, including lack of coordinated effort and central planning, differences in hardware, and measuring performance

  1. Data Analysis Methods for Library Marketing

    Science.gov (United States)

    Minami, Toshiro; Kim, Eunja

    Our society is rapidly changing into an information society, where the needs and requests of people regarding information access differ widely from person to person. A library's mission is to provide its users, or patrons, with the most appropriate information. Libraries have to know the profiles of their patrons in order to fulfil such a role. The aim of library marketing is to develop methods based on library data, such as circulation records, book catalogs, book-usage data, and others. In this paper we first discuss the methodology and importance of library marketing. Then we demonstrate its usefulness through some examples of analysis methods applied to the circulation records of Kyushu University and Guacheon Library, and some implications obtained as results of these methods. Our research is a first step towards a future in which library marketing is an indispensable tool.

  2. Astrophysical data analysis with information field theory

    Science.gov (United States)

    Enßlin, Torsten

    2014-12-01

    Non-parametric imaging and data analysis in astrophysics and cosmology can be addressed by information field theory (IFT), a means of Bayesian, data-based inference on spatially distributed signal fields. IFT is a statistical field theory, which permits the construction of optimal signal recovery algorithms. It exploits spatial correlations of the signal fields even for nonlinear and non-Gaussian signal inference problems. The alleviation of a perception threshold for recovering signals of unknown correlation structure by using IFT will be discussed in particular, as well as a novel improvement on instrumental self-calibration schemes. IFT can be applied to many areas. Here, applications in cosmology (cosmic microwave background, large-scale structure) and astrophysics (galactic magnetism, radio interferometry) are presented.

  3. Astrophysical data analysis with information field theory

    Energy Technology Data Exchange (ETDEWEB)

    Enßlin, Torsten, E-mail: ensslin@mpa-garching.mpg.de [Max Planck Institut für Astrophysik, Karl-Schwarzschild-Straße 1, D-85748 Garching, Germany and Ludwig-Maximilians-Universität München, Geschwister-Scholl-Platz 1, D-80539 München (Germany)

    2014-12-05

    Non-parametric imaging and data analysis in astrophysics and cosmology can be addressed by information field theory (IFT), a means of Bayesian, data-based inference on spatially distributed signal fields. IFT is a statistical field theory, which permits the construction of optimal signal recovery algorithms. It exploits spatial correlations of the signal fields even for nonlinear and non-Gaussian signal inference problems. The alleviation of a perception threshold for recovering signals of unknown correlation structure by using IFT will be discussed in particular, as well as a novel improvement on instrumental self-calibration schemes. IFT can be applied to many areas. Here, applications in cosmology (cosmic microwave background, large-scale structure) and astrophysics (galactic magnetism, radio interferometry) are presented.

  4. Opinion Analysis on Rohingya using Twitter Data

    Science.gov (United States)

    Rochmawati, N.; Wibawa, S. C.

    2018-04-01

    Rohingya is an ethnic group in Myanmar. Recently there was a conflict in the area between the Rakhine population and the Myanmar army. Opinions on this issue vary widely: some are critical, some supportive, and some neutral. The purpose of this paper is to analyze world public opinion on the Rohingya case. The opinion data to be processed are taken from Twitter. The reason for using Twitter is that it has become one of the most popular and most frequently visited social media platforms; therefore, a large amount of data can be taken from Twitter to be processed for sentiment analysis. The opinions are grouped into three classes: positive, negative and neutral. The method used for the grouping is naïve Bayes.
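
    A sketch of the naïve Bayes grouping with a toy hand-labeled sample in place of the collected tweets (scikit-learn assumed; the paper's own features and corpus are not reproduced):

        from sklearn.feature_extraction.text import CountVectorizer
        from sklearn.naive_bayes import MultinomialNB
        from sklearn.pipeline import make_pipeline

        # Invented examples standing in for labeled training tweets.
        tweets = [
            "stop the violence against the rohingya",
            "we stand with the refugees",
            "the army operation was justified",
            "full support for the military response",
            "un meeting held to discuss the rakhine situation",
            "report released on the conflict this week",
        ]
        labels = ["negative", "negative", "positive",
                  "positive", "neutral", "neutral"]

        model = make_pipeline(CountVectorizer(), MultinomialNB())
        model.fit(tweets, labels)
        print(model.predict(["they stand with the refugees"]))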

  5. Big Data Analysis of Manufacturing Processes

    Science.gov (United States)

    Windmann, Stefan; Maier, Alexander; Niggemann, Oliver; Frey, Christian; Bernardi, Ansgar; Gu, Ying; Pfrommer, Holger; Steckel, Thilo; Krüger, Michael; Kraus, Robert

    2015-11-01

    The high complexity of manufacturing processes and the continuously growing amount of data lead to excessive demands on the users with respect to process monitoring, data analysis and fault detection. For these reasons, problems and faults are often detected too late, maintenance intervals are chosen too short and optimization potential for higher output and increased energy efficiency is not sufficiently used. A possibility to cope with these challenges is the development of self-learning assistance systems, which identify relevant relationships by observation of complex manufacturing processes so that failures, anomalies and need for optimization are automatically detected. The assistance system developed in the present work accomplishes data acquisition, process monitoring and anomaly detection in industrial and agricultural processes. The assistance system is evaluated in three application cases: Large distillation columns, agricultural harvesting processes and large-scale sorting plants. In this paper, the developed infrastructures for data acquisition in these application cases are described as well as the developed algorithms and initial evaluation results.

  6. Functional data analysis of generalized regression quantiles

    KAUST Repository

    Guo, Mengmeng

    2013-11-05

    Generalized regression quantiles, including the conditional quantiles and expectiles as special cases, are useful alternatives to the conditional means for characterizing a conditional distribution, especially when the interest lies in the tails. We develop a functional data analysis approach to jointly estimate a family of generalized regression quantiles. Our approach assumes that the generalized regression quantiles share some common features that can be summarized by a small number of principal component functions. The principal component functions are modeled as splines and are estimated by minimizing a penalized asymmetric loss measure. An iterative least asymmetrically weighted squares algorithm is developed for computation. While separate estimation of individual generalized regression quantiles usually suffers from large variability due to lack of sufficient data, by borrowing strength across data sets, our joint estimation approach significantly improves the estimation efficiency, which is demonstrated in a simulation study. The proposed method is applied to data from 159 weather stations in China to obtain the generalized quantile curves of the volatility of the temperature at these stations. © 2013 Springer Science+Business Media New York.

  7. Functional data analysis of generalized regression quantiles

    KAUST Repository

    Guo, Mengmeng; Zhou, Lan; Huang, Jianhua Z.; Härdle, Wolfgang Karl

    2013-01-01

    Generalized regression quantiles, including the conditional quantiles and expectiles as special cases, are useful alternatives to the conditional means for characterizing a conditional distribution, especially when the interest lies in the tails. We develop a functional data analysis approach to jointly estimate a family of generalized regression quantiles. Our approach assumes that the generalized regression quantiles share some common features that can be summarized by a small number of principal component functions. The principal component functions are modeled as splines and are estimated by minimizing a penalized asymmetric loss measure. An iterative least asymmetrically weighted squares algorithm is developed for computation. While separate estimation of individual generalized regression quantiles usually suffers from large variability due to lack of sufficient data, by borrowing strength across data sets, our joint estimation approach significantly improves the estimation efficiency, which is demonstrated in a simulation study. The proposed method is applied to data from 159 weather stations in China to obtain the generalized quantile curves of the volatility of the temperature at these stations. © 2013 Springer Science+Business Media New York.

  8. OTU analysis using metagenomic shotgun sequencing data.

    Directory of Open Access Journals (Sweden)

    Xiaolin Hao

    Full Text Available Because of technological limitations, the primer and amplification biases in targeted sequencing of 16S rRNA genes have veiled the true microbial diversity underlying environmental samples. However, the protocol of metagenomic shotgun sequencing provides 16S rRNA gene fragment data with natural immunity against the biases raised during priming and thus the potential to uncover the true structure of the microbial community by giving more accurate predictions of operational taxonomic units (OTUs). Nonetheless, the lack of statistically rigorous comparison between 16S rRNA gene fragments and other data types makes it difficult to interpret previously reported results using 16S rRNA gene fragments. Therefore, in the present work, we established a standard analysis pipeline that would help confirm whether differences in the data are true or merely due to potential technical bias. The pipeline was built by using simulated data to find optimal mapping and OTU prediction methods. The comparison between simulated datasets revealed that, with the proposed pipeline, a 16S rRNA gene fragment longer than 150 bp provides the same accuracy as a full-length 16S rRNA sequence. This result could serve as a good starting point for experimental design and makes comparison between 16S rRNA gene fragment-based and targeted 16S rRNA sequencing-based surveys possible.
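
    The reported >150 bp threshold suggests a simple length pre-filter ahead of mapping and OTU prediction. A sketch under the assumption of FASTA-formatted shotgun fragments (the filter_fragments helper and file name are hypothetical, not part of the published pipeline):

        def filter_fragments(fasta_path, min_len=150):
            """Yield (header, sequence) pairs for 16S gene fragments of at
            least min_len bp, the accuracy threshold reported above."""
            header, chunks = None, []
            with open(fasta_path) as fh:
                for line in fh:
                    line = line.strip()
                    if line.startswith(">"):
                        if header is not None and len("".join(chunks)) >= min_len:
                            yield header, "".join(chunks)
                        header, chunks = line[1:], []
                    else:
                        chunks.append(line)
            if header is not None and len("".join(chunks)) >= min_len:
                yield header, "".join(chunks)

        # Usage: fragments = list(filter_fragments("shotgun_16s_reads.fasta"))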

  9. Component fragilities. Data collection, analysis and interpretation

    International Nuclear Information System (INIS)

    Bandyopadhyay, K.K.; Hofmayer, C.H.

    1985-01-01

    As part of the component fragility research program sponsored by the US NRC, BNL is involved in establishing seismic fragility levels for various nuclear power plant equipment with emphasis on electrical equipment. To date, BNL has reviewed approximately seventy test reports to collect fragility or high-level test data for switchgears, motor control centers and similar electrical cabinets, valve actuators and numerous electrical and control devices, e.g., switches, transmitters, potentiometers, indicators, relays, etc., of various manufacturers and models. BNL has also obtained test data from EPRI/ANCO. Analysis of the collected data reveals that fragility levels can best be described by a group of curves corresponding to various failure modes. The lower bound curve indicates the initiation of malfunctioning or structural damage, whereas the upper bound curve corresponds to overall failure of the equipment based on known failure modes occurring separately or interactively. For some components, the upper and lower bound fragility levels are observed to vary appreciably depending upon the manufacturers and models. For some devices, testing even at the shake table vibration limit does not exhibit any failure. Failure of a relay is observed to be a frequent cause of failure of an electrical panel or a system. An extensive amount of additional fragility or high-level test data exists

  10. Analysis of capture-recapture data

    CERN Document Server

    McCrea, Rachel S

    2014-01-01

    An important first step in studying the demography of wild animals is to identify the animals uniquely through applying markings, such as rings, tags, and bands. Once the animals are encountered again, researchers can study different forms of capture-recapture data to estimate features, such as the mortality and size of the populations. Capture-recapture methods are also used in other areas, including epidemiology and sociology. With an emphasis on ecology, Analysis of Capture-Recapture Data covers many modern developments of capture-recapture and related models and methods and places them in the historical context of research from the past 100 years. The book presents both classical and Bayesian methods. A range of real data sets motivates and illustrates the material and many examples illustrate biometry and applied statistics at work. In particular, the authors demonstrate several of the modeling approaches using one substantial data set from a population of great cormorants. The book also discusses which co...

  11. Big Data Analysis of Manufacturing Processes

    International Nuclear Information System (INIS)

    Windmann, Stefan; Maier, Alexander; Niggemann, Oliver; Frey, Christian; Bernardi, Ansgar; Gu, Ying; Pfrommer, Holger; Steckel, Thilo; Krüger, Michael; Kraus, Robert

    2015-01-01

    The high complexity of manufacturing processes and the continuously growing amount of data place excessive demands on users with respect to process monitoring, data analysis and fault detection. As a result, problems and faults are often detected too late, maintenance intervals are set too short and the optimization potential for higher output and increased energy efficiency is not sufficiently exploited. One way to cope with these challenges is to develop self-learning assistance systems which identify relevant relationships by observing complex manufacturing processes, so that failures, anomalies and optimization needs are detected automatically. The assistance system developed in the present work accomplishes data acquisition, process monitoring and anomaly detection in industrial and agricultural processes. The assistance system is evaluated in three application cases: large distillation columns, agricultural harvesting processes and large-scale sorting plants. In this paper, the infrastructures developed for data acquisition in these application cases are described, as well as the developed algorithms and initial evaluation results. (paper)

  12. Isothermal thermogravimetric data acquisition analysis system

    Science.gov (United States)

    Cooper, Kenneth, Jr.

    1991-01-01

    The description of an Isothermal Thermogravimetric Analysis (TGA) Data Acquisition System is presented. The system consists of software and hardware to perform a wide variety of TGA experiments. The software is written in ANSI C using Borland's Turbo C++. The hardware consists of a 486/25 MHz machine with a Capital Equipment Corp. IEEE488 interface card. The interface is to a Hewlett Packard 3497A data acquisition system using two analog input cards and a digital actuator card. The system provides for 16 TGA rigs with weight and temperature measurements from each rig. Data collection is conducted in three phases. Acquisition is done at a rapid rate during initial startup, at a slower rate during extended data collection periods, and finally at a fast rate during shutdown. Parameters controlling the rate and duration of each phase are user programmable. Furnace control (raising and lowering) is also programmable. Provision is made for automatic restart in the event of power failure or other abnormal terminations. Initial trial runs were conducted to show system stability.
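
    The three-phase sampling scheme lends itself to a small configuration sketch (Python rather than the original ANSI C; phase names, rates and durations are invented stand-ins for the user-programmable parameters):

        from dataclasses import dataclass

        @dataclass
        class Phase:
            name: str
            interval_s: float   # time between weight/temperature readings
            duration_s: float   # how long the phase lasts

        # Fast sampling at startup and shutdown, slower during the long soak.
        schedule = [Phase("startup", 1.0, 600.0),
                    Phase("extended", 60.0, 8 * 3600.0),
                    Phase("shutdown", 1.0, 600.0)]

        for phase in schedule:
            n = int(phase.duration_s / phase.interval_s)
            print(f"{phase.name}: {n} samples at {phase.interval_s} s intervals")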

  13. Nuclear data for proton activation analysis

    Energy Technology Data Exchange (ETDEWEB)

    Mukhammedov, S.; Vasidov, A. [Institute of Nuclear Physics of Academy of Sciences of Uzbekistan, 702132 Ulugbek, Tashkent (Uzbekistan)]; Comsan, M.N.H. [Nuclear Research Centre, Inshas Cyclotron Facility, AEA 13759 Cairo (Egypt)]

    2000-11-15

    Activation analysis with charged particles (ChPAA), and proton activation analysis (PAA) in particular, generally requires separate irradiation of thick samples (thicker than the particle range) and standards. Therefore, for simple determination of trace chemical elements by instrumental PAA, the absolute activity of the radionuclides must be known. Consequently, we compiled data on nuclear decays (half-life, radiation energy and intensity, type of decay, saturation factor), on nuclear reactions (excitation function, threshold energy, Q-value, yields of radionuclides), on the element under study (natural isotopic abundance of the nuclide which yields the nuclear reaction considered, molar mass), and on the stopping power of the irradiated material and the range of the particle; these are used in calculating the absolute activity of the radionuclides and in resolving nuclear interference problems in PAA. These data are tabulated. The tables of radionuclides are ordered by increasing atomic number and radiation energy, as well as by method of radionuclide formation. The thick-target yields of analytical radionuclides are presented versus particle energy.
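
    Several of the tabulated quantities combine into the standard end-of-bombardment activity estimate. A minimal sketch, assuming a tabulated thick-target yield at saturation and the usual saturation factor S = 1 - exp(-lambda * t_irr) (symbols and values are illustrative, not the paper's notation):

        import math

        def end_of_bombardment_activity(y_sat_bq_per_uA, current_uA,
                                        half_life_s, t_irr_s):
            """Activity from a thick-target yield at saturation, scaled by
            the saturation factor for a finite irradiation time."""
            lam = math.log(2.0) / half_life_s          # decay constant
            saturation = 1.0 - math.exp(-lam * t_irr_s)
            return y_sat_bq_per_uA * current_uA * saturation

        # Example: 1 h irradiation at 2 uA, 10 min half-life product.
        print(end_of_bombardment_activity(1.0e6, 2.0, 600.0, 3600.0))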

  14. Complex surveys analysis of categorical data

    CERN Document Server

    Mukhopadhyay, Parimal

    2016-01-01

    The primary objective of this book is to study some of the research topics in the area of analysis of complex surveys which have not been covered in any book yet. It discusses the analysis of categorical data using three models: a full model, a log-linear model and a logistic regression model. It is a valuable resource for survey statisticians and practitioners in the fields of sociology, biology, economics, psychology and other areas who have to use these procedures in their day-to-day work. It is also useful for courses on sampling and complex surveys at the upper-undergraduate and graduate levels. The importance of sample surveys today cannot be overstated. In fields ranging from voters' behaviour to industry, agriculture, economics, sociology and psychology, investigators generally resort to survey sampling to obtain an assessment of the behaviour of the population they are interested in. Many large-scale sample surveys collect data using complex survey designs like multistage stratified cluster designs. The o...

  15. Scientific analysis of satellite ranging data

    Science.gov (United States)

    Smith, David E.

    1994-01-01

    A network of satellite laser ranging (SLR) tracking systems with continuously improving accuracies is challenging the modelling capabilities of analysts worldwide. Various data analysis techniques have yielded many advances in the development of orbit, instrument and Earth models. The direct measurement of the distance to the satellite provided by the laser ranges has given us a simple metric which links the results obtained by diverse approaches. Different groups have used SLR data, often in combination with observations from other space geodetic techniques, to improve models of the static geopotential, the solid Earth, ocean tides, and atmospheric drag models for low Earth satellites. Radiation pressure models and other non-conservative forces for satellite orbits above the atmosphere have been developed to exploit the full accuracy of the latest SLR instruments. SLR is the baseline tracking system for the altimeter missions TOPEX/Poseidon, and ERS-1 and will play an important role in providing the reference frame for locating the geocentric position of the ocean surface, in providing an unchanging range standard for altimeter calibration, and for improving the geoid models to separate gravitational from ocean circulation signals seen in the sea surface. However, even with the many improvements in the models used to support the orbital analysis of laser observations, there remain systematic effects which limit the full exploitation of SLR accuracy today.

  16. NPP unusual events: data, analysis and application

    International Nuclear Information System (INIS)

    Tolstykh, V.

    1990-01-01

    The subject of the paper is the IAEA's cooperative patterns of unusual-events data treatment and the utilization of operating safety experience feedback. The Incident Reporting System (IRS) and the Analysis of Safety Significant Event Team (ASSET) are discussed. The IRS methodology for collection, handling, assessment and dissemination of data on NPP unusual events (deviations, incidents and accidents) occurring during operation, surveillance and maintenance is outlined through the report gathering and issuing practice, the expert assessment procedures and the parameters of the system. After 7 years of existence the IAEA-IRS contains over 1000 reports and receives 1.5-4% of the total information on unusual events. The author considers the reports only as detailed technical 'records' of events requiring assessment. The ASSET approach, implying an in-depth analysis of occurrences directed towards level-1 PSA utilization, is commented on. The experts evaluated root causes for the reported events and some trends are presented. Generally, internal events due to unexpected paths of water in the nuclear installations, occurrences related to the integrity of the primary heat transport systems, events associated with the engineered safety systems and events involving the human factor represent the large groups deserving close attention. Personal recommendations are given on how to use event-related information for NPP safety improvement. 2 tabs (R.Ts)

  17. Dependent failure analysis of NPP data bases

    International Nuclear Information System (INIS)

    Cooper, S.E.; Lofgren, E.V.; Samanta, P.K.; Wong Seemeng

    1993-01-01

    A technical approach for analyzing plant-specific data bases for vulnerabilities to dependent failures has been developed and applied. Since the focus of this work is to aid in the formulation of defenses to dependent failures, rather than to quantify dependent failure probabilities, the approach of this analysis is critically different. For instance, the determination of component failure dependencies has been based upon identical failure mechanisms related to component piecepart failures, rather than failure modes. Also, component failures involving all types of component function loss (e.g., catastrophic, degraded, incipient) are equally important to the predictive purposes of dependent failure defense development. Consequently, dependent component failures are identified with a different dependent failure definition which uses a component failure mechanism categorization scheme in this study. In this context, clusters of component failures which satisfy the revised dependent failure definition are termed common failure mechanism (CFM) events. Motor-operated valves (MOVs) in two nuclear power plant data bases have been analyzed with this approach. The analysis results include seven different failure mechanism categories; identified potential CFM events; an assessment of the risk-significance of the potential CFM events using existing probabilistic risk assessments (PRAs); and postulated defenses to the identified potential CFM events. (orig.)

  18. Nuclear data for proton activation analysis

    International Nuclear Information System (INIS)

    Mukhammedov, S.; Vasidov, A.; Comsan, M.N.H.

    2000-01-01

    Activation analysis with charged particles (ChPAA), and proton activation analysis (PAA) in particular, generally requires separate irradiation of thick samples (thicker than the particle range) and standards. Therefore, for simple determination of trace chemical elements by instrumental PAA, the absolute activity of the radionuclides must be known. Consequently, we compiled data on nuclear decays (half-life, radiation energy and intensity, type of decay, saturation factor), on nuclear reactions (excitation function, threshold energy, Q-value, yields of radionuclides), on the element under study (natural isotopic abundance of the nuclide which yields the nuclear reaction considered, molar mass), and on the stopping power of the irradiated material and the range of the particle; these are used in calculating the absolute activity of the radionuclides and in resolving nuclear interference problems in PAA. These data are tabulated. The tables of radionuclides are ordered by increasing atomic number and radiation energy, as well as by method of radionuclide formation. The thick-target yields of analytical radionuclides are presented versus particle energy.

  19. Techniques and Applications of Urban Data Analysis

    KAUST Repository

    AlHalawani, Sawsan N.

    2016-05-26

    Digitization and characterization of urban spaces are essential components as we move to an ever-growing 'always connected' world. Accurate analysis of such digital urban spaces has become more important as we continue to get spatial and social context-aware feedback and recommendations in our daily activities. Modeling and reconstruction of urban environments have thus gained unprecedented importance in the last few years. Such analysis typically spans multiple disciplines, such as computer graphics and computer vision, as well as architecture, geoscience, and remote sensing. Reconstructing an urban environment usually requires an entire pipeline consisting of different tasks. In such a pipeline, data analysis plays a strong role in acquiring meaningful insights from the raw data. This dissertation primarily focuses on the analysis of various forms of urban data and proposes a set of techniques to extract useful information, which is then used for different applications. The first part of this dissertation presents a semi-automatic framework to analyze facade images to recover individual windows along with their functional configurations such as open or (partially) closed states. The main advantage of recovering both the repetition patterns of windows and their individual deformation parameters is to produce a factored facade representation. Such a factored representation enables a range of applications including interactive facade images, improved multi-view stereo reconstruction, facade-level change detection, and novel image editing possibilities. The second part of this dissertation demonstrates the effect of a layout configuration on its performance. As a specific application scenario, I investigate the interior layout of warehouses wherein the goal is to assign items to their storage locations while reducing flow congestion and enhancing the speed of order picking processes. The third part of the dissertation proposes a method to classify cities

  20. Frontier Assignment for Sensitivity Analysis of Data Envelopment Analysis

    Science.gov (United States)

    Naito, Akio; Aoki, Shingo; Tsuji, Hiroshi

    To extend the sensitivity analysis capability of DEA (Data Envelopment Analysis), this paper proposes frontier assignment based DEA (FA-DEA). The basic idea of FA-DEA is to allow a decision maker to choose the frontier intentionally, while traditional DEA and Super-DEA determine the frontier computationally. The features of FA-DEA are as follows: (1) it provides the chance to exclude extra-influential DMUs (Decision Making Units) and to find extra-ordinal DMUs, and (2) it includes the functionality of traditional DEA and Super-DEA, so that it can deal with sensitivity analysis more flexibly. A simple numerical study has shown the effectiveness of the proposed FA-DEA and its difference from traditional DEA.
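
    The frontier-assignment idea can be sketched by letting the reference set of the envelopment LP be chosen by the decision maker instead of being fixed to all DMUs. A minimal input-oriented CCR sketch using scipy (the frontier argument is my illustration of intentional frontier choice, not the paper's exact formulation):

        import numpy as np
        from scipy.optimize import linprog

        def dea_efficiency(X, Y, k, frontier=None):
            """Input-oriented CCR efficiency of DMU k against a chosen
            reference (frontier) set; all DMUs by default."""
            n = X.shape[0]
            ref = list(range(n)) if frontier is None else list(frontier)
            m, s = X.shape[1], Y.shape[1]
            c = np.zeros(1 + len(ref)); c[0] = 1.0   # minimize theta
            A_ub, b_ub = [], []
            for i in range(m):   # inputs: sum_j lam_j x_ji <= theta * x_ki
                A_ub.append([-X[k, i]] + [X[j, i] for j in ref]); b_ub.append(0.0)
            for r in range(s):   # outputs: sum_j lam_j y_jr >= y_kr
                A_ub.append([0.0] + [-Y[j, r] for j in ref]); b_ub.append(-Y[k, r])
            res = linprog(c, A_ub=A_ub, b_ub=b_ub,
                          bounds=[(0, None)] * (1 + len(ref)))
            return res.fun

        X = np.array([[2.0], [4.0], [8.0]])   # one input per DMU
        Y = np.array([[2.0], [3.0], [4.0]])   # one output per DMU
        print(dea_efficiency(X, Y, k=1))                   # traditional score
        print(dea_efficiency(X, Y, k=0, frontier=[1, 2]))  # > 1: extra-ordinal DMU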

  1. Advantages of Integrative Data Analysis for Developmental Research

    Science.gov (United States)

    Bainter, Sierra A.; Curran, Patrick J.

    2015-01-01

    Amid recent progress in cognitive development research, high-quality data resources are accumulating, and data sharing and secondary data analysis are becoming increasingly valuable tools. Integrative data analysis (IDA) is an exciting analytical framework that can enhance secondary data analysis in powerful ways. IDA pools item-level data across…

  2. Hartsville data and analysis book: Phase I

    Energy Technology Data Exchange (ETDEWEB)

    Kerley, C.R.; Siegrist, C.

    1978-09-01

    A preconstruction data base is recorded for the impact area surrounding the Hartsville nuclear construction project. The objective is to document baseline information for socioeconomic characteristics that may be either temporarily or permanently altered by the project. The analysis suggests that the five counties surrounding the site make up a primary impact area, but some impacts may occur outside the area. The work force for the construction phase of the project is segregated into four components: (1) former residents of the site county, (2) former residents of other counties in the impact area, (3) in-movers to the site county, and (4) in-movers to other counties in the impact area. A theoretical model is developed to illustrate the contribution of each component to the spatial pattern of economic benefits and social costs in the impact area. A shift-share analysis of agricultural characteristics in the impact area shows that employment and farm numbers in the area have declined at a slightly faster rate than in the nation but at a slower rate than in the South. A population and construction project threshold analysis suggests that, given the project size and population base at Hartsville, significant social and economic constraints may be encountered in the public and private economic infrastructure. These include amenities such as housing, school space, medical and police protection.

  3. Hartsville data and analysis book: Phase I

    International Nuclear Information System (INIS)

    Kerley, C.R.; Siegrist, C.

    1978-09-01

    A preconstruction data base is recorded for the impact area surrounding the Hartsville nuclear construction project. The objective is to document baseline information for socioeconomic characteristics that may be either temporarily or permanently altered by the project. The analysis suggests that the five counties surrounding the site make up a primary impact area, but some impacts may occur outside the area. The work force for the construction phase of the project is segregated into four components: (1) former residents of the site county, (2) former residents of other counties in the impact area, (3) in-movers to the site county, and (4) in-movers to other counties in the impact area. A theoretical model is developed to illustrate the contribution of each component to the spatial pattern of economic benefits and social costs in the impact area. A shift-share analysis of agricultural characteristics in the impact area shows that employment and farm numbers in the area have declined at a slightly faster rate than in the nation but at a slower rate than in the South. A population and construction project threshold analysis suggests that, given the project size and population base at Hartsville, significant social and economic constraints may be encountered in the public and private economic infrastructure. These include amenities such as housing, school space, medical and police protection

  4. Mobile gamma spectrometry with remote data analysis

    International Nuclear Information System (INIS)

    Anttalainen, O.; Toivonen, H.

    2009-01-01

    There are several devices on the market designed for the detection and identification of a radiation source. The widely used approach is to use sensitive scintillation or semiconductor detectors together with software algorithms to get the alarms on-site in real time. The devices may be used in covert operations during major public events such as international sports events or political meetings. The screening and surveys are prone to false alarms due to the variability of natural radiation or to legal radiation sources such as patients who have recently received radioisotope treatment. The correct interpretation of the spectrometric signal is a task for a nuclear specialist; not every instrument user can be expected to have such knowledge, and therefore there is a substantial risk of misinterpreting the result given by the instrument. The consequences of a false alarm can be dramatic, and therefore, from the operational point of view, correct alarm handling is a key capability. Environics Oy has commercialized a measurement and analysis concept developed by STUK (Radiation and Nuclear Safety Authority in Finland). This concept includes high performance spectrometric analysis and local and remote data analysis, including wireless online connection to expert systems and expert support allowing multi-user-single-expert (MUSE) operations. (author)

  5. CMS distributed data analysis with CRAB3

    Science.gov (United States)

    Mascheroni, M.; Balcas, J.; Belforte, S.; Bockelman, B. P.; Hernandez, J. M.; Ciangottini, D.; Konstantinov, P. B.; Silva, J. M. D.; Ali, M. A. B. M.; Melo, A. M.; Riahi, H.; Tanasijczuk, A. J.; Yusli, M. N. B.; Wolf, M.; Woodard, A. E.; Vaandering, E.

    2015-12-01

    The CMS Remote Analysis Builder (CRAB) is a distributed workflow management tool which facilitates analysis tasks by isolating users from the technical details of the Grid infrastructure. Throughout LHC Run 1, CRAB has been successfully employed by an average of 350 distinct users each week executing about 200,000 jobs per day. CRAB has been significantly upgraded in order to face the new challenges posed by LHC Run 2. Components of the new system include 1) a lightweight client, 2) a central primary server which communicates with the clients through a REST interface, 3) secondary servers which manage user analysis tasks and submit jobs to the CMS resource provisioning system, and 4) a central service to asynchronously move user data from temporary storage in the execution site to the desired storage location. The new system improves the robustness, scalability and sustainability of the service. Here we provide an overview of the new system, operation, and user support, report on its current status, and identify lessons learned from the commissioning phase and production roll-out.

  6. Sentimental Analysis for Airline Twitter data

    Science.gov (United States)

    Dutta Das, Deb; Sharma, Sharan; Natani, Shubham; Khare, Neelu; Singh, Brijendra

    2017-11-01

    Social media has taken the world by surprise at a swift and commendable pace. Whenever circumstances arise, whether social, political or related to current affairs, people throughout the world express their sentiments with its help, making social media a suitable candidate for sentiment mining. Sentiment analysis is highly valuable for any organization that wants to analyse and enhance its products and services. In the airline industry it is much easier to get feedback from a rich data source such as Twitter for conducting a sentiment analysis on customers. The benefits of Twitter sentiment analysis also extend to consumers who want to know the who's who and what's what of everyday life. In this paper we classify the sentiment of Twitter messages and present the results of a machine learning algorithm implemented using R and RapidMiner. The tweets are extracted and pre-processed, then categorized into neutral, negative and positive sentiments, and the results are finally summarized as a whole. The Naive Bayes algorithm has been used for classifying the sentiments of recent tweets about the different airlines.
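
    The classification step is a standard bag-of-words Naive Bayes. A Python sketch of the same idea (the paper used R and RapidMiner; the toy tweets are invented for illustration):

        from sklearn.feature_extraction.text import CountVectorizer
        from sklearn.naive_bayes import MultinomialNB
        from sklearn.pipeline import make_pipeline

        # Tiny invented corpus standing in for extracted airline tweets.
        tweets = ["flight was on time, great crew", "lost my luggage again",
                  "seat 14C, window", "delayed three hours, terrible service"]
        labels = ["positive", "negative", "neutral", "negative"]

        model = make_pipeline(CountVectorizer(), MultinomialNB())
        model.fit(tweets, labels)
        print(model.predict(["crew was great but the flight was delayed"]))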

  7. Network analysis for the visualization and analysis of qualitative data.

    Science.gov (United States)

    Pokorny, Jennifer J; Norman, Alex; Zanesco, Anthony P; Bauer-Wu, Susan; Sahdra, Baljinder K; Saron, Clifford D

    2018-03-01

    We present a novel manner in which to visualize the coding of qualitative data that enables representation and analysis of connections between codes using graph theory and network analysis. Network graphs are created from codes applied to a transcript or audio file using the code names and their chronological location. The resulting network is a representation of the coding data that characterizes the interrelations of codes. This approach enables quantification of qualitative codes using network analysis and facilitates examination of associations of network indices with other quantitative variables using common statistical procedures. Here, as a proof of concept, we applied this method to a set of interview transcripts that had been coded in 2 different ways and the resultant network graphs were examined. The creation of network graphs allows researchers an opportunity to view and share their qualitative data in an innovative way that may provide new insights and enhance transparency of the analytical process by which they reach their conclusions. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
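
    A minimal sketch of the graph-construction step, assuming one simple edge rule (consecutive codes in chronological order become weighted edges; the paper's exact rule may differ):

        import networkx as nx

        def codes_to_network(coded_sequence):
            """Build a weighted code network from a chronologically ordered
            list of qualitative codes applied to one transcript."""
            G = nx.Graph()
            for a, b in zip(coded_sequence, coded_sequence[1:]):
                if a == b:
                    continue
                w = G[a][b]["weight"] + 1 if G.has_edge(a, b) else 1
                G.add_edge(a, b, weight=w)
            return G

        G = codes_to_network(["trust", "stress", "coping", "stress", "trust"])
        print(nx.degree_centrality(G))  # network indices for later statistics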

  8. Analysis using large-scale ringing data

    Directory of Open Access Journals (Sweden)

    Baillie, S. R.

    2004-06-01

    ]; Peach et al., 1998; DeSante et al., 2001 are generally co-ordinated by ringing centres such as those that make up the membership of EURING. In some countries volunteer census work (often called Breeding Bird Surveys) is undertaken by the same organizations while in others different bodies may co-ordinate this aspect of the work. This session was concerned with the analysis of such extensive data sets and the approaches that are being developed to address the key theoretical and applied issues outlined above. The papers reflect the development of more spatially explicit approaches to analyses of data gathered at large spatial scales. They show that while the statistical tools that have been developed in recent years can be used to derive useful biological conclusions from such data, there is additional need for further developments. Future work should also consider how best to implement such analytical developments within future study designs. In his plenary paper Andy Royle (Royle, 2004) addresses this theme directly by describing a general framework for modelling spatially replicated abundance data. The approach is based on the idea that a set of spatially referenced local populations constitutes a metapopulation, within which local abundance is determined as a random process. This provides an elegant and general approach in which the metapopulation model as described above is combined with a data-generating model specific to the type of data being analysed to define a simple hierarchical model that can be analysed using conventional methods. It should be noted, however, that further software development will be needed if the approach is to be made readily available to biologists. The approach is well suited to dealing with sparse data and avoids the need for data aggregation prior to analysis. Spatial synchrony has received most attention in studies of species whose populations show cyclic fluctuations, particularly certain game birds and small mammals. However
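
    The data-generating model class Royle describes is easy to simulate, which also shows why naive averaging of counts underestimates abundance. A toy sketch (parameter values invented):

        import numpy as np

        rng = np.random.default_rng(1)

        # Local abundance is a random (Poisson) process across sites; each
        # site is surveyed several times with imperfect detection p.
        n_sites, n_visits, lam, p = 50, 3, 4.0, 0.6
        N = rng.poisson(lam, n_sites)                         # latent abundance
        y = rng.binomial(N[:, None], p, (n_sites, n_visits))  # observed counts

        print(y.mean(), "observed vs true mean abundance", N.mean())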

  9. Novel mathematical algorithm for pupillometric data analysis.

    Science.gov (United States)

    Canver, Matthew C; Canver, Adam C; Revere, Karen E; Amado, Defne; Bennett, Jean; Chung, Daniel C

    2014-01-01

    Pupillometry is used clinically to evaluate retinal and optic nerve function by measuring the pupillary response to light stimuli. We have developed a mathematical algorithm to automate and expedite the analysis of non-filtered, non-calculated pupillometric data obtained from mouse pupillary light reflex recordings, i.e., dynamic pupillary diameter recordings following exposure to varying light intensities. The non-filtered, non-calculated pupillometric data are filtered through a low-pass finite impulse response (FIR) filter. Thresholding is used to remove data caused by eye blinking, loss of pupil tracking, and/or head movement. Twelve physiologically relevant parameters were extracted from the collected data: (1) baseline diameter, (2) minimum diameter, (3) response amplitude, (4) re-dilation amplitude, (5) percent of baseline diameter, (6) response time, (7) re-dilation time, (8) average constriction velocity, (9) average re-dilation velocity, (10) maximum constriction velocity, (11) maximum re-dilation velocity, and (12) onset latency. No significant differences were noted between parameters derived from algorithm-calculated values and manually derived results (p ≥ 0.05). This mathematical algorithm will expedite endpoint data derivation and eliminate human error in the manual calculation of pupillometric parameters from non-filtered, non-calculated pupillometric values. Subsequently, these values can be used as reference metrics for characterizing the natural history of retinal disease. Furthermore, it will be instrumental in the assessment of functional visual recovery in humans and pre-clinical models of retinal degeneration and optic nerve disease following pharmacological or gene-based therapies. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
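
    A sketch of the filtering front end and two of the extracted parameters, assuming a generic low-pass FIR design (cut-off, sampling rate and window lengths are invented; blink thresholding and the remaining parameters are omitted):

        import numpy as np
        from scipy.signal import firwin, filtfilt

        def pupil_parameters(diameter, fs=30.0, cutoff=4.0, numtaps=31):
            """Low-pass FIR filter a pupil-diameter trace and derive a few
            of the parameters listed above."""
            taps = firwin(numtaps, cutoff, fs=fs)      # linear-phase FIR
            smooth = filtfilt(taps, [1.0], diameter)   # zero-phase filtering
            baseline = smooth[: int(0.5 * fs)].mean()  # pre-stimulus window
            minimum = smooth.min()
            return {"baseline": baseline, "minimum": minimum,
                    "response_amplitude": baseline - minimum,
                    "percent_of_baseline": 100.0 * minimum / baseline}

        # Synthetic trace: a constriction dip at t = 2 s plus sensor noise.
        t = np.arange(0.0, 10.0, 1.0 / 30.0)
        trace = 5.0 - 1.5 * np.exp(-((t - 2.0) / 0.8) ** 2)
        trace += 0.05 * np.random.default_rng(0).normal(size=t.size)
        print(pupil_parameters(trace))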

  10. Compilation of data for radionuclide transport analysis

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    2001-11-01

    This report is one of the supporting documents to the updated safety assessment (project SAFE) of the Swedish repository for low and intermediate level waste, SFR 1. A number of calculation cases for quantitative analysis of radionuclide release and dose to man are defined based on the expected evolution of the repository, geosphere and biosphere in the Base Scenario and other selected scenarios. The data required by the selected near-field, geosphere and biosphere models are given and the values selected for the calculations are compiled in tables. The main sources for the selected values of the migration parameters in the repository and geosphere models are the safety assessment of a deep repository for spent fuel, SR 97, and the preliminary safety assessment of a repository for long-lived, low- and intermediate level waste, SFL 3-5. For the biosphere models, both site-specific data and generic values of the parameters are selected. The applicability of the selected parameter values is discussed and the uncertainty is qualitatively addressed for data to the repository and geosphere migration models. Parameter values selected for these models are in general pessimistic in order not to underestimate the radionuclide release rates. It is judged that this approach, combined with the selected calculation cases, will illustrate the effects of uncertainties in the processes and events that affect the evolution of the system, as well as in the quantitative data that describe it. The biosphere model allows for probabilistic calculations, and the uncertainty in input data is quantified by giving minimum, maximum and mean values as well as the type of probability distribution function.

  11. Compilation of data for radionuclide transport analysis

    International Nuclear Information System (INIS)

    2001-11-01

    This report is one of the supporting documents to the updated safety assessment (project SAFE) of the Swedish repository for low and intermediate level waste, SFR 1. A number of calculation cases for quantitative analysis of radionuclide release and dose to man are defined based on the expected evolution of the repository, geosphere and biosphere in the Base Scenario and other selected scenarios. The data required by the selected near-field, geosphere and biosphere models are given and the values selected for the calculations are compiled in tables. The main sources for the selected values of the migration parameters in the repository and geosphere models are the safety assessment of a deep repository for spent fuel, SR 97, and the preliminary safety assessment of a repository for long-lived, low- and intermediate level waste, SFL 3-5. For the biosphere models, both site-specific data and generic values of the parameters are selected. The applicability of the selected parameter values is discussed and the uncertainty is qualitatively addressed for data to the repository and geosphere migration models. Parameter values selected for these models are in general pessimistic in order not to underestimate the radionuclide release rates. It is judged that this approach, combined with the selected calculation cases, will illustrate the effects of uncertainties in the processes and events that affect the evolution of the system, as well as in the quantitative data that describe it. The biosphere model allows for probabilistic calculations, and the uncertainty in input data is quantified by giving minimum, maximum and mean values as well as the type of probability distribution function.

  12. The Ultra-scale Visualization Climate Data Analysis Tools (UV-CDAT): Data Analysis and Visualization for Geoscience Data

    Energy Technology Data Exchange (ETDEWEB)

    Williams, Dean [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Doutriaux, Charles [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Patchett, John [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Williams, Sean [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Shipman, Galen [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Miller, Ross [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Steed, Chad [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Krishnan, Harinarayan [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Silva, Claudio [NYU Polytechnic School of Engineering, New York, NY (United States); Chaudhary, Aashish [Kitware, Inc., Clifton Park, NY (United States); Bremer, Peer-Timo [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Pugmire, David [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Bethel, E. Wes [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Childs, Hank [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Prabhat, Mr. [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Geveci, Berk [Kitware, Inc., Clifton Park, NY (United States); Bauer, Andrew [Kitware, Inc., Clifton Park, NY (United States); Pletzer, Alexander [Tech-X Corp., Boulder, CO (United States); Poco, Jorge [NYU Polytechnic School of Engineering, New York, NY (United States); Ellqvist, Tommy [NYU Polytechnic School of Engineering, New York, NY (United States); Santos, Emanuele [Federal Univ. of Ceara, Fortaleza (Brazil); Potter, Gerald [NASA Johnson Space Center, Houston, TX (United States); Smith, Brian [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Maxwell, Thomas [NASA Johnson Space Center, Houston, TX (United States); Kindig, David [Tech-X Corp., Boulder, CO (United States); Koop, David [NYU Polytechnic School of Engineering, New York, NY (United States)

    2013-05-01

    To support interactive visualization and analysis of complex, large-scale climate data sets, UV-CDAT integrates a powerful set of scientific computing libraries and applications to foster more efficient knowledge discovery. Connected through a provenance framework, the UV-CDAT components can be loosely coupled for fast integration or tightly coupled for greater functionality and communication with other components. This framework addresses many challenges in the interactive visual analysis of distributed large-scale data for the climate community.

  13. Analysis of temperature data at the Olkiluoto

    Energy Technology Data Exchange (ETDEWEB)

    Sedighi, M.; Bennett, D.; Masum, S.; Thomas, H. [Cardiff Univ. (United Kingdom)]; Johansson, E. [Saanio and Riekkola Oy, Helsinki (Finland)]

    2014-03-15

    As part of the rock mechanics monitoring programme 2012 at Olkiluoto, temperature data have been recorded. Temperature data have been measured, collected and monitored at the Olkiluoto site and in ONKALO at various locations, by different methods and in conjunction with other investigations carried out at the site. This report provides a detailed description of the investigation and analysis carried out on the temperature datasets. It aims to provide a better understanding of the in-situ temperature of the rock and soil at the site. Three categories of datasets from the Posiva thermal monitoring programme have been analysed and studied. These consist of: (i) data collected from the various drillholes during geophysical logging and Posiva Flow Log (PFL) measurements, (ii) measurements in the ONKALO ramp, the investigation niche located at elevation -140 m and a technical room located 437 m below the surface, and (iii) surface temperature measurements from four weather stations and four measurement ditches. Time-series data obtained from the groundwater temperature measurements during the 'Posiva Flow Log' (PFL) tests in drillholes OL-KR1 to KR55 at different depths and years have been analysed. Temperature at a depth of 400 m was found to be in the range of 10 to 11 deg C. The geothermal gradient obtained from the PFL data without pumping was found to be approximately 1.4 deg C/100 m, with relatively uniform temporal and spatial patterns at the repository depth, i.e. at 400 m. The geothermal gradients obtained from the PFL measurements and the geophysical loggings indicate similar temperature values at the repository depth, i.e. 400 m. The characteristics of the time-series data related to the ONKALO measurements have been obtained through a series of Non-uniform Discrete Fourier Transform analyses. Datasets related to the various chainages and the investigation niche at ONKALO have been studied. The largest variation in the temperature
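
    For unevenly sampled series like these, the non-uniform discrete Fourier transform can be evaluated directly. A generic sketch (not Posiva's implementation; the annual-cycle test signal is invented):

        import numpy as np

        def nudft(t, x, freqs):
            """Evaluate the spectrum of samples x taken at irregular times t
            on a chosen frequency grid (direct non-uniform DFT)."""
            t, x = np.asarray(t, float), np.asarray(x, float)
            return np.array([np.sum(x * np.exp(-2j * np.pi * f * t))
                             for f in freqs])

        rng = np.random.default_rng(0)
        t = np.sort(rng.uniform(0.0, 365.0, 200))         # irregular days
        x = 10.0 + 1.5 * np.sin(2.0 * np.pi * t / 365.0)  # annual cycle
        print(np.abs(nudft(t, x, [0.0, 1.0 / 365.0, 2.0 / 365.0])))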

  14. Analysis of temperature data at the Olkiluoto

    International Nuclear Information System (INIS)

    Sedighi, M.; Bennett, D.; Masum, S.; Thomas, H.; Johansson, E.

    2014-03-01

    As part of the rock mechanics monitoring programme 2012 at Olkiluoto, temperature data have been recorded. Temperature data have been measured, collected and monitored at the Olkiluoto site and in ONKALO at various locations, by different methods and in conjunction with other investigations carried out at the site. This report provides a detailed description of the investigation and analysis carried out on the temperature datasets. It aims to provide a better understanding of the in-situ temperature of the rock and soil at the site. Three categories of datasets from the Posiva thermal monitoring programme have been analysed and studied. These consist of: (i) data collected from the various drillholes during geophysical logging and Posiva Flow Log (PFL) measurements, (ii) measurements in the ONKALO ramp, the investigation niche located at elevation -140 m and a technical room located 437 m below the surface, and (iii) surface temperature measurements from four weather stations and four measurement ditches. Time-series data obtained from the groundwater temperature measurements during the 'Posiva Flow Log' (PFL) tests in drillholes OL-KR1 to KR55 at different depths and years have been analysed. Temperature at a depth of 400 m was found to be in the range of 10 to 11 deg C. The geothermal gradient obtained from the PFL data without pumping was found to be approximately 1.4 deg C/100 m, with relatively uniform temporal and spatial patterns at the repository depth, i.e. at 400 m. The geothermal gradients obtained from the PFL measurements and the geophysical loggings indicate similar temperature values at the repository depth, i.e. 400 m. The characteristics of the time-series data related to the ONKALO measurements have been obtained through a series of Non-uniform Discrete Fourier Transform analyses. Datasets related to the various chainages and the investigation niche at ONKALO have been studied. The largest variation in the temperature amplitude of data

  15. Analysis of Fire Data in Oman

    Directory of Open Access Journals (Sweden)

    K.S. Al-Jabri

    2003-06-01

    Full Text Available The aim of this study is to illustrate the problem of fire accidents in the Sultanate of Oman and their causes, in order to find out how the existing data could be used as a basis to improve fire resistance, to detect the weak points (vulnerability to fire) in existing structures, and to minimize fire occurrences in places where their frequency is high. The study also provides useful recommendations with regard to fire safety, including causes, people's awareness and education, etc. Fire data in Oman were collected from two sources: the Directorate General of Civil Defence (Public Relations Department) and the Sultan Qaboos University library. The collected data represent the number of fires in Oman during the last decade, including fire distribution by type and averages. The analysis shows a linear increase over time in the number of fire accidents during the last decade. Many factors are considered as potential sources, which are explained in the paper, and suggestions are made for possible control.
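
    The reported linear trend corresponds to an ordinary least-squares fit of yearly counts against time. A sketch with invented numbers (not the Omani statistics from the paper):

        import numpy as np

        years = np.arange(1992, 2002)                 # hypothetical decade
        fires = np.array([410, 455, 430, 498, 520,
                          515, 560, 590, 605, 640])   # invented yearly counts

        slope, intercept = np.polyfit(years, fires, 1)  # least-squares line
        print(f"trend: {slope:.1f} additional fires per year")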

  16. MCNP output data analysis with ROOT (MODAR)

    Science.gov (United States)

    Carasco, C.

    2010-12-01

    MCNP Output Data Analysis with ROOT (MODAR) is a tool based on CERN's ROOT software. MODAR has been designed to handle time-energy data issued by MCNP simulations of neutron inspection devices using the associated particle technique. MODAR exploits ROOT's Graphical User Interface and functionalities to visualize and process MCNP simulation results in a fast and user-friendly way. MODAR makes it possible to take into account the detection system time resolution (which is not possible with MCNP) as well as detector energy response functions and counting statistics in a straightforward way.

    New version program summary

    Program title: MODAR
    Catalogue identifier: AEGA_v1_1
    Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEGA_v1_1.html
    Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
    Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html
    No. of lines in distributed program, including test data, etc.: 150 927
    No. of bytes in distributed program, including test data, etc.: 4 981 633
    Distribution format: tar.gz
    Programming language: C++
    Computer: Most Unix workstations and PCs
    Operating system: Most Unix systems, Linux and Windows, provided the ROOT package has been installed. Examples were tested under Suse Linux and Windows XP.
    RAM: Depends on the size of the MCNP output file. The example presented in the article, which involves three two-dimensional 139×740-bin histograms, allocates about 60 MB, running under ROOT and including consumption by ROOT itself.
    Classification: 17.6
    Catalogue identifier of previous version: AEGA_v1_0
    Journal reference of previous version: Comput. Phys. Comm. 181 (2010) 1161
    External routines: ROOT version 5.24.00 (http://root.cern.ch/drupal/)
    Does the new version supersede the previous version?: Yes
    Nature of problem: The output of a MCNP simulation is an ascii file. The data processing is usually performed by copying and pasting the relevant parts of the ascii

  17. 40 CFR 51.366 - Data analysis and reporting.

    Science.gov (United States)

    2010-07-01

    § 51.366 Data analysis and reporting (40 CFR, Protection of Environment). Data analysis and reporting are required to allow for..., including the results of an analysis of the registration data base; (ii) The percentage of motorist...

  18. Sandia National Laboratories analysis code data base

    Energy Technology Data Exchange (ETDEWEB)

    Peterson, C.W.

    1994-11-01

    Sandia National Laboratories' mission is to solve important problems in the areas of national defense, energy security, environmental integrity, and industrial technology. The Laboratories' strategy for accomplishing this mission is to conduct research to provide an understanding of the important physical phenomena underlying any problem, and then to construct validated computational models of the phenomena which can be used as tools to solve the problem. In the course of implementing this strategy, Sandia's technical staff has produced a wide variety of numerical problem-solving tools which they use regularly in the design, analysis, performance prediction, and optimization of Sandia components, systems and manufacturing processes. This report provides the relevant technical and accessibility data on the numerical codes used at Sandia, including information on the technical competency or capability area that each code addresses, code 'ownership' and release status, and references describing the physical models and numerical implementation.

  19. Latent class models in financial data analysis

    Directory of Open Access Journals (Sweden)

    Attilio Gardini

    2007-10-01

    Full Text Available This paper deals with optimal international portfolio choice by developing a latent class approach based on the distinction between international and non-international investors. On the basis of micro data, we analyze the effects of many social, demographic, economic and financial characteristics on the probability of being an international investor. Traditional measures of equity home bias do not allow for the existence of international investment rationing operators. On the contrary, by resorting to latent class analysis it is possible to detect the unobservable distinction between international investors and investors who are precluded from operating in international financial markets and, therefore, to evaluate the role of these unobservable constraints on equity home bias.

  20. Social phenomena from data analysis to models

    CERN Document Server

    Perra, Nicola

    2015-01-01

    This book focuses on the new possibilities and approaches to social modeling currently being made possible by an unprecedented variety of datasets generated by our interactions with modern technologies. This area has witnessed a veritable explosion of activity over the last few years, yielding many interesting and useful results. Our aim is to provide an overview of the state of the art in this area of research, merging an extremely heterogeneous array of datasets and models. Social Phenomena: From Data Analysis to Models is divided into two parts. Part I deals with modeling social behavior under normal conditions: How we live, travel, collaborate and interact with each other in our daily lives. Part II deals with societal behavior under exceptional conditions: Protests, armed insurgencies, terrorist attacks, and reactions to infectious diseases. This book offers an overview of one of the most fertile emerging fields bringing together practitioners from scientific communities as diverse as social sciences, p...

  1. Sandia National Laboratories analysis code data base

    Science.gov (United States)

    Peterson, C. W.

    1994-11-01

    Sandia National Laboratories' mission is to solve important problems in the areas of national defense, energy security, environmental integrity, and industrial technology. The laboratories' strategy for accomplishing this mission is to conduct research to provide an understanding of the important physical phenomena underlying any problem, and then to construct validated computational models of the phenomena which can be used as tools to solve the problem. In the course of implementing this strategy, Sandia's technical staff has produced a wide variety of numerical problem-solving tools which they use regularly in the design, analysis, performance prediction, and optimization of Sandia components, systems, and manufacturing processes. This report provides the relevant technical and accessibility data on the numerical codes used at Sandia, including information on the technical competency or capability area that each code addresses, code 'ownership' and release status, and references describing the physical models and numerical implementation.

  2. Data envelopment analysis of randomized ranks

    Directory of Open Access Journals (Sweden)

    Sant'Anna Annibal P.

    2002-01-01

    Full Text Available Probabilities and odds, derived from vectors of ranks, are here compared as measures of efficiency of decision-making units (DMUs). These measures are computed with the goal of providing preliminary information before starting a Data Envelopment Analysis (DEA) or the application of any other evaluation or preference-composition methodology. Preference, quality and productivity evaluations are usually measured with errors or are subject to the influence of other random disturbances. Reducing evaluations to ranks and treating the ranks as estimates of location parameters of random variables, we are able to compute the probability of each DMU being classified as the best according to the consumption of each input and the production of each output. Employing the probabilities of being the best as efficiency measures, we stretch distances between the most efficient units. We combine these partial probabilities in a global efficiency score determined in terms of proximity to the efficiency frontier.
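
    Treating ranks as location parameters of noisy evaluations, the probability of each DMU being the best can be estimated by simulation. A minimal sketch (normal noise with unit variance is my assumption, not the paper's model):

        import numpy as np

        rng = np.random.default_rng(0)

        ranks = np.array([1.0, 2.0, 3.0, 4.0])  # hypothetical ranks, 1 = best
        draws = rng.normal(ranks, 1.0, size=(100_000, ranks.size))
        p_best = (np.bincount(draws.argmin(axis=1), minlength=ranks.size)
                  / draws.shape[0])
        print(p_best)  # probability each DMU is classified as the best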

  3. Data analysis for seismic motion characteristics

    International Nuclear Information System (INIS)

    Ishimaru, Tsuneari; Kohriya, Yorihide

    2002-10-01

    This data analysis is aimed at studying the characteristics of the amplification of acceleration amplitude from deep underground to the surface, and is one of several continuing studies on the effects of earthquake motion. Seismic wave records were observed via a center array located in Shibata-cho, Miyagi Prefecture, which is part of the Kumagai-Gumi Array System for Strong Earthquake Motion (KASSEM) located on the Pacific coast in Miyagi and Fukushima Prefectures. Using acceleration waves obtained from earthquake observations, the amplification ratios of maximum acceleration amplitude and of root mean square acceleration amplitude, both relative to the deepest observation point, were estimated. The seismic motion amplification characteristics of this study were compared with the analyzed data at the Kamaishi Mine (Kamaishi, Iwate Prefecture). The obtained results are as follows. The amplification ratios estimated from maximum acceleration amplitude and root mean square acceleration amplitude are almost constant in soft rock formations. However, amplification ratios at the surface in diluvium and alluvium are about three to four times larger than the ratios in soft rock formations. The amplification ratios estimated from root mean square acceleration amplitude are less dispersed than those estimated from maximum acceleration amplitude. Comparing the results of this analysis with the results obtained at the Kamaishi Mine, despite the difference in rock types and geologic formations at the observation points, there is a tendency for the amplification ratios at both points to be relatively small in the rock foundation and to gradually increase toward the ground surface. (author)
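
    Both amplification measures reduce to simple ratios between a shallow record and the deepest one. A sketch with synthetic stand-ins for the observed acceleration traces:

        import numpy as np

        def amplification_ratios(surface, deep):
            """Amplification of surface motion relative to the deepest
            observation point: peak and root-mean-square amplitudes."""
            peak = np.max(np.abs(surface)) / np.max(np.abs(deep))
            rms = np.sqrt(np.mean(surface ** 2)) / np.sqrt(np.mean(deep ** 2))
            return peak, rms

        rng = np.random.default_rng(0)
        deep = rng.normal(0.0, 1.0, 4096)                  # synthetic trace
        surface = 3.5 * deep + rng.normal(0.0, 0.5, 4096)  # amplified + noise
        print(amplification_ratios(surface, deep))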

  4. Principles of gene microarray data analysis.

    Science.gov (United States)

    Mocellin, Simone; Rossi, Carlo Riccardo

    2007-01-01

    The development of several gene expression profiling methods, such as comparative genomic hybridization (CGH), differential display, serial analysis of gene expression (SAGE), and gene microarray, together with the sequencing of the human genome, has provided an opportunity to monitor and investigate the complex cascade of molecular events leading to tumor development and progression. The availability of such large amounts of information has shifted the attention of scientists towards a nonreductionist approach to biological phenomena. High throughput technologies can be used to follow changing patterns of gene expression over time. Among them, gene microarray has become prominent because it is easier to use, does not require large-scale DNA sequencing, and allows for the parallel quantification of thousands of genes from multiple samples. Gene microarray technology is rapidly spreading worldwide and has the potential to drastically change the therapeutic approach to patients affected with tumor. Therefore, it is of paramount importance for both researchers and clinicians to know the principles underlying the analysis of the huge amount of data generated with microarray technology.

  5. Abstract Interfaces for Data Analysis Component Architecture for Data Analysis Tools

    CERN Document Server

    Barrand, G; Dönszelmann, M; Johnson, A; Pfeiffer, A

    2001-01-01

    The fast turnover of software technologies, in particular in the domain of interactivity (covering user interfaces and visualisation), makes it difficult for a small group of people to produce complete and polished software tools before the underlying technologies make them obsolete. At the HepVis '99 workshop, a working group was formed to improve the production of software tools for data analysis in HENP. Besides promoting a distributed development organisation, one goal of the group is to systematically design a set of abstract interfaces based on modern OO analysis and OO design techniques. An initial domain analysis has come up with several categories (components) found in typical data analysis tools: Histograms, Ntuples, Functions, Vectors, Fitter, Plotter, Analyzer and Controller. Special emphasis was put on reducing the couplings between the categories to a minimum, thus optimising re-use and maintainability of each component individually. The interfaces have been defined in Java and C++ and i...

  6. Data Collection, Collaboration, Analysis, and Publication Using the Open Data Repository's (ODR) Data Publisher

    Science.gov (United States)

    Lafuente, B.; Stone, N.; Bristow, T.; Keller, R. M.; Blake, D. F.; Downs, R. T.; Pires, A.; Dateo, C. E.; Fonda, M.

    2017-12-01

    In development for nearly four years, the Open Data Repository's (ODR) Data Publisher software has become a useful tool for researchers' data needs. Data Publisher facilitates the creation of customized databases with flexible permission sets that allow researchers to share data collaboratively while improving data discovery and maintaining ownership rights. The open source software provides an end-to-end solution from collection to final repository publication. A web-based interface allows researchers to enter data, view data, and conduct analysis using any programming language supported by JupyterHub (http://www.jupyterhub.org). This toolset makes it possible for a researcher to store and manipulate their data in the cloud from any internet-capable device. Data can be embargoed in the system until a date selected by the researcher. For instance, open publication can be set to a date that coincides with publication of data analysis in a third party journal. In conjunction with teams at NASA Ames and the University of Arizona, a number of pilot studies are being conducted to guide the software development so that it allows them to publish and share their data. These pilots include (1) the Astrobiology Habitable Environments Database (AHED), a central searchable repository designed to promote and facilitate the integration and sharing of all the data generated by the diverse disciplines in astrobiology; (2) a database containing the raw and derived data products from the CheMin instrument on the MSL rover Curiosity (http://odr.io/CheMin), featuring a versatile graphing system, instructions and analytical tools to process the data, and a capability to download data in different formats; and (3) the Mineral Evolution project, which by correlating the diversity of mineral species with their ages, localities, and other measurable properties aims to understand how the episodes of planetary accretion and differentiation, plate tectonics, and origin of life lead to a

  7. Computer programs for analysis of geophysical data

    Energy Technology Data Exchange (ETDEWEB)

    Rozhkov, M.; Nakanishi, K.

    1994-06-01

    This project is oriented toward the application of the mobile seismic array data analysis technique in seismic investigations of the Earth (the noise-array method). The technique falls into the class of emission tomography methods but, in contrast to classic tomography, 3-D images of the microseismic activity of the media are obtained by passive seismic antenna scanning of the half-space, rather than by solution of the inverse Radon's problem. It is reasonable to expect that areas of geothermal activity, active faults, areas of volcanic tremors and hydrocarbon deposits act as sources of intense internal microseismic activity or as effective sources for scattered (secondary) waves. The conventional approaches of seismic investigations of a geological medium include measurements of time-limited determinate signals from artificial or natural sources. However, the continuous seismic oscillations, like endogenous microseisms, coda and scattering waves, can give very important information about the structure of the Earth. The presence of microseismic sources or inhomogeneities within the Earth results in the appearance of coherent seismic components in a stochastic wave field recorded on the surface by a seismic array. By careful processing of seismic array data, these coherent components can be used to develop a 3-D model of the microseismic activity of the media or images of the noisy objects. Thus, in contrast to classic seismology where narrow windows are used to get the best time resolution of seismic signals, our model requires long record length for the best spatial resolution.
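
    The coherent-stacking idea described above can be illustrated with a generic plane-wave delay-and-sum beamformer. This is a textbook sketch, not the project's noise-array code; the array geometry, sampling rate, source pulse and slowness grid below are all invented for the example.

    ```python
    import numpy as np

    def beam_power(traces, coords, fs, slowness_grid):
        """Delay-and-sum power of a plane wave over a grid of horizontal
        slowness vectors. traces: (n_sensors, n_samples); coords: (n_sensors, 2)
        in km; slowness in s/km. Sign convention assumes an arrival delay
        tau_i = coords_i . s at sensor i."""
        n_samp = traces.shape[1]
        freqs = np.fft.rfftfreq(n_samp, d=1.0 / fs)
        spectra = np.fft.rfft(traces, axis=1)
        powers = []
        for s in slowness_grid:
            delays = coords @ s
            # Undo each sensor's delay in the frequency domain, then stack.
            aligned = spectra * np.exp(2j * np.pi * freqs[None, :] * delays[:, None])
            powers.append(np.sum(np.abs(aligned.mean(axis=0)) ** 2))
        return np.array(powers)

    # Synthetic broadband pulse crossing a 5-sensor array at slowness (0.2, 0.1) s/km.
    rng = np.random.default_rng(2)
    coords = rng.uniform(-1.0, 1.0, (5, 2))
    fs, n = 50.0, 512
    t = np.arange(n) / fs
    s_true = np.array([0.2, 0.1])
    traces = np.vstack([np.exp(-((t - 2.0 - coords[i] @ s_true) / 0.2) ** 2)
                        for i in range(5)])
    grid = [np.array([sx, sy]) for sx in np.linspace(-0.5, 0.5, 21)
            for sy in np.linspace(-0.5, 0.5, 21)]
    best = grid[int(np.argmax(beam_power(traces, coords, fs, grid)))]
    print("estimated slowness (s/km):", best)
    ```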

  8. Computer programs for analysis of geophysical data

    International Nuclear Information System (INIS)

    Rozhkov, M.; Nakanishi, K.

    1994-06-01

    This project is oriented toward the application of the mobile seismic array data analysis technique in seismic investigations of the Earth (the noise-array method). The technique falls into the class of emission tomography methods but, in contrast to classic tomography, 3-D images of the microseismic activity of the media are obtained by passive seismic antenna scanning of the half-space, rather than by solution of the inverse Radon's problem. It is reasonable to expect that areas of geothermal activity, active faults, areas of volcanic tremors and hydrocarbon deposits act as sources of intense internal microseismic activity or as effective sources for scattered (secondary) waves. The conventional approaches of seismic investigations of a geological medium include measurements of time-limited determinate signals from artificial or natural sources. However, the continuous seismic oscillations, like endogenous microseisms, coda and scattering waves, can give very important information about the structure of the Earth. The presence of microseismic sources or inhomogeneities within the Earth results in the appearance of coherent seismic components in a stochastic wave field recorded on the surface by a seismic array. By careful processing of seismic array data, these coherent components can be used to develop a 3-D model of the microseismic activity of the media or images of the noisy objects. Thus, in contrast to classic seismology where narrow windows are used to get the best time resolution of seismic signals, our model requires long record length for the best spatial resolution

  9. Analysis of crystallization data in the Protein Data Bank

    International Nuclear Information System (INIS)

    Kirkwood, Jobie; Hargreaves, David; O’Keefe, Simon; Wilson, Julie

    2015-01-01

    In a large-scale study using data from the Protein Data Bank, some of the many reported findings regarding the crystallization of proteins were investigated. The Protein Data Bank (PDB) is the largest available repository of solved protein structures and contains a wealth of information on successful crystallization. Many centres have used their own experimental data to draw conclusions about proteins and the conditions in which they crystallize. Here, data from the PDB were used to reanalyse some of these results. The most successful crystallization reagents were identified, the link between solution pH and the isoelectric point of the protein was investigated and the possibility of predicting whether a protein will crystallize was explored

  10. Analysis of crystallization data in the Protein Data Bank

    Energy Technology Data Exchange (ETDEWEB)

    Kirkwood, Jobie [University of York, York YO10 5DD (United Kingdom); Hargreaves, David [AstraZeneca, Darwin Building, Cambridge Science Park, Cambridge CB4 0WG (United Kingdom); O’Keefe, Simon [University of York, York YO10 5DD (United Kingdom); Wilson, Julie, E-mail: julie.wilson@york.ac.uk [University of York, York YO10 5DD (United Kingdom); University of York, York YO10 5DD (United Kingdom)

    2015-09-23

    In a large-scale study using data from the Protein Data Bank, some of the many reported findings regarding the crystallization of proteins were investigated. The Protein Data Bank (PDB) is the largest available repository of solved protein structures and contains a wealth of information on successful crystallization. Many centres have used their own experimental data to draw conclusions about proteins and the conditions in which they crystallize. Here, data from the PDB were used to reanalyse some of these results. The most successful crystallization reagents were identified, the link between solution pH and the isoelectric point of the protein was investigated and the possibility of predicting whether a protein will crystallize was explored.

  11. Abstract interfaces for data analysis - component architecture for data analysis tools

    International Nuclear Information System (INIS)

    Barrand, G.; Binko, P.; Doenszelmann, M.; Pfeiffer, A.; Johnson, A.

    2001-01-01

    The fast turnover of software technologies, in particular in the domain of interactivity (covering user interface and visualisation), makes it difficult for a small group of people to produce complete and polished software-tools before the underlying technologies make them obsolete. At the HepVis'99 workshop, a working group was formed to improve the production of software tools for data analysis in HENP. Besides promoting a distributed development organisation, one goal of the group is to systematically design a set of abstract interfaces based on using modern OO analysis and OO design techniques. An initial domain analysis has come up with several categories (components) found in typical data analysis tools: Histograms, Ntuples, Functions, Vectors, Fitter, Plotter, Analyzer and Controller. Special emphasis was put on reducing the couplings between the categories to a minimum, thus optimising re-use and maintainability of any component individually. The interfaces have been defined in Java and C++ and implementations exist in the form of libraries and tools using C++ (Anaphe/Lizard, OpenScientist) and Java (Java Analysis Studio). A special implementation aims at accessing the Java libraries (through their Abstract Interfaces) from C++. The authors give an overview of the architecture and design of the various components for data analysis as discussed in AIDA
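
    The decoupling goal reads naturally as a set of abstract base classes. AIDA defined its interfaces in Java and C++; the Python sketch below only illustrates the idea, and the method names are hypothetical rather than actual AIDA signatures.

    ```python
    from abc import ABC, abstractmethod

    class IHistogram(ABC):
        """Histogram component: behaviour only, no storage details."""
        @abstractmethod
        def fill(self, value: float, weight: float = 1.0) -> None: ...
        @abstractmethod
        def bin_contents(self) -> list: ...

    class IFitter(ABC):
        """Fitter depends only on the IHistogram interface, not on any
        concrete histogram class."""
        @abstractmethod
        def fit(self, hist: IHistogram, model: str) -> dict: ...

    class IPlotter(ABC):
        """Plotter is equally decoupled; any IHistogram can be drawn."""
        @abstractmethod
        def plot(self, hist: IHistogram) -> None: ...
    ```

    Because the fitter and plotter see only the abstract histogram, each component can be replaced or reimplemented individually, which is the re-use and maintainability goal the working group describes.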

  12. Gravity Probe B data analysis: II. Science data and their handling prior to the final analysis

    International Nuclear Information System (INIS)

    Silbergleit, A S; Conklin, J W; Heifetz, M I; Holmes, T; Li, J; Mandel, I; Solomonik, V G; Stahl, K; P W Worden Jr; Everitt, C W F; Adams, M; Berberian, J E; Bencze, W; Clarke, B; Al-Jadaan, A; Keiser, G M; Kozaczuk, J A; Al-Meshari, M; Muhlfelder, B; Salomon, M

    2015-01-01

    The results of the Gravity Probe B relativity science mission published in Everitt et al (2011 Phys. Rev. Lett. 106 221101) required a rather sophisticated analysis of experimental data due to several unexpected complications discovered on-orbit. We give a detailed description of the Gravity Probe B data reduction. In the first paper (Silbergleit et al Class. Quantum Grav. 22 224018) we derived the measurement models, i.e., mathematical expressions for all the signals to analyze. In the third paper (Conklin et al Class. Quantum Grav. 22 224020) we explain the estimation algorithms and their program implementation, and discuss the experiment results obtained through data reduction. This paper deals with the science data preparation for the main analysis yielding the relativistic drift estimates. (paper)

  13. Exploring functional data analysis and wavelet principal component analysis on ecstasy (MDMA) wastewater data

    Directory of Open Access Journals (Sweden)

    Stefania Salvatore

    2016-07-01

    Full Text Available Abstract Background Wastewater-based epidemiology (WBE) is a novel approach in drug use epidemiology which aims to monitor the extent of use of various drugs in a community. In this study, we investigate functional principal component analysis (FPCA) as a tool for analysing WBE data and compare it to traditional principal component analysis (PCA) and to wavelet principal component analysis (WPCA), which is more flexible temporally. Methods We analysed temporal wastewater data from 42 European cities collected daily over one week in March 2013. The main temporal features of ecstasy (MDMA) were extracted using FPCA with both Fourier and B-spline basis functions and three different smoothing parameters, along with PCA and WPCA with different mother wavelets and shrinkage rules. The stability of FPCA was explored through bootstrapping and analysis of sensitivity to missing data. Results The first three principal components (PCs), functional principal components (FPCs) and wavelet principal components (WPCs) explained 87.5-99.6% of the temporal variation between cities, depending on the choice of basis and smoothing. The extracted temporal features from PCA, FPCA and WPCA were consistent. FPCA using the Fourier basis and common-optimal smoothing was the most stable and least sensitive to missing data. Conclusion FPCA is a flexible and analytically tractable method for analysing temporal changes in wastewater data, and is robust to missing data. WPCA did not reveal any rapid temporal changes in the data not captured by FPCA. Overall the results suggest FPCA with Fourier basis functions and a common-optimal smoothing parameter as the most accurate approach when analysing WBE data.
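
    A compact way to see what FPCA does here: project each city's weekly curve onto a Fourier basis with a ridge penalty, then run an SVD on the coefficient matrix. This sketch ignores the basis Gram-matrix correction a full FPCA would apply, and the 42-city data below are simulated, not the actual wastewater measurements.

    ```python
    import numpy as np

    def fourier_basis(t, n_basis=5):
        """Fourier basis (constant, then sin/cos pairs) on t rescaled to [0, 1]."""
        u = (t - t.min()) / (t.max() - t.min())
        cols = [np.ones_like(u)]
        for k in range(1, (n_basis - 1) // 2 + 1):
            cols += [np.sin(2 * np.pi * k * u), np.cos(2 * np.pi * k * u)]
        return np.column_stack(cols[:n_basis])

    def fpca(curves, t, n_basis=5, penalty=1e-2):
        """FPCA sketch: ridge-penalised basis fit per curve, then SVD of the
        centred coefficient matrix. curves has shape (n_curves, n_times)."""
        B = fourier_basis(t, n_basis)                          # (n_times, n_basis)
        coef = np.linalg.solve(B.T @ B + penalty * np.eye(n_basis),
                               B.T @ curves.T).T               # (n_curves, n_basis)
        coef = coef - coef.mean(axis=0)
        U, s, Vt = np.linalg.svd(coef, full_matrices=False)
        # FPCs as functions of t, per-curve scores, variance explained.
        return B @ Vt.T, U * s, s ** 2 / np.sum(s ** 2)

    # Simulated stand-in: 42 cities, 7 daily values with a weekly cycle.
    rng = np.random.default_rng(1)
    t = np.arange(7.0)
    curves = 10 + 3 * np.sin(2 * np.pi * t / 7) + rng.normal(0, 1, (42, 7))
    fpcs, scores, explained = fpca(curves, t)
    print("share of variation in first three FPCs:", explained[:3].round(3))
    ```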

  14. Analysis of crystallization data in the Protein Data Bank.

    Science.gov (United States)

    Kirkwood, Jobie; Hargreaves, David; O'Keefe, Simon; Wilson, Julie

    2015-10-01

    The Protein Data Bank (PDB) is the largest available repository of solved protein structures and contains a wealth of information on successful crystallization. Many centres have used their own experimental data to draw conclusions about proteins and the conditions in which they crystallize. Here, data from the PDB were used to reanalyse some of these results. The most successful crystallization reagents were identified, the link between solution pH and the isoelectric point of the protein was investigated and the possibility of predicting whether a protein will crystallize was explored.

  15. Extending the LWS Data Environment: Distributed Data Processing and Analysis

    Science.gov (United States)

    Narock, Thomas

    2005-01-01

    The final stages of this work saw changes to the original framework, as well as the completion and integration of several data processing services. Initially, it was thought that a peer-to-peer architecture was necessary to make this work possible. The peer-to-peer architecture provided many benefits, including the dynamic discovery of new services that would be continually added. A prototype example was built and, while it showed promise, a major disadvantage was that it was not easily integrated into the existing data environment. While the peer-to-peer system worked well for finding and accessing distributed data processing services, its use was limited by the difficulty in calling it from existing tools and services. After collaborations with members of the data community, it was determined that our data processing system was of high value and that a new interface should be pursued in order for the community to take full advantage of it. As such, the framework was modified from a peer-to-peer architecture to a more traditional web service approach. Following this change, multiple data processing services were added, including coordinate transformations and subsetting of data. A collaboration with the Virtual Heliospheric Observatory (VHO) assisted with integrating the new architecture into the VHO. This allows anyone using the VHO to search for data and then pass that data through our processing services prior to downloading it. As a second attempt at demonstrating the new system, a collaboration was established with the Collaborative Sun Earth Connector (CoSEC) group at Lockheed Martin. This group is working on a graphical user interface to the Virtual Observatories and data processing software. The intent is to provide a high-level, easy-to-use graphical interface that will allow access to the existing Virtual Observatories and data processing services from one convenient application. Working with the CoSEC group we provided access to our data

  16. A Hierarchical Visualization Analysis Model of Power Big Data

    Science.gov (United States)

    Li, Yongjie; Wang, Zheng; Hao, Yang

    2018-01-01

    Based on the concept of integrating VR scenes with power big data analysis, a hierarchical visualization analysis model of power big data is proposed, in which levels are designed to target different abstract modules such as transaction, engine, computation, control and storage. The traditionally separate modules of power data storage, data mining and analysis, and data visualization are integrated into one platform by this model. It provides a visual analysis solution for power big data.

  17. Comparative analysis of data mining techniques for business data

    Science.gov (United States)

    Jamil, Jastini Mohd; Shaharanee, Izwan Nizal Mohd

    2014-12-01

    Data mining is the process of employing one or more computer learning techniques to automatically analyze and extract knowledge from data contained within a database. Companies are using this tool to further understand their customers, to design targeted sales and marketing campaigns, to predict which products customers will buy and the frequency of purchase, and to spot trends in customer preferences that can lead to new product development. In this paper, we take a systematic approach to exploring several data mining techniques in business applications. The experimental results reveal that all the data mining techniques accomplish their goals, but each technique has its own characteristics and specifications that determine its accuracy, proficiency and suitability.

  18. Tutorial: Asteroseismic Data Analysis with DIAMONDS

    Science.gov (United States)

    Corsaro, Enrico

    Since the advent of space-based photometric missions such as CoRoT and NASA's Kepler, asteroseismology has acquired a central role in our understanding of stellar physics. The Kepler spacecraft, especially, is still releasing excellent photometric observations that contain a large amount of information not yet investigated. To exploit the full potential of these data, sophisticated and robust analysis tools are now essential, so that further constraints on stellar structure and evolutionary models can be obtained. In addition, extracting detailed asteroseismic properties for many stars can yield new insights into their correlations with fundamental stellar properties and dynamics. After a brief introduction to the Bayesian notion of probability, I describe the code Diamonds for Bayesian parameter estimation and model comparison by means of the nested sampling Monte Carlo (NSMC) algorithm. NSMC constitutes an efficient and powerful method, as a replacement for standard Markov chain Monte Carlo, very suitable for the high-dimensional and multimodal problems that are typical of detailed asteroseismic analyses, such as the fitting and mode identification of individual oscillation modes in stars (known as peak-bagging). Diamonds is able to provide robust results for statistical inferences involving tens of individual oscillation modes, while at the same time preserving a considerable computational efficiency for identifying the solution. In the tutorial, I will present the fitting of the stellar background signal and the peak-bagging analysis of the oscillation modes in a red-giant star, providing an example of using Bayesian evidence to assess the significance of the fitted oscillation peaks.
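
    To make the NSMC idea concrete, here is a deliberately naive nested sampling loop on a toy two-parameter problem. It is not the Diamonds implementation: Diamonds uses far more efficient strategies for drawing new points above the likelihood floor than the rejection step shown here.

    ```python
    import numpy as np

    def nested_sampling(log_like, prior_draw, n_live=100, n_iter=600, rng=None):
        """Minimal nested sampling evidence estimate (toy version).
        New points above the current likelihood floor are found by naive
        rejection sampling from the prior, which is fine for this toy problem
        but hopeless in high dimensions."""
        rng = rng or np.random.default_rng()
        live = np.array([prior_draw(rng) for _ in range(n_live)])
        logL = np.array([log_like(p) for p in live])
        logZ, logX_prev = -np.inf, 0.0
        for i in range(1, n_iter + 1):
            worst = int(np.argmin(logL))
            logX = -i / n_live                 # expected log prior volume after i steps
            logw = logX_prev + np.log1p(-np.exp(logX - logX_prev))
            logZ = np.logaddexp(logZ, logw + logL[worst])
            logX_prev = logX
            floor = logL[worst]
            while True:                        # rejection step: draw above the floor
                cand = prior_draw(rng)
                cand_logL = log_like(cand)
                if cand_logL > floor:
                    break
            live[worst], logL[worst] = cand, cand_logL
        # Spread the leftover prior volume over the remaining live points.
        log_mean_L = logL.max() + np.log(np.mean(np.exp(logL - logL.max())))
        return np.logaddexp(logZ, logX_prev + log_mean_L)

    # Toy problem: unit 2-D Gaussian likelihood, uniform prior on [-5, 5]^2.
    # Analytic log-evidence: log(1/100), about -4.61.
    rng = np.random.default_rng(42)
    log_like = lambda p: -0.5 * float(p @ p) - np.log(2 * np.pi)
    prior_draw = lambda r: r.uniform(-5.0, 5.0, size=2)
    print("estimated log-evidence:", nested_sampling(log_like, prior_draw, rng=rng))
    ```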

  19. Bayesian Correlation Analysis for Sequence Count Data.

    Directory of Open Access Journals (Sweden)

    Daniel Sánchez-Taltavull

    Full Text Available Evaluating the similarity of different measured variables is a fundamental task of statistics, and a key part of many bioinformatics algorithms. Here we propose a Bayesian scheme for estimating the correlation between different entities' measurements based on high-throughput sequencing data. These entities could be different genes or miRNAs whose expression is measured by RNA-seq, different transcription factors or histone marks whose expression is measured by ChIP-seq, or even combinations of different types of entities. Our Bayesian formulation accounts for both measured signal levels and uncertainty in those levels, due to varying sequencing depth in different experiments and to varying absolute levels of individual entities, both of which affect the precision of the measurements. In comparison with a traditional Pearson correlation analysis, we show that our Bayesian correlation analysis retains high correlations when measurement confidence is high, but suppresses correlations when measurement confidence is low, especially for entities with low signal levels. In addition, we consider the influence of priors on the Bayesian correlation estimate. Perhaps surprisingly, we show that naive, uniform priors on entities' signal levels can lead to highly biased correlation estimates, particularly when different experiments have widely varying sequencing depths. However, we propose two alternative priors that provably mitigate this problem. We also prove that, like traditional Pearson correlation, our Bayesian correlation calculation constitutes a kernel in the machine learning sense, and thus can be used as a similarity measure in any kernel-based machine learning algorithm. We demonstrate our approach on two RNA-seq datasets and one miRNA-seq dataset.
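
    One simple realisation of the idea, not necessarily the authors' exact model: give each entity's per-experiment rate a Gamma prior, update it with the observed Poisson counts scaled by sequencing depth, and average the Pearson correlation over posterior draws. The priors, depths and counts below are all illustrative.

    ```python
    import numpy as np

    def bayesian_correlation(x, y, depth, a=1.0, b=1.0, n_draws=2000, rng=None):
        """Average Pearson correlation over posterior draws of two entities'
        rates: counts ~ Poisson(depth * rate), rate ~ Gamma(a, b), hence
        rate | count ~ Gamma(a + count, b + depth). Low counts give wide
        posteriors, which shrinks spurious correlation toward zero."""
        rng = rng or np.random.default_rng()
        lam_x = rng.gamma(a + x, 1.0 / (b + depth), size=(n_draws, len(x)))
        lam_y = rng.gamma(a + y, 1.0 / (b + depth), size=(n_draws, len(y)))
        cx = lam_x - lam_x.mean(axis=1, keepdims=True)
        cy = lam_y - lam_y.mean(axis=1, keepdims=True)
        r = (cx * cy).sum(axis=1) / np.sqrt(
            (cx ** 2).sum(axis=1) * (cy ** 2).sum(axis=1))
        return float(r.mean())

    rng = np.random.default_rng(7)
    depth = np.array([1.0, 0.5, 2.0, 1.5, 0.2])   # relative sequencing depths
    x = rng.poisson(depth * 20)                   # counts for two entities whose
    y = rng.poisson(depth * 18)                   # rates track the same depths
    print("Bayesian correlation:", round(bayesian_correlation(x, y, depth, rng=rng), 3))
    ```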

  20. Data analysis of backscattering LIDAR system correlated with meteorological data

    International Nuclear Information System (INIS)

    Uehara, Sandro Toshio

    2009-01-01

    In recent years there has been growing interest in monitoring the effects of human activity on the atmosphere and the climate of the planet. Remote sensing techniques have been used in many studies, including those related to global change. A backscattering LIDAR system, the first of its kind in Brazil, has been used to provide the vertical profile of the aerosol backscatter coefficient at 532 nm up to an altitude of 4-6 km above sea level. In this study, data were collected during the year 2005. These data were correlated with data from a CIMEL solar photometer and with meteorological data. The main results indicated a pattern relating the behavior of these meteorological data to the vertical distribution of the extinction coefficient obtained with the LIDAR. In periods favorable to atmospheric dispersion, that is, a rise in air temperature associated with a fall in relative humidity, an increase in atmospheric pressure and a low ventilation rate, it was possible to determine with good precision the height of the Planetary Boundary Layer (PBL), both from the vertical profile of the extinction coefficient and from the vertical profile of the potential temperature. The LIDAR technique proved to be an important tool for determining the thermodynamic structure of the atmosphere, helping to characterize the evolution of the PBL throughout the day thanks to its good spatial and temporal resolution. (author)

  1. Symbolic Data Analysis Conceptual Statistics and Data Mining

    CERN Document Server

    Billard, Lynne

    2012-01-01

    With the advent of computers, very large datasets have become routine. Standard statistical methods don't have the power or flexibility to analyse these efficiently and extract the required knowledge. An alternative approach is to summarize a large dataset in such a way that the resulting summary dataset is of a manageable size and yet retains as much of the knowledge in the original dataset as possible. One consequence of this is that the data may no longer be formatted as single values, but be represented by lists, intervals, distributions, etc. The summarized data have their own internal structure.

  2. Science gateways for biomedical big data analysis

    NARCIS (Netherlands)

    Shahand, S.

    2015-01-01

    Biomedical researchers are facing data deluge challenges such as dealing with large volume of complex heterogeneous data and complex and computationally demanding data processing methods. Such scale and complexity of biomedical research requires multi-disciplinary collaboration between scientists

  3. 40 CFR 92.131 - Smoke, data analysis.

    Science.gov (United States)

    2010-07-01

    ... 40 Protection of Environment 20 2010-07-01 2010-07-01 false Smoke, data analysis. 92.131 Section... analysis. The following procedure shall be used to analyze the smoke test data: (a) Locate each throttle... performed by direct analysis of the recorder traces, or by computer analysis of data collected by automatic...

  4. 40 CFR 86.884-13 - Data analysis.

    Science.gov (United States)

    2010-07-01

    ... 40 Protection of Environment 19 2010-07-01 2010-07-01 false Data analysis. 86.884-13 Section 86... New Diesel Heavy-Duty Engines; Smoke Exhaust Test Procedure § 86.884-13 Data analysis. The following... linearity check may be performed by direct analysis of the recorder traces, or by computer analysis of data...

  5. Using functional data analysis to analyze ecological series data

    Science.gov (United States)

    Background/Question/Methods: A frequent goal in ecology is to understand the relationships among biological organisms and their environment. Most field data are collected as scalar measurements, such that observations are recorded as a collection of data points. The observations are t...

  6. Data analysis for steam generator tubing samples

    International Nuclear Information System (INIS)

    Dodd, C.V.

    1996-07-01

    The objective of the Improved Eddy-Current ISI for Steam Generators program is to upgrade and validate eddy-current inspections, including probes, instrumentation, and data processing techniques for inservice inspection of new, used, and repaired steam generator tubes; to improve defect detection, classification and characterization as affected by diameter and thickness variations, denting, probe wobble, tube sheet, tube supports, copper and sludge deposits, even when defect types and other variables occur in combination; to transfer this advanced technology to NRC's mobile NDE laboratory and staff. This report provides a description of the application of advanced eddy-current neural network analysis methods for the detection and evaluation of common steam generator tubing flaws including axial and circumferential outer-diameter stress-corrosion cracking and intergranular attack. The report describes the training of the neural networks on tubing samples with known defects and the subsequent evaluation results for unknown samples. Evaluations were done in the presence of artifacts. Computer programs are given in the appendix

  7. Multiresolution Analysis Adapted to Irregularly Spaced Data

    Directory of Open Access Journals (Sweden)

    Anissa Mokraoui

    2009-01-01

    Full Text Available This paper investigates the mathematical background of multiresolution analysis in the specific context where the signal is represented by irregularly sampled data at known locations. The study is related to the construction of nested piecewise polynomial multiresolution spaces represented by their corresponding orthonormal bases. Using simple spline basis orthonormalization procedures involves the construction of a large family of orthonormal spline scaling bases defined on consecutive bounded intervals. However, if no more additional conditions than those coming from multiresolution are imposed on each bounded interval, the orthonormal basis is represented by a set of discontinuous scaling functions. The spline wavelet basis also has the same problem. Moreover, the dimension of the corresponding wavelet basis increases with the spline degree. An appropriate orthonormalization procedure of the basic spline space basis, whatever the degree of the spline, allows us to (i) provide continuous scaling and wavelet functions, (ii) reduce the number of wavelets to only one, and (iii) reduce the complexity of the filter bank. Examples of the multiresolution implementations illustrate that the main important features of the traditional multiresolution are also satisfied.

  8. Spring 2014 Internship Diffuser Data Analysis

    Science.gov (United States)

    Laigaie, Robert T.; Ryan, Harry M.

    2014-01-01

    showed different data dependent on section. Section 1 strains were small, in the range of 50 to 150 microstrain, which would result in stresses from 1.45 to 4.35 ksi. The yield stress of the material, A-285 Grade C Steel, is 29.7 ksi. Section 4 strain gages showed much higher values, with strains peaking at 1600 microstrain. This strain corresponds to a stress of 46.41 ksi, which is in excess of the yield stress but below the ultimate stress of 55 to 75 ksi. The decreased accelerations and strain in Section 1, and the increased accelerations and strain in Sections 3 and 4, verified the computer simulation prediction of increased plume oscillations in the lower sections of the diffuser. Hot-Fire Test 2 ran for a duration of 125 seconds. The engine operated at a slightly higher power level than Hot-Fire Test 1 for the initial 35 seconds of the test. After 35 seconds the power level was lowered to Hot-Fire Test 1 levels. The acceleration and strain data for Hot-Fire Test 2 were similar during the initial part of the test. However, just prior to the engine being lowered to the Hot-Fire Test 1 power level, the strain gage data in Section 4 showed a large decrease to strains near zero microstrain from their peak at 1500 microstrain. Future work includes further strain and acceleration data analysis and evaluation.
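
    The stress figures quoted above follow from Hooke's law, sigma = E * epsilon. A quick check with an assumed elastic modulus of 29,000 ksi, typical for carbon steel, reproduces them:

    ```python
    # Hooke's law: stress (ksi) = E (ksi) * strain (dimensionless).
    # E = 29,000 ksi is an assumed typical value for carbon steel.
    E_KSI = 29_000
    for microstrain in (50, 150, 1600):
        stress_ksi = E_KSI * microstrain * 1e-6
        print(f"{microstrain:5d} microstrain -> {stress_ksi:5.2f} ksi")
    # 50 -> 1.45 ksi, 150 -> 4.35 ksi, 1600 -> 46.40 ksi (vs. 29.7 ksi yield)
    ```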

  9. A Multimodal Data Analysis Approach for Targeted Drug Discovery Involving Topological Data Analysis (TDA).

    Science.gov (United States)

    Alagappan, Muthuraman; Jiang, Dadi; Denko, Nicholas; Koong, Albert C

    In silico drug discovery refers to a combination of computational techniques that augment our ability to discover drug compounds from compound libraries. Many such techniques exist, including virtual high-throughput screening (vHTS), high-throughput screening (HTS), and mechanisms for data storage and querying. However, at present these tools are often used independently of one another. In this chapter, we describe a new multimodal in silico technique for the hit identification and lead generation phases of traditional drug discovery. Our technique leverages the benefits of three independent methods (virtual high-throughput screening, high-throughput screening, and structural fingerprint analysis) by using a fourth technique called topological data analysis (TDA). We describe how a compound library can be independently tested with vHTS, HTS, and fingerprint analysis, and how the results can be transformed into a topological data analysis network to identify compounds from a diverse group of structural families. This process of using TDA or similar clustering methods to identify drug leads is advantageous because it provides a mechanism for choosing structurally diverse compounds while maintaining the unique advantages of already established techniques such as vHTS and HTS.
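
    A toy version of the TDA step, in the spirit of the Mapper construction: cover a filter function (imagine a vHTS docking score) with overlapping intervals, cluster the compounds within each interval, and link clusters that share compounds. The data, the parameters and the choice of single-linkage clustering are all assumptions for illustration, not the chapter's pipeline.

    ```python
    import numpy as np
    from scipy.cluster.hierarchy import fcluster, linkage

    def mapper_graph(X, filter_vals, n_bins=6, overlap=0.3, cluster_gap=2.0):
        """Toy Mapper: cover the filter range with overlapping intervals,
        single-linkage-cluster the points inside each interval, and connect
        clusters that share points. Returns (nodes, edges); each node is a
        set of point indices."""
        lo, hi = float(filter_vals.min()), float(filter_vals.max())
        width = (hi - lo) / n_bins
        nodes = []
        for i in range(n_bins):
            a = lo + i * width - overlap * width
            b = lo + (i + 1) * width + overlap * width
            idx = np.where((filter_vals >= a) & (filter_vals <= b))[0]
            if len(idx) == 1:
                nodes.append(set(idx))
            elif len(idx) >= 2:
                labels = fcluster(linkage(X[idx], method="single"),
                                  t=cluster_gap, criterion="distance")
                nodes.extend(set(idx[labels == lab]) for lab in np.unique(labels))
        edges = [(i, j) for i in range(len(nodes)) for j in range(i + 1, len(nodes))
                 if nodes[i] & nodes[j]]
        return nodes, edges

    # Invented example: rows of X as compound fingerprints, the filter as a
    # hypothetical vHTS docking score.
    rng = np.random.default_rng(4)
    X = rng.normal(size=(100, 8))
    score = X[:, 0] + 0.1 * rng.normal(size=100)
    nodes, edges = mapper_graph(X, score)
    print(len(nodes), "nodes,", len(edges), "edges")
    ```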

  10. Randomization Based Privacy Preserving Categorical Data Analysis

    Science.gov (United States)

    Guo, Ling

    2010-01-01

    The success of data mining relies on the availability of high quality data. To ensure quality data mining, effective information sharing between organizations becomes a vital requirement in today's society. Since data mining often involves sensitive information of individuals, the public has expressed a deep concern about their privacy.…

  11. Data needs for common cause failure analysis

    International Nuclear Information System (INIS)

    Parry, G.W.; Paula, H.M.; Rasmuson, D.; Whitehead, D.

    1990-01-01

    The procedures guide for common cause failure analysis published jointly by USNRC and EPRI requires a detailed historical event analysis. Recent work on the further development of the cause-defense picture of common cause failures introduced in that guide identified the information that is necessary to perform the detailed analysis in an objective manner. This paper summarizes these information needs

  12. Telecare service activity analysis using Big Data and Data Mining

    Directory of Open Access Journals (Sweden)

    Alfredo Moreno Muñoz

    2017-01-01

    Full Text Available At the present time, the use of Big Data is gaining strength and considerable relevance. The biggest companies in the social and service sectors are using Big Data technologies that allow them to store and process all the information they hold on users and, as a second step, to incorporate the knowledge gained from processing this information into users' lives, in order to improve the services offered and take the customer/company relationship to the next level. In telecare, with IP technology in the telecare unit, communication between the unit and the control centre is carried out over the internet instead of over the telephone cable. Companies will start to use these technologies to store all the information the unit sends to the control centre. With all this information, companies will be able to discover patterns in users' behaviour and detect some illnesses such as, for example, Alzheimer's disease. Most importantly, companies will have more information about the status of all the devices and sensors installed in a user's home at the moment an emergency alarm is raised.

  13. Remote Sensing Data Visualization, Fusion and Analysis via Giovanni

    Science.gov (United States)

    Leptoukh, G.; Zubko, V.; Gopalan, A.; Khayat, M.

    2007-01-01

    We describe Giovanni, the NASA Goddard-developed online visualization and analysis tool that allows users to explore various phenomena without learning remote sensing data formats or downloading voluminous data. Using MODIS aerosol data as an example, we formulate an approach to data fusion for Giovanni to further enrich online multi-sensor remote sensing data comparison and analysis.

  14. Spectral map-analysis: a method to analyze gene expression data

    OpenAIRE

    Bijnens, Luc J.M.; Lewi, Paul J.; Göhlmann, Hinrich W.; Molenberghs, Geert; Wouters, Luc

    2004-01-01

    bioinformatics; biplot; correspondence factor analysis; data mining; data visualization; gene expression data; microarray data; multivariate exploratory data analysis; principal component analysis; Spectral map analysis

  15. Data formats design of laser irradiation experiments in view of data analysis

    International Nuclear Information System (INIS)

    Su Chunxiao; Yu Xiaoqi; Yang Cunbang; Guo Su; Chen Hongsu

    2002-01-01

    The design rules for new data file formats for laser irradiation experiments are introduced. Object-oriented programs are designed for studying experimental data from the laser facilities. The new-format data files combine experimental data with diagnostic configuration data, and are used in data processing and analysis. The editing of diagnostic configuration data in the data acquisition program is also described

  16. Spacecraft Interactions Modeling and Post-Mission Data Analysis

    National Research Council Canada - National Science Library

    Bonito, N

    1996-01-01

    Software systems were designed and developed for data management, data acquisition, interactive visualization and analysis of solar arrays, tethered objects, and large object space plasma interactions...

  17. Application of computer intensive data analysis methods to the analysis of digital images and spatial data

    DEFF Research Database (Denmark)

    Windfeld, Kristian

    1992-01-01

    Computer-intensive methods for data analysis in a traditional setting has developed rapidly in the last decade. The application of and adaption of some of these methods to the analysis of multivariate digital images and spatial data are explored, evaluated and compared to well established classical...... into the projection pursuit is presented. Examples from remote sensing are given. The ACE algorithm for computing non-linear transformations for maximizing correlation is extended and applied to obtain a non-linear transformation that maximizes autocorrelation or 'signal' in a multivariate image....... This is a generalization of the minimum /maximum autocorrelation factors (MAF's) which is a linear method. The non-linear method is compared to the linear method when analyzing a multivariate TM image from Greenland. The ACE method is shown to give a more detailed decomposition of the image than the MAF-transformation...

  18. Privacy preserving for Big Data Analysis

    OpenAIRE

    Russom, Yohannes

    2013-01-01

    Master's thesis in Computer Science The Safer@Home [6] project at the University of Stavanger aims to create a smart home system capturing sensor data from homes into its data cluster. To provide assistive services through data analytics, sensor data has to be collected centrally so that knowledge discovery algorithms can be performed effectively. The information collected from such homes is often very sensitive in nature and needs to be protected while processing o...

  19. Inspection, visualisation and analysis of quantitative proteomics data

    OpenAIRE

    Gatto, Laurent

    2016-01-01

    Material Quantitative Proteomics and Data Analysis Course. 4 - 5 April 2016, Queen Hotel, Chester, UK Table D - Inspection, visualisation and analysis of quantitative proteomics data, Laurent Gatto (University of Cambridge)

  20. Data analysis method for plant experience feedback data bank

    International Nuclear Information System (INIS)

    Ployart, R.; Lannoy, A.

    1988-01-01

    French pressurized water reactors (PWRs) currently comprise about fifty units, the oldest of which have been in operation for ten years. Furthermore, these PWRs, developed according to a growth strategy of standardized plants and with a single plant operator, are quite homogeneous in their design as well as in their operating and maintenance procedures. Lastly, the improvements brought about are usually passed on to the whole of the standardized plant series concerned. In this context, the operating plant experience feedback data banks hold valuable information that is beginning to be statistically significant. The reliability-oriented methods are rather well known; those that extract information on performance and availability, capable of guiding the plant operator in decision making, are less well tested. Such decisions concern changes to operating or maintenance procedures, or technical changes, which could be decided from an assessment of the effects of previous changes, or by observing and explaining a posteriori natural evolutions in the behaviour of components. The method used within the framework of this report reveals and explains singularities, correlations, groupings and trends in the behaviour of the French PWRs

  1. Analytical fuzzy approach to biological data analysis

    Directory of Open Access Journals (Sweden)

    Weiping Zhang

    2017-03-01

    Full Text Available The assessment of the physiological state of an individual requires an objective evaluation of biological data while taking into account both measurement noise and uncertainties arising from individual factors. We suggest representing multi-dimensional medical data by means of an optimal fuzzy membership function. A carefully designed data model is introduced in a completely deterministic framework where uncertain variables are characterized by fuzzy membership functions. The study derives the analytical expressions of fuzzy membership functions on variables of the multivariate data model by maximizing the over-uncertainties-averaged-log-membership values of data samples around an initial guess. The analytical solution lends itself to a practical modeling algorithm facilitating the data classification. The experiments performed on the heartbeat interval data of 20 subjects verified that the proposed method is a competitive alternative to typically used pattern recognition and machine learning algorithms.

  2. Nano-JASMINE Data Analysis and Publication

    Science.gov (United States)

    Yamada, Y.; Hara, T.; Yoshioka, S.; Kobayashi, Y.; Gouda, N.; Miyashita, H.; Hatsutori, Y.; Lammers, U.; Michalik, D.

    2012-09-01

    The core data reduction for the Nano-JASMINE mission is planned to be done with Gaia's Astrometric Global Iterative Solution (AGIS). A collaboration between the Gaia AGIS and Nano-JASMINE teams on the Nano-JASMINE data reduction started in 2007. The Nano-JASMINE team writes code to generate AGIS input; this is called Initial Data Treatment (IDT). Identification of observed stars and their observed field of view, and determination of the colour index, differ from those of Gaia because Nano-JASMINE is an ultra-small satellite. For converting centroiding results on the detector to the celestial sphere, orbit and attitude data of the satellite are used. In Nano-JASMINE, orbit information is derived from on-board GPS data, and attitude is processed from on-board star sensor data and on-ground Kalman filtering. We also present the Nano-JASMINE goals and the status of data publication and utilization, and introduce the next Japanese space astrometry mission.

  3. Archiving, Distribution and Analysis of Solar-B Data

    Science.gov (United States)

    Shimojo, M.

    2007-10-01

    The Solar-B Mission Operation and Data Analysis (MODA) working group has been discussing the data analysis system for Solar-B data since 2001. In this paper, based on the Solar-B MODA document and recent work in Japan, we introduce the data flow from Solar-B to scientists, the data formats and data levels of Solar-B data, and the data search and provision system.

  4. Leveraging Data Analysis for Domain Experts: An Embeddable Framework for Basic Data Science Tasks

    Science.gov (United States)

    Lohrer, Johannes-Y.; Kaltenthaler, Daniel; Kröger, Peer

    2016-01-01

    In this paper, we describe a framework for data analysis that can be embedded into a base application. Since it is important to analyze the data directly inside the application where the data is entered, a tool that allows the scientists to easily work with their data, supports and motivates the execution of further analysis of their data, which…

  5. Contracting Data Analysis: Assessment of Government-Wide Trends

    Science.gov (United States)

    2017-03-01

    CONTRACTING DATA ANALYSIS Assessment of Government-wide Trends Report to Congressional Addressees March 2017...Office Highlights of GAO-17-244SP, a report to congressional addressees March 2017 CONTRACTING DATA ANALYSIS Assessment of Government-wide Trends What GAO Found GAO's analysis of government-wide contracting data found that while defense obligations to buy products and services

  6. Beyond Constant Comparison Qualitative Data Analysis: Using NVivo

    Science.gov (United States)

    Leech, Nancy L.; Onwuegbuzie, Anthony J.

    2011-01-01

    The purposes of this paper are to outline seven types of qualitative data analysis techniques, to present step-by-step guidance for conducting these analyses via a computer-assisted qualitative data analysis software program (i.e., NVivo9), and to present screenshots of the data analysis process. Specifically, the following seven analyses are…

  7. High-Level Overview of Data Needs for RE Analysis

    Energy Technology Data Exchange (ETDEWEB)

    Lopez, Anthony

    2016-12-22

    This presentation provides a high-level overview of analysis topics and associated data needs. Types of renewable energy analysis are grouped into two buckets: first, analysis of renewable energy potential; second, analysis for other goals. Data requirements are similar and build upon one another.

  8. Joint data analysis in nutritional epidemiology

    DEFF Research Database (Denmark)

    Pinart, Mariona; Nimptsch, Katharina; Bouwman, Jildau

    2018-01-01

    activity, sedentary behavior, anthropometric measures, and sociodemographic and health status), main health-related outcomes, and laboratory measurements (traditional and omics biomarkers) was developed and circulated to those European research groups participating in the ENPADASI under the strategic...... for the study data were identified to facilitate data integration. Conclusions: Combining study data sets will enable sufficiently powered, refined investigations to increase the knowledge and understanding of the relation between food, nutrition, and human health. Furthermore, the minimal requirements...

  9. Scaling Bulk Data Analysis with Mapreduce

    Science.gov (United States)

    2017-09-01

    significant to the case. One such method, borrowed from text mining techniques, is Term Frequency-Inverse Document Frequency (TF-IDF). TF-IDF is a... Keywords: distributed digital forensics, data mining, big data. ...images, distributed digital forensics tools and data mining in digital forensics. Our Methodology and Results will be covered in Chapter 4 and Chapter 5
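
    The TF-IDF weighting named above is simple to state in code: a term's weight grows with its frequency in a document and shrinks with the number of documents containing it, so terms concentrated in few documents stand out. A minimal sketch with made-up token lists:

    ```python
    import math
    from collections import Counter

    def tf_idf(docs):
        """Plain TF-IDF over tokenised documents (lists of terms). Terms common
        to most documents score near zero; terms concentrated in few documents
        score high, the property used to surface case-relevant artefacts."""
        n = len(docs)
        df = Counter(term for doc in docs for term in set(doc))
        scores = []
        for doc in docs:
            tf = Counter(doc)
            scores.append({t: (c / len(doc)) * math.log(n / df[t])
                           for t, c in tf.items()})
        return scores

    docs = [["invoice", "wire", "transfer"],
            ["wire", "photo", "holiday"],
            ["invoice", "wire", "offshore", "transfer"]]
    for i, s in enumerate(tf_idf(docs)):
        print(i, sorted(s.items(), key=lambda kv: -kv[1])[:2])
    ```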

  10. Monitoring and analysis of data in cyberspace

    Science.gov (United States)

    Schwuttke, Ursula M. (Inventor); Angelino, Robert (Inventor)

    2001-01-01

    Information from monitored systems is displayed in cyberspace representations defining a three-dimensional virtual universe. Fixed and dynamic data parameter outputs from the monitored systems are visually represented as graphic objects that are positioned in the virtual universe based on relationships to the system and to the data parameter categories. Attributes and values of the data parameters are indicated by manipulating properties of the graphic object such as position, color, shape, and motion.

  11. Identification of noise in linear data sets by factor analysis

    International Nuclear Information System (INIS)

    Roscoe, B.A.; Hopke, Ph.K.

    1982-01-01

    A technique which has the ability to identify bad data points after the data have been generated is classical factor analysis. The ability of classical factor analysis to identify two different types of data errors makes it ideally suited for scanning large data sets. Since the results yielded by factor analysis indicate correlations between parameters, one must know something about the nature of the data set and the analytical techniques used to obtain it to confidently isolate errors. (author)

  12. Discriminant analysis of plasma fusion data

    International Nuclear Information System (INIS)

    Kardaun, O.J.W.F.; Kardaun, J.W.P.F.; Itoh, S.; Itoh, K.

    1992-06-01

    Several discriminant analysis methods have been applied and compared to predict the type of ELMs in H-mode discharges: (a) quadratic discriminant analysis (linear discriminant analysis being a special case), (b) discrimination by non-parametric (kernel) density estimates, and (c) discrimination by a product multinomial model on a discretised scale. Practical evaluation was performed using SAS in the first two cases and INDEP, a standard FORTRAN program initially developed for medical applications, in the last case. We give here a flavour of the approach and its results. In summary, discriminant analysis can be used as a useful descriptive method for specifying regions where particular types of plasma discharges can be produced. Parametric methods have the advantage of a rather compact mathematical formulation. Pertinent graphical representations are useful to make the theory and the results more palatable to experimental physicists. (J.P.N.)
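
    Both parametric methods in (a) are available off the shelf. A sketch on simulated two-parameter discharge data follows; the features, labels and class structure are invented for illustration, not the actual H-mode database.

    ```python
    import numpy as np
    from sklearn.discriminant_analysis import (LinearDiscriminantAnalysis,
                                               QuadraticDiscriminantAnalysis)

    # Hypothetical stand-in for the H-mode data: two plasma parameters
    # (say, normalised density and heating power) and an ELM-type label.
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal([1.0, 2.0], 0.4, (60, 2)),
                   rng.normal([2.2, 1.2], 0.6, (60, 2))])
    y = np.repeat([0, 1], 60)                   # 0/1 = two ELM types

    for model in (LinearDiscriminantAnalysis(), QuadraticDiscriminantAnalysis()):
        model.fit(X, y)
        print(type(model).__name__, "training accuracy:",
              round(model.score(X, y), 3))

    # Class-membership probabilities delimit the operating regions where each
    # discharge type is expected, the descriptive use mentioned in the abstract.
    print(model.predict_proba([[1.5, 1.6]]))
    ```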

  13. Product Context Analysis with Twitter Data

    OpenAIRE

    Sun, Tao

    2016-01-01

    Context. For the product manager, product context analysis, which aims to align products with market needs, is very important. By understanding market needs, the product manager learns the product context information about the environment in which the products are conceived and the business in which they take place. Product context analysis using this information helps the product manager find the accurate position of his/her products and supports the decision-making of the...

  14. Visual Analysis of Air Traffic Data

    Science.gov (United States)

    Albrecht, George Hans; Pang, Alex

    2012-01-01

    In this paper, we present visual analysis tools to help study the impact of policy changes on air traffic congestion. The tools support visualization of time-varying air traffic density over an area of interest using different time granularity. We use this visual analysis platform to investigate how changing the aircraft separation volume can reduce congestion while maintaining key safety requirements. The same platform can also be used as a decision aid for processing requests for unmanned aerial vehicle operations.

  15. Data cache organization for accurate timing analysis

    DEFF Research Database (Denmark)

    Schoeberl, Martin; Huber, Benedikt; Puffitsch, Wolfgang

    2013-01-01

    it is important to classify memory accesses as either cache hit or cache miss. The addresses of instruction fetches are known statically and static cache hit/miss classification is possible for the instruction cache. The access to data that is cached in the data cache is harder to predict statically. Several...

  16. Guiding Questions for Data Analysis, by Reports

    Science.gov (United States)

    Wake County Public School System, 2015

    2015-01-01

    This document, which is provided by the Data and Accountability Department staff at Wake County Public School System (WCPSS), is to be used as a resource to help guide the review of student data. This document provides examples of questions to consider when reviewing frequently accessed reports located in Case21, Quickr, EVAAS®, mClass®, or…

  17. Quantitative Data Analysis--In the Graduate Curriculum

    Science.gov (United States)

    Albers, Michael J.

    2017-01-01

    A quantitative research study collects numerical data that must be analyzed to help draw the study's conclusions. Teaching quantitative data analysis is not teaching number crunching, but teaching a way of critical thinking for how to analyze the data. The goal of data analysis is to reveal the underlying patterns, trends, and relationships of a…

  18. Exploring charge density analysis in crystals at high pressure: data collection, data analysis and advanced modelling.

    Science.gov (United States)

    Casati, Nicola; Genoni, Alessandro; Meyer, Benjamin; Krawczuk, Anna; Macchi, Piero

    2017-08-01

    The possibility to determine electron-density distribution in crystals has been an enormous breakthrough, stimulated by a favourable combination of equipment for X-ray and neutron diffraction at low temperature, by the development of simplified, though accurate, electron-density models refined from the experimental data and by the progress in charge density analysis often in combination with theoretical work. Many years after the first successful charge density determination and analysis, scientists face new challenges, for example: (i) determination of the finer details of the electron-density distribution in the atomic cores, (ii) simultaneous refinement of electron charge and spin density or (iii) measuring crystals under perturbation. In this context, the possibility of obtaining experimental charge density at high pressure has recently been demonstrated [Casati et al. (2016). Nat. Commun. 7, 10901]. This paper reports on the necessities and pitfalls of this new challenge, focusing on the species syn-1,6:8,13-biscarbonyl[14]annulene. The experimental requirements, the expected data quality and data corrections are discussed in detail, including warnings about possible shortcomings. At the same time, new modelling techniques are proposed, which could enable specific information to be extracted, from the limited and less accurate observations, like the degree of localization of double bonds, which is fundamental to the scientific case under examination.

  19. Balboa: A Framework for Event-Based Process Data Analysis

    National Research Council Canada - National Science Library

    Cook, Jonathan E; Wolf, Alexander L

    1998-01-01

    .... We have built Balboa as a bridge between the data collection and the analysis tools, facilitating the gathering and management of event data, and simplifying the construction of tools to analyze the data...

  20. Global Surface Warming Hiatus Analysis Data

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — These data were used to conduct the study of the global surface warming hiatus, an apparent decrease in the upward trend of global surface temperatures since 1998....

  1. MARVEL revisited: experiment design and data analysis

    International Nuclear Information System (INIS)

    Thomsen, J.M.

    1978-01-01

    As part of the research performed for the Defense Nuclear Agency program defining the in-trench environment for the MX missile system, the nuclear data base was reviewed to determine if any of it was applicable to the problem of air-shock attenuations in long, shallowly-buried tubes. Because of its unique shock-tube geometry, the LLL MARVEL event, fired in 1967, was judged one of the few relevant nuclear events. The MARVEL event is described, including emplacement geometry and emplacement of instrumentation to measure shock time-of-arrival (TOA) in the shock tube. TOA data from the event, including uncertainties, are discussed and analyzed. Finally, using shock-tube theory, the flow parameters D (shock-tube velocity) and P (peak pressure in the air behind the shock front) are derived from a fit to the MARVEL TOA data. The data derived from those parameters are consistent

  2. Big Data Risk Analysis for Rail Safety?

    OpenAIRE

    Van Gulijk, Coen; Hughes, Peter; Figueres-Esteban, Miguel; Dacre, Marcus; Harrison, Chris; HUD; RSSB

    2015-01-01

    Computer scientists believe that the enormous amounts of data on the internet will unchain a management revolution of uncanny proportions. Yet, to date, the potential benefit of this revolution has scarcely been investigated for safety and risk management. This paper gives a brief overview of a research programme that investigates how the new internet-driven data revolution could benefit safety and risk management for railways in the UK. The paper gives a brief overview of the current activities...

  3. Network data analysis server (NDAS) prototype development

    International Nuclear Information System (INIS)

    Marka, Szabolcs; Mours, Benoit; Williams, Roy

    2002-01-01

    We have developed a simple and robust system based on standard UNIX tools and frame library code to transfer and merge data from multiple gravitational wave detectors distributed worldwide. The transfer and merging take place with less than a 20-minute delay and the output frames are available to all participants. Presently VIRGO and LIGO participate in the exchange and only environmental data are shared. The system is modular to allow future improvements and the use of new tools like Grid

  4. Maximum entropy analysis of liquid diffraction data

    International Nuclear Information System (INIS)

    Root, J.H.; Egelstaff, P.A.; Nickel, B.G.

    1986-01-01

    A maximum entropy method for reducing truncation effects in the inverse Fourier transform of the structure factor, S(q), to the pair correlation function, g(r), is described. The advantages and limitations of the method are explored with the PY hard sphere structure factor as model input data. An example using real data on liquid chlorine is then presented. It is seen that spurious structure is greatly reduced in comparison to traditional Fourier transform methods. (author)
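
    For context, the quantity being reconstructed is the standard Fourier inversion g(r) - 1 = (1 / (2 pi^2 rho r)) * Int_0^qmax q [S(q) - 1] sin(qr) dq, and truncating the integral at the experimental q_max is what produces the spurious ripples. A sketch of the naive inversion with an invented S(q) and density, not the chlorine data:

    ```python
    import numpy as np

    rho = 0.02                         # illustrative number density (atoms per A^3)
    q = np.linspace(1e-3, 10.0, 2000)  # data truncated at q_max = 10 (1/A)
    S = 1 + np.exp(-0.5 * (q - 2.0) ** 2) * np.cos(3.0 * (q - 2.0))  # toy S(q)

    r = np.linspace(0.5, 15.0, 500)
    integrand = q * (S - 1) * np.sin(np.outer(r, q))     # shape (n_r, n_q)
    dq = q[1] - q[0]
    g = 1 + integrand.sum(axis=1) * dq / (2 * np.pi ** 2 * rho * r)

    # The slowly decaying oscillations in g(r) at large r come from the sharp
    # q_max cutoff; these are the truncation artefacts that the maximum
    # entropy reconstruction suppresses.
    print("ripple amplitude at large r:", float(np.abs(g[-100:] - 1).max()))
    ```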

  5. Abnormal traffic flow data detection based on wavelet analysis

    Directory of Open Access Journals (Sweden)

    Xiao Qian

    2016-01-01

    Full Text Available In view of the non-stationary nature of traffic flow data, abnormal data detection is difficult. This paper proposes abnormal traffic flow data detection based on wavelet analysis and the least squares method. First, wavelet analysis is used to separate the traffic flow data into high-frequency and low-frequency components; then, the least squares method is combined to find abnormal points in the reconstructed signal data. The simulation results show that using wavelet analysis for abnormal traffic flow data detection effectively reduces the misjudgment rate and the false negative rate of the detection results.
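
    A rough illustration of the wavelet half of the pipeline, assuming the PyWavelets package. For brevity it replaces the least-squares step with a robust residual threshold, so it is a variant of, not a reproduction of, the paper's method, and the traffic series is synthetic.

    ```python
    import numpy as np
    import pywt

    def wavelet_outliers(x, wavelet="db4", level=3, k=3.0):
        """Flag abnormal points in a traffic-flow series: wavelet-decompose,
        keep the low-frequency part as the trend, and mark points whose
        residual exceeds k robust standard deviations."""
        coeffs = pywt.wavedec(x, wavelet, level=level)
        # Zero the detail (high-frequency) coefficients to get a smooth trend.
        trend = pywt.waverec([coeffs[0]] + [np.zeros_like(c) for c in coeffs[1:]],
                             wavelet)[: len(x)]
        resid = x - trend
        mad = np.median(np.abs(resid - np.median(resid)))
        sigma = 1.4826 * mad                    # robust sigma estimate
        return np.where(np.abs(resid) > k * sigma)[0]

    # Synthetic daily flow (5-minute counts) with two injected anomalies.
    t = np.arange(288)
    flow = (200 + 120 * np.sin(2 * np.pi * t / 288)
            + np.random.default_rng(3).normal(0, 8, 288))
    flow[[60, 200]] += [150, -180]
    print("flagged indices:", wavelet_outliers(flow))
    ```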

  6. Regression analysis of sparse asynchronous longitudinal data.

    Science.gov (United States)

    Cao, Hongyuan; Zeng, Donglin; Fine, Jason P

    2015-09-01

    We consider estimation of regression models for sparse asynchronous longitudinal observations, where time-dependent responses and covariates are observed intermittently within subjects. Unlike with synchronous data, where the response and covariates are observed at the same time points, with asynchronous data the observation times are mismatched. Simple kernel-weighted estimating equations are proposed for generalized linear models with either time-invariant or time-dependent coefficients, under smoothness assumptions for the covariate processes similar to those for synchronous data. In both cases the estimators are consistent and asymptotically normal, but converge at slower rates than those achieved with synchronous data. Simulation studies show that the methods perform well with realistic sample sizes and may be superior to a naive application of methods for synchronous data based on an ad hoc last-value-carried-forward approach. The practical utility of the methods is illustrated on data from a study on human immunodeficiency virus.
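
    The kernel-weighting idea can be illustrated with a small sketch: every response observation is paired with every covariate observation of the same subject, weighted by a kernel in the time mismatch. This is a simplified single-subject, identity-link, time-invariant-coefficient version on synthetic data, not the paper's estimator.

    ```python
    import numpy as np

    def async_kernel_fit(t_y, y, t_x, x, h):
        """Solve sum_ij K_h(t_i - s_j) X_j (y_i - X_j' beta) = 0 for beta,
        a kernel-weighted least-squares fit for mismatched times."""
        w = np.exp(-0.5 * ((t_y[:, None] - t_x[None, :]) / h) ** 2)
        X = np.column_stack([np.ones_like(x), x])   # intercept + covariate
        A = X.T @ (w.sum(axis=0)[:, None] * X)      # sum_ij w_ij X_j X_j'
        b = X.T @ (w.T @ y)                         # sum_ij w_ij X_j y_i
        return np.linalg.solve(A, b)                # (beta0, beta1)

    rng = np.random.default_rng(0)
    t_x = np.sort(rng.uniform(0, 1, 40)); x = np.sin(2 * np.pi * t_x)
    t_y = np.sort(rng.uniform(0, 1, 30))
    y = 1.0 + 2.0 * np.sin(2 * np.pi * t_y) + 0.1 * rng.standard_normal(30)
    print(async_kernel_fit(t_y, y, t_x, x, h=0.05))  # roughly (1, 2)
    ```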

  7. Analysis of Visual Interpretation of Satellite Data

    Science.gov (United States)

    Svatonova, H.

    2016-06-01

    Millions of people of all ages and expertise are using satellite and aerial data as an important input for their work in many different fields. Satellite data are also gradually finding a new place in education, especially in the fields of geography and in environmental issues. The article presents the results of an extensive research in the area of visual interpretation of image data carried out in the years 2013 - 2015 in the Czech Republic. The research was aimed at comparing the success rate of the interpretation of satellite data in relation to a) the substrates (to the selected colourfulness, the type of depicted landscape or special elements in the landscape) and b) to selected characteristics of users (expertise, gender, age). The results of the research showed that (1) false colour images have a slightly higher percentage of successful interpretation than natural colour images, (2) colourfulness of an element expected or rehearsed by the user (regardless of the real natural colour) increases the success rate of identifying the element, (3) experts are faster in interpreting visual data than non-experts, with the same degree of accuracy of solving the task, and (4) men and women are equally successful in the interpretation of visual image data.

  8. ANALYSIS OF VISUAL INTERPRETATION OF SATELLITE DATA

    Directory of Open Access Journals (Sweden)

    H. Svatonova

    2016-06-01

    Full Text Available Millions of people of all ages and expertise are using satellite and aerial data as an important input for their work in many different fields. Satellite data are also gradually finding a new place in education, especially in the fields of geography and in environmental issues. The article presents the results of an extensive research in the area of visual interpretation of image data carried out in the years 2013 - 2015 in the Czech Republic. The research was aimed at comparing the success rate of the interpretation of satellite data in relation to a) the substrates (to the selected colourfulness, the type of depicted landscape or special elements in the landscape) and b) to selected characteristics of users (expertise, gender, age). The results of the research showed that (1) false colour images have a slightly higher percentage of successful interpretation than natural colour images, (2) colourfulness of an element expected or rehearsed by the user (regardless of the real natural colour) increases the success rate of identifying the element, (3) experts are faster in interpreting visual data than non-experts, with the same degree of accuracy of solving the task, and (4) men and women are equally successful in the interpretation of visual image data.

  9. Enhancing yeast transcription analysis through integration of heterogeneous data

    DEFF Research Database (Denmark)

    Grotkjær, Thomas; Nielsen, Jens

    2004-01-01

    DNA microarray technology enables the simultaneous measurement of the transcript level of thousands of genes. Primary analysis can be done with basic statistical tools and cluster analysis, but effective and in-depth analysis of the vast amount of transcription data requires integration with data from several heterogeneous data sources, such as upstream promoter sequences, genome-scale metabolic models, annotation databases and other experimental data. In this review, we discuss how experimental design, normalisation, heterogeneous data and mathematical modelling can enhance analysis of Saccharomyces cerevisiae whole genome transcription data. A special focus is on the quantitative aspects of normalisation and mathematical modelling approaches, since they are expected to play an increasing role in future DNA microarray analysis studies. Data analysis is exemplified with cluster analysis...

  10. Privacy protected text analysis in DataSHIELD

    Directory of Open Access Journals (Sweden)

    Rebecca Wilson

    2017-04-01

    Whilst it is possible to analyse free text within a DataSHIELD infrastructure, the challenge is creating generalised and resilient anti-disclosure methods for free text analysis. There is a range of biomedical and health sciences applications for DataSHIELD methods of privacy-protected analysis of free text, including analysis of electronic health records and of qualitative data, e.g. from social media.

  11. Microsatellite data analysis for population genetics

    Science.gov (United States)

    Theories and analytical tools of population genetics have been widely applied for addressing various questions in the fields of ecological genetics, conservation biology, and any context where the role of dispersal or gene flow is important. Underlying much of population genetics is the analysis of ...

  12. Techniques and Applications of Urban Data Analysis

    KAUST Repository

    AlHalawani, Sawsan

    2016-01-01

    Digitization and characterization of urban spaces are essential components as we move to an ever-growing ’always connected’ world. Accurate analysis of such digital urban spaces has become more important as we continue to get spatial and social

  13. The Analysis of Polyploid Genetic Data

    NARCIS (Netherlands)

    Meirmans, P.G.; Liu, S.; van Tienderen, P.H.

    2018-01-01

    Though polyploidy is an important aspect of the evolutionary genetics of both plants and animals, the development of population genetic theory of polyploids has seriously lagged behind that of diploids. This is unfortunate since the analysis of polyploid genetic data—and the interpretation of the

  14. Shuttle Topography Data Inform Solar Power Analysis

    Science.gov (United States)

    2013-01-01

    The next time you flip on a light switch, there's a chance that you could be benefitting from data originally acquired during the Space Shuttle Program. An effort spearheaded by the Jet Propulsion Laboratory (JPL) and the National Geospatial-Intelligence Agency (NGA) in 2000 put together the first near-global elevation map of the Earth ever assembled, which has found use in everything from 3D terrain maps to models that inform solar power production. For the project, called the Shuttle Radar Topography Mission (SRTM), engineers at JPL designed a 60-meter mast that was fitted onto Shuttle Endeavour. Once deployed in space, an antenna attached to the end of the mast worked in combination with another antenna on the shuttle to simultaneously collect data from two perspectives. Just as having two eyes makes depth perception possible, the SRTM data sets could be combined to form an accurate picture of the Earth's surface elevations, the first high-detail, near-global elevation map ever assembled. What made SRTM unique was not just its surface mapping capabilities but the completeness of the data it acquired. Over the course of 11 days, the shuttle orbited the Earth nearly 180 times, covering everything between the 60° north and 54° south latitudes, or roughly 80 percent of the world's total landmass. Of that targeted land area, 95 percent was mapped at least twice, and 24 percent was mapped at least four times. Following several years of processing, NASA released the data to the public in partnership with NGA. Robert Crippen, a member of the SRTM science team, says that the data have proven useful in a variety of fields. "Satellites have produced vast amounts of remote sensing data, which over the years have been mostly two-dimensional. But the Earth's surface is three-dimensional. Detailed topographic data give us the means to visualize and analyze remote sensing data in their natural three-dimensional structure, facilitating a greater understanding of the features

  15. Integration of video and radiation analysis data

    International Nuclear Information System (INIS)

    Menlove, H.O.; Howell, J.A.; Rodriguez, C.A.; Eccleston, G.W.; Beddingfield, D.; Smith, J.E.; Baumgart, C.W.

    1995-01-01

    For the past several years, the integration of containment and surveillance (C/S) with nondestructive assay (NDA) sensors for monitoring the movement of nuclear material has focused on the hardware and communications protocols in the transmission network. Little progress has been made in methods to utilize the combined C/S and NDA data for safeguards and to reduce the inspector time spent in nuclear facilities. One of the fundamental problems in integrating the combined data is that the two methods operate in different dimensions: the C/S video data are spatial in nature, whereas the NDA sensors provide radiation levels as a function of time. The authors have introduced a new method to integrate spatial (digital video) information with time (radiation monitoring) information. This technology, based on pattern recognition by neural networks, provides significant capability to analyze complex data and has the ability to learn and adapt to changing situations. This technique has the potential of significantly reducing the frequency of inspection visits to key facilities without a loss of safeguards effectiveness

  16. Information, Privacy and Stability in Adaptive Data Analysis

    OpenAIRE

    Smith, Adam

    2017-01-01

    Traditional statistical theory assumes that the analysis to be performed on a given data set is selected independently of the data themselves. This assumption breaks downs when data are re-used across analyses and the analysis to be performed at a given stage depends on the results of earlier stages. Such dependency can arise when the same data are used by several scientific studies, or when a single analysis consists of multiple stages. How can we draw statistically valid conclusions when da...

  17. Refueling outage data collection and analysis

    International Nuclear Information System (INIS)

    Harshaw, K.; Quilliam, J.; Brinsfield, W.; Jeffries, J.

    1993-07-01

    This report summarizes the results of an EPRI project to compile an industry generic refueling outage database applicable to alternate (non-full-power) modes of shutdown conditions at nuclear power plants. The project team evaluated five outages at two BWR plants. They obtained data primarily from control room logs, outage schedules, incident reports, and licensee event reports. The team organized the data by outage segment and time line. Due to its small sample size, this study produced no conclusive results related to initiating event frequencies, equipment failure rates, or human reliability estimates during shutdown conditions. However, it pointed out the problems of brief or inconsistent recordkeeping. Records that are too brief make it difficult to determine whether the root cause of an event was mechanical or the result of human performance. Retrieval of data can be difficult and labor-intensive. There is a clear need for better, more comprehensive documentation

  18. Analysis of the Westland Data Set

    Science.gov (United States)

    Wen, Fang; Willett, Peter; Deb, Somnath

    2001-01-01

    The "Westland" set of empirical accelerometer helicopter data with seeded and labeled faults is analyzed with the aim of condition monitoring. The autoregressive (AR) coefficients from a simple linear model encapsulate a great deal of information in a relatively few measurements; and it has also been found that augmentation of these by harmonic and other parameters call improve classification significantly. Several techniques have been explored, among these restricted Coulomb energy (RCE) networks, learning vector quantization (LVQ), Gaussian mixture classifiers and decision trees. A problem with these approaches, and in common with many classification paradigms, is that augmentation of the feature dimension can degrade classification ability. Thus, we also introduce the Bayesian data reduction algorithm (BDRA), which imposes a Dirichlet prior oil training data and is thus able to quantify probability of error in all exact manner, such that features may be discarded or coarsened appropriately.

  19. Skyshine analysis using various nuclear data files

    Energy Technology Data Exchange (ETDEWEB)

    Zharkov, V.P.; Dikareva, O.F.; Kartashev, I.A.; Kiselev, A.N. [Research and Development Inst. of Power Engineering, Moscow (Russian Federation); Nomura, Y.; Tsubosaka, A. [Japan Atomic Energy Research Inst., Tokai, Ibaraki (Japan)

    2000-03-01

    The calculations of the spatial distributions of dose rate for neutrons and secondary photons, thermal neutron fluxes, and space-energy distributions of neutrons and photons near the air-ground interface were performed with the MCNP and DORT codes. Different nuclear data files were used (ENDF/B-IV, ENDF/B-VI, FENDL-2, JENDL-3.2). Either the standard pointwise libraries (MCNP) or special libraries prepared by the NJOY code from ENDF/B and other files were used. The multigroup coupled neutron and photon cross-section libraries prepared for the DORT code had the CASK-40 group energy structure. The libraries contain pointwise or multigroup cross-section data for all elements included in the atmosphere and ground composition. Validation of the calculated results was performed using the experimental data obtained for the series of measurements at the RA reactor. (author)

  20. Tidal analysis of Met rocket wind data

    Science.gov (United States)

    Bedinger, J. F.; Constantinides, E.

    1976-01-01

    A method of analyzing Met Rocket wind data is described. Modern tidal theory and specialized analytical techniques were used to resolve specific tidal modes and prevailing components in observed wind data. A representation of the wind which is continuous in both space and time was formulated. Such a representation allows direct comparison with theory, allows the derivation of other quantities such as temperature and pressure which in turn may be compared with observed values, and allows the formation of a wind model which extends over a broader range of space and time. Significant diurnal tidal modes with wavelengths of 10 and 7 km were present in the data and were resolved by the analytical technique.
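
    Resolving a prevailing component and a diurnal tidal mode reduces to harmonic regression; a minimal numpy sketch with synthetic wind values follows (amplitudes, phase, and sampling are invented for illustration).

    ```python
    import numpy as np

    t = np.linspace(0, 48, 97)        # hours, two days of observations
    wind = 10 + 5 * np.cos(2 * np.pi * t / 24 - 1.0) + np.random.randn(t.size)

    w = 2 * np.pi / 24.0              # diurnal frequency
    A = np.column_stack([np.ones_like(t), np.cos(w * t), np.sin(w * t)])
    mean_wind, a, b = np.linalg.lstsq(A, wind, rcond=None)[0]
    amplitude, phase = np.hypot(a, b), np.arctan2(b, a)
    print(mean_wind, amplitude, phase)   # prevailing wind + diurnal mode
    ```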

  1. International data collection and analysis. Task 1

    Energy Technology Data Exchange (ETDEWEB)

    Fox, J.B.; Stobbs, J.J.; Collier, D.M.; Hobbs, J.S.

    1979-04-01

    Commercial nuclear power has grown to the point where 13 nations now operate commercial nuclear power plants. Another four countries should join this list before the end of 1980. In the Nonproliferation Alternative Systems Assessment Program (NASAP), the US DOE is evaluating a series of alternate possible power systems. The objective is to determine practical nuclear systems which could reduce proliferation risk while still maintaining the benefits of nuclear power. Part of that effort is the development of a data base denoting the energy needs, resources, technical capabilities, commitment to nuclear power, and projected future trends for various non-US countries. The data are presented by country for each of 28 non-US countries. Data are compiled in this volume on Canada, Egypt, Federal Republic of Germany, Finland, and France.

  2. International data collection and analysis. Task 1

    Energy Technology Data Exchange (ETDEWEB)

    1979-04-01

    Commercial nuclear power has grown to the point where 13 nations now operate commercial nuclear power plants. Another four countries should join this list before the end of 1980. In the Nonproliferation Alternative Systems Assessment Program (NASAP), the US DOE is evaluating a series of alternate possible power systems. The objective is to determine practical nuclear systems which could reduce proliferation risk while still maintaining the benefits of nuclear power. Part of that effort is the development of a data base denoting the energy needs, resources, technical capabilities, commitment to nuclear power, and projected future trends for various non-US countries. The data are presented by country for each of 28 non-US countries. This volume contains compiled data on Mexico, Netherlands, Pakistan, Philippines, South Africa, South Korea, and Spain.

  3. Skyshine analysis using various nuclear data files

    International Nuclear Information System (INIS)

    Zharkov, V.P.; Dikareva, O.F.; Kartashev, I.A.; Kiselev, A.N.; Nomura, Y.; Tsubosaka, A.

    2000-01-01

    The calculations of the spatial distributions of dose rate for neutrons and secondary photons, thermal neutron fluxes, and space-energy distributions of neutrons and photons near the air-ground interface were performed with the MCNP and DORT codes. Different nuclear data files were used (ENDF/B-IV, ENDF/B-VI, FENDL-2, JENDL-3.2). Either the standard pointwise libraries (MCNP) or special libraries prepared by the NJOY code from ENDF/B and other files were used. The multigroup coupled neutron and photon cross-section libraries prepared for the DORT code had the CASK-40 group energy structure. The libraries contain pointwise or multigroup cross-section data for all elements included in the atmosphere and ground composition. Validation of the calculated results was performed using the experimental data obtained for the series of measurements at the RA reactor. (author)

  4. Covariate analysis of bivariate survival data

    Energy Technology Data Exchange (ETDEWEB)

    Bennett, L.E.

    1992-01-01

    The methods developed are used to analyze the effects of covariates on bivariate survival data when censoring and ties are present. The proposed method provides models for bivariate survival data that include differential covariate effects and censored observations. The proposed models are based on an extension of the univariate Buckley-James estimators which replace censored data points by their expected values, conditional on the censoring time and the covariates. For the bivariate situation, it is necessary to determine the expectation of the failure times for one component conditional on the failure or censoring time of the other component. Two different methods have been developed to estimate these expectations. In the semiparametric approach these expectations are determined from a modification of Burke's estimate of the bivariate empirical survival function. In the parametric approach censored data points are also replaced by their conditional expected values where the expected values are determined from a specified parametric distribution. The model estimation will be based on the revised data set, comprised of uncensored components and expected values for the censored components. The variance-covariance matrix for the estimated covariate parameters has also been derived for both the semiparametric and parametric methods. Data from the Demographic and Health Survey was analyzed by these methods. The two outcome variables are post-partum amenorrhea and breastfeeding; education and parity were used as the covariates. Both the covariate parameter estimates and the variance-covariance estimates for the semiparametric and parametric models will be compared. In addition, a multivariate test statistic was used in the semiparametric model to examine contrasts. The significance of the statistic was determined from a bootstrap distribution of the test statistic.
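
    The core conditional-expectation step can be sketched in its univariate form: a censored residual is replaced by E[e | e > c] computed from a Kaplan-Meier estimate of the residual distribution. The function below is a simplified illustration of that Buckley-James-type step, not the bivariate semiparametric estimator itself; the fallback for a largest censored residual is an ad hoc choice.

    ```python
    import numpy as np

    def km_conditional_mean(resid, censored, c):
        """E[e | e > c] from a Kaplan-Meier fit of the residuals.
        censored[i] is True when resid[i] is a censoring time."""
        order = np.argsort(resid)
        e, d = resid[order], ~censored[order]       # d: event indicator
        n = len(e)
        at_risk = n - np.arange(n)
        surv = np.cumprod(1.0 - d / at_risk)        # KM survival S(e_k)
        surv_prev = np.concatenate([[1.0], surv[:-1]])
        mass = np.where(d, surv_prev - surv, 0.0)   # KM jump at events
        keep = e > c
        total = mass[keep].sum()
        return (e[keep] * mass[keep]).sum() / total if total > 0 else e.max()

    rng = np.random.default_rng(1)
    resid = rng.normal(size=200)
    censored = rng.random(200) < 0.3
    print(km_conditional_mean(resid, censored, c=0.5))
    ```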

  5. Analysis of integrated video and radiation data

    International Nuclear Information System (INIS)

    Howell, J.A.; Menlove, H.O.; Rodriguez, C.A.; Beddingfield, D.; Vasil, A.

    1995-01-01

    We have developed prototype software for a facility-monitoring application that will detect anomalous activity in a nuclear facility. The software, which forms the basis of a simple model, automatically reviews and analyzes integrated safeguards data from continuous unattended monitoring systems. This technology, based on pattern recognition by neural networks, provides significant capability to analyze complex data and has the ability to learn and adapt to changing situations. It is well suited for large automated facilities, reactors, spent-fuel storage facilities, reprocessing plants, and nuclear material storage vaults

  6. Meteorological Data Analysis Using MapReduce

    Directory of Open Access Journals (Sweden)

    Wei Fang

    2014-01-01

    Full Text Available In atmospheric science, the scale of meteorological data is massive and growing rapidly. K-means is a fast and readily available clustering algorithm that has been used in many fields. However, for large-scale meteorological data, the traditional K-means algorithm cannot satisfy actual application needs efficiently. This paper proposes an improved K-means algorithm (MK-means) based on MapReduce, designed around the characteristics of large meteorological datasets. The experimental results show that MK-means offers greater computing capability and scalability.
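
    The MapReduce formulation pairs naturally with this description: the map step emits (nearest centroid, point) pairs and the reduce step averages each group into a new centroid. A single-process Python sketch of that structure on synthetic data follows; it illustrates the idea rather than the paper's MK-means implementation.

    ```python
    import numpy as np
    from collections import defaultdict

    def kmeans_mapreduce(points, centroids, iters=10):
        """MapReduce-style K-means in one process: map assigns points
        to their nearest centroid, reduce averages each group."""
        for _ in range(iters):
            groups = defaultdict(list)
            for p in points:                      # map step
                key = int(np.argmin(((centroids - p) ** 2).sum(axis=1)))
                groups[key].append(p)
            for key, members in groups.items():   # reduce step
                centroids[key] = np.mean(members, axis=0)
        return centroids

    pts = np.vstack([np.random.randn(100, 2), np.random.randn(100, 2) + 5])
    init = pts[np.random.choice(len(pts), 2, replace=False)]
    print(kmeans_mapreduce(pts, init))
    ```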

  7. An analysis of environmental data transmission

    Science.gov (United States)

    Yuan, Lina; Chen, Huajun; Gong, Jing

    2017-05-01

    Comprehensively constructing automatic environmental monitoring has become an urgent need of environmental management; it is a major measure for implementing the scientific outlook on development and building a harmonious socialist society, and an inevitable choice for “building a resource-conserving and environment-friendly society”, of great importance and profound significance for adjusting the economic structure and transforming the growth pattern. This article first introduces the importance of environmental data transmission, then expounds the characteristics, key technologies, transmission modes, and design ideas of the environmental data transmission process, and finally summarizes the full text.

  8. Analysis of RAE-B attitude data

    Science.gov (United States)

    Hedland, D. A.; Degonia, P. K.

    1975-01-01

    Attempts made to obtain a description of the in-orbit dynamic behavior of the RAE-B spacecraft and account for the discrepancies between predicted and actual in-orbit performance are reported. In particular, attitude dynamics during the final despin operations in lunar orbit, throughout all deployment operations, and into the final steady state mission mode were investigated. Attempts made to match computer simulation results to the observed equilibrium data are discussed. Due to a damaged antenna boom and the unavailability of sufficient attitude and dynamics data, most of the objectives were not realized.

  9. Generalised recurrence plot analysis for spatial data

    International Nuclear Information System (INIS)

    Marwan, Norbert; Kurths, Juergen; Saparin, Peter

    2007-01-01

    Recurrence plot based methods are highly efficient and widely accepted tools for the investigation of time series or one-dimensional data. We present an extension of the recurrence plots and their quantifications in order to study recurrent structures in higher-dimensional spatial data. The capability of this extension is illustrated on prototypical 2D models. Next, the tested and proved approach is applied to assess the bone structure from CT images of human proximal tibia. We find that the spatial structures in trabecular bone become more recurrent during the bone loss in osteoporosis
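
    For a one-dimensional series, the recurrence plot that this work extends is simply a thresholded matrix of pairwise state distances, as sketched below; the threshold and test signal are illustrative, and the higher-dimensional spatial extension is not reproduced here.

    ```python
    import numpy as np

    def recurrence_plot(x, eps):
        """R[i, j] = 1 when states i and j are closer than eps."""
        d = np.abs(x[:, None] - x[None, :])   # pairwise state distances
        return (d < eps).astype(int)

    x = np.sin(np.linspace(0, 6 * np.pi, 300))
    R = recurrence_plot(x, eps=0.1)
    print(R.mean())   # recurrence rate, a common quantification measure
    ```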

  10. Statistical analysis of dragline monitoring data

    Energy Technology Data Exchange (ETDEWEB)

    Mirabediny, H.; Baafi, E.Y. [University of Tehran, Tehran (Iran)

    1998-07-01

    Dragline monitoring systems are normally the best tool used to collect data on the machine performance and operational parameters of a dragline operation. This paper discusses results of a time study using data from a dragline monitoring system captured over a four-month period. Statistical summaries of the time study in terms of average values, standard deviation and frequency distributions showed that the mode of operation and the geological conditions have a significant influence on the dragline performance parameters. 6 refs., 14 figs., 3 tabs.

  11. [Preliminarily application of content analysis to qualitative nursing data].

    Science.gov (United States)

    Liang, Shu-Yuan; Chuang, Yeu-Hui; Wu, Shu-Fang

    2012-10-01

    Content analysis is a methodology for objectively and systematically studying the content of communication in various formats. Content analysis in nursing research and nursing education is called qualitative content analysis. Qualitative content analysis is frequently applied to nursing research, as it allows researchers to determine categories inductively and deductively. This article examines qualitative content analysis in nursing research from theoretical and practical perspectives. We first describe how content analysis concepts such as unit of analysis, meaning unit, code, category, and theme are used. Next, we describe the basic steps involved in using content analysis, including data preparation, data familiarization, analysis unit identification, creating tentative coding categories, category refinement, and establishing category integrity. Finally, this paper introduces the concept of content analysis rigor, including dependability, confirmability, credibility, and transferability. This article elucidates the content analysis method in order to help professionals conduct systematic research that generates data that are informative and useful in practical application.

  12. Diazo techniques for remote sensor data analysis

    Science.gov (United States)

    Mount, S.; Whitebay, L. E.

    1979-01-01

    Cost and time to extract land use maps, natural-resource surveys, and other data from aerial and satellite photographs are reduced by diazo processing. Process can be controlled to enhance features such as vegetation, land boundaries, and bodies of water.

  13. Deep learning for power system data analysis

    NARCIS (Netherlands)

    Mocanu, Elena; Nguyen, Phuong H.; Gibescu, Madeleine; Arghandeh, R.; Zhou, Y.

    2017-01-01

    Unprecedented high volumes of data are available in the smart grid context, facilitated by the growth of home energy management systems and advanced metering infrastructure. In order to automatically extract knowledge from, and take advantage of this useful information to improve grid operation,

  14. A Vehicle for Bivariate Data Analysis

    Science.gov (United States)

    Roscoe, Matt B.

    2016-01-01

    Instead of reserving the study of probability and statistics for special fourth-year high school courses, the Common Core State Standards for Mathematics (CCSSM) takes a "statistics for all" approach. The standards recommend that students in grades 6-8 learn to summarize and describe data distributions, understand probability, draw…

  15. Data Analysis and Next Generation Assessments

    Science.gov (United States)

    Pon, Kathy

    2013-01-01

    For the last decade, much of the work of California school administrators has been shaped by the accountability of the No Child Left Behind Act. Now as they stand at the precipice of Common Core Standards and next generation assessments, it is important to reflect on the proficiency educators have attained in using data to improve instruction and…

  16. ggplot2 elegant graphics for data analysis

    CERN Document Server

    Wickham, Hadley

    2016-01-01

    This new edition to the classic book by ggplot2 creator Hadley Wickham highlights compatibility with knitr and RStudio. ggplot2 is a data visualization package for R that helps users create data graphics, including those that are multi-layered, with ease. With ggplot2, it's easy to:
    • produce handsome, publication-quality plots with automatic legends created from the plot specification
    • superimpose multiple layers (points, lines, maps, tiles, box plots) from different data sources with automatically adjusted common scales
    • add customizable smoothers that use powerful modeling capabilities of R, such as loess, linear models, generalized additive models, and robust regression
    • save any ggplot2 plot (or part thereof) for later modification or reuse
    • create custom themes that capture in-house or journal style requirements and that can easily be applied to multiple plots
    • approach a graph from a visual perspective, thinking about how each component of the data is represented on the final plot
    This...

  17. Critical analysis of adsorption data statistically

    Science.gov (United States)

    Kaushal, Achla; Singh, S. K.

    2017-10-01

    Experimental data can be presented, computed, and critically analysed in different ways using statistics. A variety of statistical tests are used to make decisions about the significance and validity of experimental data. In the present study, adsorption was carried out to remove zinc ions from contaminated aqueous solution using mango leaf powder. The experimental data were analysed statistically by hypothesis testing, applying the t test, paired t test and chi-square test to (a) test the optimum value of the process pH, (b) verify the success of the experiment and (c) study the effect of adsorbent dose on zinc ion removal from aqueous solutions. Comparison of calculated and tabulated values of t and χ² showed the results to be in favour of the data collected from the experiment, and this has been shown on probability charts. The K value obtained for the Langmuir isotherm was 0.8582 and the m value for the Freundlich adsorption isotherm was 0.725, both for mango leaf powder.
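
    The tests named above are standard; a sketch using scipy.stats follows. All numbers are invented placeholders rather than the study's measurements.

    ```python
    import numpy as np
    from scipy import stats

    # Paired t test between duplicate adsorption runs
    run1 = np.array([78.2, 81.5, 84.9, 88.1, 90.3])   # % Zn removed, trial 1
    run2 = np.array([77.9, 82.0, 84.5, 88.4, 90.1])   # % Zn removed, trial 2
    t_stat, p_val = stats.ttest_rel(run1, run2)

    # Chi-square statistic between observed and model-predicted removal
    observed = np.array([78.0, 82.0, 85.0, 88.0, 90.0])
    expected = np.array([77.5, 81.8, 85.2, 88.3, 90.2])
    chi2 = ((observed - expected) ** 2 / expected).sum()
    print(t_stat, p_val, chi2)
    ```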

  18. An interactive parallel processor for data analysis

    International Nuclear Information System (INIS)

    Mong, J.; Logan, D.; Maples, C.; Rathbun, W.; Weaver, D.

    1984-01-01

    A parallel array of eight minicomputers has been assembled in an attempt to deal with kiloparameter data events. By exporting computer system functions to a separate processor, the authors have been able to achieve computer amplification linearly proportional to the number of executing processors

  19. Mobile Monitoring Data Processing and Analysis Strategies

    Science.gov (United States)

    The development of portable, high-time resolution instruments for measuring the concentrations of a variety of air pollutants has made it possible to collect data while in motion. This strategy, known as mobile monitoring, involves mounting air sensors on variety of different pla...

  20. Mobile Monitoring Data Processing & Analysis Strategies

    Science.gov (United States)

    The development of portable, high-time resolution instruments for measuring the concentrations of a variety of air pollutants has made it possible to collect data while in motion. This strategy, known as mobile monitoring, involves mounting air sensors on variety of different pla...

  1. Shrinkage Approach for Gene Expression Data Analysis

    Czech Academy of Sciences Publication Activity Database

    Haman, Jiří; Valenta, Zdeněk

    2013-01-01

    Roč. 9, č. 3 (2013), s. 2-8 ISSN 1801-5603 Grant - others:UK(CZ) SVV-2013-266517 Institutional support: RVO:67985807 Keywords : microarray technology * high dimensional data * mean squared error * James-Stein shrinkage estimator * mutual information Subject RIV: IN - Informatics, Computer Science http://www.ejbi.org/img/ejbi/2013/3/Haman_en.pdf

  2. Analysis of the real EADGENE data set:

    DEFF Research Database (Denmark)

    Sørensen, Peter; Bonnet, Agnès; Buitenhuis, Bart

    2007-01-01

    The aim of this paper was to describe, and when possible compare, the multivariate methods used by the participants in the EADGENE WP1.4 workshop. The first approach was for class discovery and class prediction using evidence from the data at hand. Several teams used hierarchical clustering (HC) ...

  3. Linux Incident Response Volatile Data Analysis Framework

    Science.gov (United States)

    McFadden, Matthew

    2013-01-01

    Cyber incident response is an emphasized subject area in cybersecurity and information technology, with an increased need for the protection of data. Due to ongoing threats, cybersecurity imposes many challenges and requires new investigative response techniques. In this study a Linux Incident Response Framework is designed for collecting volatile data…

  4. Shrinkage Approach for Gene Expression Data Analysis

    Czech Academy of Sciences Publication Activity Database

    Haman, Jiří; Valenta, Zdeněk; Kalina, Jan

    2013-01-01

    Roč. 1, č. 1 (2013), s. 65-65 ISSN 1805-8698. [EFMI 2013 Special Topic Conference. 17.04.2013-19.04.2013, Prague] Institutional support: RVO:67985807 Keywords : shrinkage estimation * covariance matrix * high dimensional data * gene expression Subject RIV: IN - Informatics, Computer Science

  5. Einstein Slew Survey: Data analysis innovations

    Science.gov (United States)

    Elvis, Martin S.; Plummer, David; Schachter, Jonathan F.; Fabbiano, G.

    1992-01-01

    Several new methods were needed in order to carry out the Einstein Slew X-ray Sky Survey. The innovations which enabled the Slew Survey to be done are summarized. These methods included an experimental approach to large projects, parallel processing on a LAN, percolation source detection, minimum-action identifications, and rapid dissemination of the whole database.

  6. Approaches to data analysis of multiple-choice questions

    OpenAIRE

    Lin Ding; Robert Beichner

    2009-01-01

    This paper introduces five commonly used approaches to analyzing multiple-choice test data. They are classical test theory, factor analysis, cluster analysis, item response theory, and model analysis. Brief descriptions of the goals and algorithms of these approaches are provided, together with examples illustrating their applications in physics education research. We minimize mathematics, instead placing emphasis on data interpretation using these approaches.

  7. Analysis of slifer data from underground explosions

    International Nuclear Information System (INIS)

    Heusinkveld, M.

    1979-01-01

    A formula describing the distance-time relationship of a shock wave moving outward from an underground explosion is derived. Calculated results are compared with those computed using the LASL and BOTE formulas and with slifer data obtained from field experiments. For many of the field events, the derived curve provides a better fit than do the LASL or BOTE formulas. Methods are presented for the determination of the detonation energy W under three conditions: (a) where time and distance are known accurately; (b) where there is an unknown offset in either time or distance; and (c) where there are unknown offsets in both time and distance. These methods are applied with moderate success to a set of (t,r) data supplied by Goldwire

  8. [The meta-analysis of data from individual patients].

    NARCIS (Netherlands)

    Rovers, M.M.; Reitsma, J.B.

    2012-01-01

    - An IPD (Individual Participant Data) meta-analysis requires collecting original individual patient data and calculating an estimated effect based on these data.
    - The use of individual patient data has various advantages: the original data and the results of published analyses are verified,

  9. μSR data analysis with XYfit

    International Nuclear Information System (INIS)

    Tucakov, Ivan; Brewer, Jess H.; Froese, Aaron; Lau, Angus

    2006-01-01

    The CERN MINUIT library has recently been ported to C++, facilitating the incorporation of that familiar package into a general purpose fitting application called XYfit with a graphical user interface built with Qt3. Custom theory functions can easily be added as 'plugins' to XYfit, which reads in DataTables in XML format generated by the μView spreadsheet utility (which see), thus allowing more flexible fitting of μSR spectra

  10. Financial Ratio Analysis using ARMS Data

    OpenAIRE

    Ahrendsen, Bruce L.; Katchova, Ani L.

    2012-01-01

    The purpose of this research is to evaluate the financial performance measures calculated and reported by the Economic Research Service (ERS) from ARMS data. The evaluation includes the calculation method and the underlying assumptions used in obtaining the reported values. The financial measures calculated and reported are compared with those recommended by the Farm Financial Standards Council (FFSC). The underlying assumptions are identified by analyzing the software code used in calculating th...

  11. AGC 2 Irradiation Creep Strain Data Analysis

    International Nuclear Information System (INIS)

    Windes, William E.; Rohrbaugh, David T.; Swank, W. David

    2016-01-01

    The Advanced Reactor Technologies Graphite Research and Development Program is conducting an extensive graphite irradiation experiment to provide data for licensing of a high temperature reactor (HTR) design. In past applications, graphite has been used effectively as a structural and moderator material in both research and commercial high temperature gas cooled reactor designs. Nuclear graphite H-451, used previously in the United States for nuclear reactor graphite components, is no longer available. New nuclear graphite grades have been developed and are considered suitable candidates for new HTR reactor designs. To support the design and licensing of HTR core components within a commercial reactor, a complete properties database must be developed for these current grades of graphite. Quantitative data on in service material performance are required for the physical, mechanical, and thermal properties of each graphite grade, with a specific emphasis on data accounting for the life limiting effects of irradiation creep on key physical properties of the HTR candidate graphite grades. Further details on the research and development activities and associated rationale required to qualify nuclear grade graphite for use within the HTR are documented in the graphite technology research and development plan.

  12. Analysis of scaled-factorial-moment data

    International Nuclear Information System (INIS)

    Seibert, D.

    1990-01-01

    We discuss the two standard constructions used in the search for intermittency, the exclusive and inclusive scaled factorial moments. We propose the use of a new scaled factorial moment that reduces to the exclusive moment in the appropriate limit and is free of undesirable multiplicity correlations that are contained in the inclusive moment. We show that there are some similarities among most of the models that have been proposed to explain factorial-moment data, and that these similarities can be used to increase the efficiency of testing these models. We begin by calculating factorial moments from a simple independent-cluster model that assumes only approximate boost invariance of the cluster rapidity distribution and an approximate relation among the moments of the cluster multiplicity distribution. We find two scaling laws that are essentially model independent. The first scaling law relates the moments to each other with a simple formula, indicating that the different factorial moments are not independent. The second scaling law relates samples with different rapidity densities. We find evidence for much larger clusters in heavy-ion data than in light-ion data, indicating possible spatial intermittency in the heavy-ion events
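
    A sketch of computing scaled factorial moments from binned multiplicities follows. The definition used, F_q = <n(n-1)...(n-q+1)> / <n>^q averaged over events and bins, is one common inclusive form; the Poisson input is a null case that should give F_q near 1, i.e. no intermittency signal.

    ```python
    import numpy as np

    def scaled_factorial_moment(counts, q):
        """F_q = < n(n-1)...(n-q+1) > / < n >^q from an
        (events x bins) array of multiplicities."""
        n = counts.astype(float)
        fact = np.ones_like(n)
        for k in range(q):            # falling factorial n(n-1)...
            fact *= n - k
        return fact.mean() / n.mean() ** q

    events = np.random.poisson(4.0, size=(20000, 32))
    print([round(scaled_factorial_moment(events, q), 3) for q in (2, 3, 4)])
    ```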

  13. AGC 2 Irradiation Creep Strain Data Analysis

    Energy Technology Data Exchange (ETDEWEB)

    Windes, William E. [Idaho National Lab. (INL), Idaho Falls, ID (United States); Rohrbaugh, David T. [Idaho National Lab. (INL), Idaho Falls, ID (United States); Swank, W. David [Idaho National Lab. (INL), Idaho Falls, ID (United States)

    2016-08-01

    The Advanced Reactor Technologies Graphite Research and Development Program is conducting an extensive graphite irradiation experiment to provide data for licensing of a high temperature reactor (HTR) design. In past applications, graphite has been used effectively as a structural and moderator material in both research and commercial high temperature gas cooled reactor designs. Nuclear graphite H-451, used previously in the United States for nuclear reactor graphite components, is no longer available. New nuclear graphite grades have been developed and are considered suitable candidates for new HTR reactor designs. To support the design and licensing of HTR core components within a commercial reactor, a complete properties database must be developed for these current grades of graphite. Quantitative data on in service material performance are required for the physical, mechanical, and thermal properties of each graphite grade, with a specific emphasis on data accounting for the life limiting effects of irradiation creep on key physical properties of the HTR candidate graphite grades. Further details on the research and development activities and associated rationale required to qualify nuclear grade graphite for use within the HTR are documented in the graphite technology research and development plan.

  14. Uncertainty analysis with statistically correlated failure data

    International Nuclear Information System (INIS)

    Modarres, M.; Dezfuli, H.; Roush, M.L.

    1987-01-01

    Likelihood of occurrence of the top event of a fault tree or sequences of an event tree is estimated from the failure probability of components that constitute the events of the fault/event tree. Component failure probabilities are subject to statistical uncertainties. In addition, there are cases where the failure data are statistically correlated. At present most fault tree calculations are based on uncorrelated component failure data. This chapter describes a methodology for assessing the probability intervals for the top event failure probability of fault trees, or the frequency of occurrence of event tree sequences, when event failure data are statistically correlated. To estimate the mean and variance of the top event, a second-order system moment method is presented through Taylor series expansion, which provides an alternative to the normally used Monte Carlo method. For cases where component failure probabilities are statistically correlated, the Taylor expansion terms are treated properly. A moment-matching technique is used to obtain the probability distribution function of the top event by fitting the Johnson S_B distribution. The computer program, CORRELATE, was developed to perform the calculations necessary for the implementation of the method developed. (author)
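
    The moment-propagation idea can be sketched at first order: with correlated component failure probabilities, the variance of the top-event probability picks up covariance terms through the gradient. The two-component OR gate and the numbers below are illustrative only; the chapter's method carries the Taylor expansion to second order and fits a Johnson S_B distribution, which is not reproduced here.

    ```python
    import numpy as np

    # Top event of a two-component OR gate: f = 1 - (1 - p1)(1 - p2)
    p = np.array([1e-3, 2e-3])                 # mean failure probabilities
    cov = np.array([[1.0e-8, 4.0e-9],          # correlated uncertainties
                    [4.0e-9, 4.0e-8]])

    f = 1.0 - (1.0 - p[0]) * (1.0 - p[1])
    grad = np.array([1.0 - p[1], 1.0 - p[0]])  # df/dp1, df/dp2
    var_f = grad @ cov @ grad                  # off-diagonal terms included
    print(f, np.sqrt(var_f))
    ```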

  15. Conducting Qualitative Data Analysis: Managing Dynamic Tensions within

    Science.gov (United States)

    Chenail, Ronald J.

    2012-01-01

    In the third of a series of "how-to" essays on conducting qualitative data analysis, Ron Chenail examines the dynamic tensions within the process of qualitative data analysis that qualitative researchers must manage in order to produce credible and creative results. These tensions include (a) the qualities of the data and the qualitative data…

  16. QUALITATIVE DATA AND ERROR MEASUREMENT IN INPUT-OUTPUT-ANALYSIS

    NARCIS (Netherlands)

    NIJKAMP, P; OOSTERHAVEN, J; OUWERSLOOT, H; RIETVELD, P

    1992-01-01

    This paper is a contribution to the rapidly emerging field of qualitative data analysis in economics. Ordinal data techniques and error measurement in input-output analysis are here combined in order to test the reliability of a low level of measurement and precision of data by means of a stochastic

  17. National dam inventory provides data for analysis

    International Nuclear Information System (INIS)

    Spragens, L.

    1992-01-01

    The Association of State Dam Safety Officials completed a dam inventory this fall. Information on approximately 90,000 state-regulated dams in the US collected during the four-year inventory is being used to build a database managed by the Federal Emergency Management Agency. In addition to ASDSO's work, the federal government conducted an inventory of federal dams. These data will be added to the state information to form one national database. The database will feature 35 data fields for each entry, including the name of the dam, its size, the name of the nearest downstream community, maximum discharge and storage volume, the date of the last inspection, and details about the emergency action plan. The program is an update of the nation's first dam inventory, required by the Dam Safety Act of 1972. The US Army Corps of Engineers completed the original inventory in 1981. The Water Resources Development Act of 1986 authorized appropriations of $2.5 million for the Corps to update the inventory. FEMA and the Corps entered into an agreement for FEMA to undertake the task for the Corps and to coordinate work on both the federal and state inventories. ASDSO compiled existing information on state-regulated dams into a common format for the database, added missing information, and established a process for continually updating data. ASDSO plans to analyze the information collected for the database. It will look at statistics for the number of dams regulated, communities that could be affected, and the number of high-hazard dams. FEMA is preparing reports for Congress on the project. The reports, which are expected to be ready by May 1993, will include information on the methodology used and facts about regulated dams under state jurisdiction

  18. Kinetic analysis of dynamic PET data

    Energy Technology Data Exchange (ETDEWEB)

    Knittel, B.

    1983-12-01

    Our goal is to quantify regional physiological processes such as blood flow and metabolism by means of tracer kinetic modeling and positron emission tomography (PET). Compartmental models are one way of characterizing the behavior of tracers in physiological systems. This paper describes a general method of estimating compartmental model rate constants from measurements of the concentration of tracers in blood and tissue, taken at multiple time intervals. A computer program which applies the method is described, and examples are shown for simulated and actual data acquired from the Donner 280-Crystal Positron Tomograph.
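
    A common concrete instance of this estimation problem is the one-tissue compartment model, where the tissue curve is the plasma input convolved with an exponential. The sketch below fits its two rate constants with scipy; the input function and noise are synthetic assumptions, not Donner tomograph data.

    ```python
    import numpy as np
    from scipy.optimize import curve_fit

    t = np.linspace(0, 60, 121)             # minutes
    dt = t[1] - t[0]
    cp = t * np.exp(-t / 4.0)               # assumed plasma input C_p(t)

    def tissue(t_, K1, k2):
        # C_t(t) = K1 * (C_p conv exp(-k2 t)), discretised convolution
        return K1 * np.convolve(cp, np.exp(-k2 * t_))[: t_.size] * dt

    measured = tissue(t, 0.4, 0.1) + 0.01 * np.random.randn(t.size)
    (K1, k2), _ = curve_fit(tissue, t, measured, p0=(0.2, 0.05))
    print(K1, k2)                           # estimated rate constants
    ```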

  19. Geostatistics and Analysis of Spatial Data

    DEFF Research Database (Denmark)

    Nielsen, Allan Aasbjerg

    2007-01-01

    Semivariogram models are described, specifically the spherical, the exponential and the Gaussian models. Equations to carry out simple and ordinary kriging are deduced. Other types of kriging are mentioned, and references to international literature, Internet addresses and state-of-the-art software in the field are given. A very simple example to illustrate the computations, and a more realistic example with height data from an area near Slagelse, Denmark, are given. Finally, a series of attractive characteristics of kriging are mentioned, and a simple sampling strategic consideration is given based on the dependence of the kriging variance
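
    An ordinary kriging sketch with an assumed spherical semivariogram follows; in a real study the variogram model and its parameters are first fitted to the data, and the sample coordinates and values here are invented.

    ```python
    import numpy as np

    def spherical(h, sill=1.0, rng=500.0):
        """Spherical semivariogram model."""
        h = np.minimum(h, rng)
        return sill * (1.5 * h / rng - 0.5 * (h / rng) ** 3)

    def ordinary_krige(xy, z, target, sill=1.0, rng=500.0):
        """Ordinary kriging at one point: solve the kriging system with
        a Lagrange multiplier so the weights sum to one."""
        n = len(z)
        d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1)
        A = np.ones((n + 1, n + 1))
        A[:n, :n] = spherical(d, sill, rng)
        A[n, n] = 0.0
        b = np.ones(n + 1)
        b[:n] = spherical(np.linalg.norm(xy - target, axis=1), sill, rng)
        w = np.linalg.solve(A, b)
        return w[:n] @ z, w @ b        # estimate, kriging variance

    xy = np.array([[0, 0], [100, 0], [0, 100], [120, 130]], float)
    z = np.array([1.2, 0.8, 1.0, 1.4])
    print(ordinary_krige(xy, z, np.array([50.0, 50.0])))
    ```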

  20. Data Analysis Techniques for Physical Scientists

    Science.gov (United States)

    Pruneau, Claude A.

    2017-10-01

    Preface; How to read this book; 1. The scientific method; Part I. Foundation in Probability and Statistics: 2. Probability; 3. Probability models; 4. Classical inference I: estimators; 5. Classical inference II: optimization; 6. Classical inference III: confidence intervals and statistical tests; 7. Bayesian inference; Part II. Measurement Techniques: 8. Basic measurements; 9. Event reconstruction; 10. Correlation functions; 11. The multiple facets of correlation functions; 12. Data correction methods; Part III. Simulation Techniques: 13. Monte Carlo methods; 14. Collision and detector modeling; List of references; Index.