WorldWideScience

Sample records for nonparametric statistical methods

  1. Application of nonparametric statistic method for DNBR limit calculation

    International Nuclear Information System (INIS)

    Dong Bo; Kuang Bo; Zhu Xuenong

    2013-01-01

    Background: Nonparametric statistical methods are statistical inference methods that do not depend on a particular distribution; they calculate tolerance limits at a given probability level and confidence through sampling. The DNBR margin is an important parameter of NPP (nuclear power plant) design that represents the safety level of the plant. Purpose and Methods: This paper uses a nonparametric statistical method based on the Wilks formula and the VIPER-01 subchannel analysis code to calculate the DNBR design limits (DL) of a 300 MW NPP during a complete loss of flow accident, and compares them with the DNBR DL obtained by means of ITDP to determine the DNBR margin. Results: The results indicate that this method gains 2.96% more DNBR margin than the ITDP methodology. Conclusions: Because conservatism is reduced during the analysis process, the nonparametric statistical method can provide greater DNBR margin, and the increased DNBR margin benefits the upgrading of the core refueling scheme. (authors)
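
    The abstract above names the Wilks formula without stating it. As a minimal sketch (not the authors' code), the first-order one-sided form asks for the smallest sample size n with 1 - coverage^n >= confidence, so that the most extreme of n code runs bounds the chosen quantile. The function name and its parameters are illustrative assumptions.

```python
import math


def wilks_first_order_sample_size(coverage: float, confidence: float) -> int:
    """Smallest n with 1 - coverage**n >= confidence (one-sided, first order).

    With n independent runs, the most extreme result then bounds the
    `coverage` quantile of the output with at least `confidence` probability.
    """
    if not (0.0 < coverage < 1.0 and 0.0 < confidence < 1.0):
        raise ValueError("coverage and confidence must lie strictly between 0 and 1")
    n = math.ceil(math.log(1.0 - confidence) / math.log(coverage))
    # Guard against floating-point rounding right at the boundary.
    while 1.0 - coverage ** n < confidence:
        n += 1
    return n


if __name__ == "__main__":
    print(wilks_first_order_sample_size(0.95, 0.95))  # 95/95 statement
```

    For a minimum quantity such as DNBR, the same count applies to the sample minimum used as a lower tolerance limit; the sample size actually used in the paper is not given in the abstract.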

  2. Nonparametric statistical inference

    CERN Document Server

    Gibbons, Jean Dickinson

    2010-01-01

    Overall, this remains a very fine book suitable for a graduate-level course in nonparametric statistics. I recommend it for all people interested in learning the basic ideas of nonparametric statistical inference. -Eugenia Stoimenova, Journal of Applied Statistics, June 2012 … one of the best books available for a graduate (or advanced undergraduate) text for a theory course on nonparametric statistics. … a very well-written and organized book on nonparametric statistics, especially useful and recommended for teachers and graduate students. -Biometrics, 67, September 2011 This excellently presente

  3. Nonparametric statistical inference

    CERN Document Server

    Gibbons, Jean Dickinson

    2014-01-01

    Thoroughly revised and reorganized, the fourth edition presents in-depth coverage of the theory and methods of the most widely used nonparametric procedures in statistical analysis and offers example applications appropriate for all areas of the social, behavioral, and life sciences. The book presents new material on the quantiles, the calculation of exact and simulated power, multiple comparisons, additional goodness-of-fit tests, methods of analysis of count data, and modern computer applications using MINITAB, SAS, and STATXACT. It includes tabular guides for simplified applications of tests and finding P values and confidence interval estimates.

  4. Decision support using nonparametric statistics

    CERN Document Server

    Beatty, Warren

    2018-01-01

    This concise volume covers nonparametric statistics topics that are most likely to be seen and used from a practical decision support perspective. While many degree programs require a course in parametric statistics, these methods are often inadequate for real-world decision making in business environments. Much of the data collected today by business executives (for example, customer satisfaction opinions) requires nonparametric statistics for valid analysis, and this book provides the reader with a set of tools that can be used to validly analyze all data, regardless of type. Through numerous examples and exercises, this book explains why nonparametric statistics will lead to better decisions and how they are used to reach a decision, with a wide array of business applications. Online resources include exercise data, spreadsheets, and solutions.

  5. Nonparametric statistics for social and behavioral sciences

    CERN Document Server

    Kraska-Miller, M

    2013-01-01

    Introduction to Research in Social and Behavioral Sciences; Basic Principles of Research; Planning for Research; Types of Research Designs; Sampling Procedures; Validity and Reliability of Measurement Instruments; Steps of the Research Process; Introduction to Nonparametric Statistics; Data Analysis; Overview of Nonparametric Statistics and Parametric Statistics; Overview of Parametric Statistics; Overview of Nonparametric Statistics; Importance of Nonparametric Methods; Measurement Instruments; Analysis of Data to Determine Association and Agreement; Pearson Chi-Square Test of Association and Independence; Contingency

  6. Statistical analysis using the Bayesian nonparametric method for irradiation embrittlement of reactor pressure vessels

    Energy Technology Data Exchange (ETDEWEB)

    Takamizawa, Hisashi, E-mail: takamizawa.hisashi@jaea.go.jp; Itoh, Hiroto, E-mail: ito.hiroto@jaea.go.jp; Nishiyama, Yutaka, E-mail: nishiyama.yutaka93@jaea.go.jp

    2016-10-15

    In order to understand neutron irradiation embrittlement in high fluence regions, statistical analysis using the Bayesian nonparametric (BNP) method was performed for the Japanese surveillance and material test reactor irradiation database. The BNP method is essentially expressed as an infinite summation of normal distributions, with input data being subdivided into clusters with identical statistical parameters, such as mean and standard deviation, for each cluster to estimate shifts in ductile-to-brittle transition temperature (DBTT). The clusters typically depend on chemical compositions, irradiation conditions, and the irradiation embrittlement. Specific variables contributing to the irradiation embrittlement include the content of Cu, Ni, P, Si, and Mn in the pressure vessel steels, neutron flux, neutron fluence, and irradiation temperatures. It was found that the measured shifts of DBTT correlated well with the calculated ones. Data associated with the same materials were subdivided into the same clusters even if neutron fluences were increased.

  7. The application of non-parametric statistical method for an ALARA implementation

    International Nuclear Information System (INIS)

    Cho, Young Ho; Herr, Young Hoi

    2003-01-01

    The cost-effective reduction of Occupational Radiation Dose (ORD) at a nuclear power plant cannot be achieved without an extensive analysis of accumulated ORD data from existing plants. This data analysis must identify which jobs at the nuclear power plant repeatedly incur high ORD. In this study, the Percentile Rank Sum Method (PRSM), which is based on non-parametric statistical theory, is proposed to identify repetitive high-ORD jobs. As a case study, the method is applied to ORD data of maintenance and repair jobs at Kori units 3 and 4 in Korea, which are pressurized water reactors with 950 MWe capacity that have been operated since 1986 and 1987, respectively. The results were verified and validated, and PRSM has been demonstrated to be an efficient method of analyzing the data

  8. Non-parametric order statistics method applied to uncertainty propagation in fuel rod calculations

    International Nuclear Information System (INIS)

    Arimescu, V.E.; Heins, L.

    2001-01-01

    Advances in modeling fuel rod behavior and accumulations of adequate experimental data have made possible the introduction of quantitative methods to estimate the uncertainty of predictions made with best-estimate fuel rod codes. The uncertainty range of the input variables is characterized by a truncated distribution which is typically a normal, lognormal, or uniform distribution. While the distribution for fabrication parameters is defined to cover the design or fabrication tolerances, the distribution of modeling parameters is inferred from the experimental database consisting of separate effects tests and global tests. The final step of the methodology uses a Monte Carlo type of random sampling of all relevant input variables and performs best-estimate code calculations to propagate these uncertainties in order to evaluate the uncertainty range of outputs of interest for design analysis, such as internal rod pressure and fuel centerline temperature. The statistical method underlying this Monte Carlo sampling is non-parametric order statistics, which is perfectly suited to evaluate quantiles of populations with unknown distribution. The application of this method is straightforward in the case of one single fuel rod, when a 95/95 statement is applicable: 'with a probability of 95% and confidence level of 95% the values of output of interest are below a certain value'. Therefore, the 0.95-quantile is estimated for the distribution of all possible values of one fuel rod with a statistical confidence of 95%. On the other hand, a more elaborate procedure is required if all the fuel rods in the core are being analyzed. In this case, the aim is to evaluate the following global statement: with 95% confidence level, the expected number of fuel rods which are not exceeding a certain value is all the fuel rods in the core except only a few fuel rods. In both cases, the thresholds determined by the analysis should be below the safety acceptable design limit. An indirect
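
    The single-rod 95/95 statement above rests on a binomial argument from order statistics. The sketch below is an illustration under stated assumptions, not the procedure of the paper: it computes the confidence that the m-th largest of n independent code runs lies at or above the chosen quantile, and searches for the smallest n that reaches the target confidence. Function and parameter names are hypothetical.

```python
from math import comb


def confidence_of_order_statistic(n: int, m: int, coverage: float) -> float:
    """Confidence that the m-th largest of n i.i.d. samples bounds the
    `coverage` quantile from above.

    The number of samples exceeding the quantile is Binomial(n, 1 - coverage);
    the m-th largest value exceeds the quantile when at least m samples do.
    """
    q = 1.0 - coverage
    p_fewer_than_m = sum(comb(n, k) * q ** k * coverage ** (n - k) for k in range(m))
    return 1.0 - p_fewer_than_m


def min_sample_size(m: int, coverage: float = 0.95, confidence: float = 0.95) -> int:
    """Smallest n for which the m-th largest sample meets the coverage/confidence pair."""
    n = m
    while confidence_of_order_statistic(n, m, coverage) < confidence:
        n += 1
    return n


if __name__ == "__main__":
    for order in (1, 2):
        print(order, min_sample_size(order))
```

    Using a higher order m costs more runs but makes the bound less sensitive to a single extreme calculation. The whole-core statement described in the abstract needs a more elaborate procedure that this sketch does not cover.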

  9. Nonparametric predictive inference in statistical process control

    NARCIS (Netherlands)

    Arts, G.R.J.; Coolen, F.P.A.; Laan, van der P.

    2000-01-01

    New methods for statistical process control are presented, where the inferences have a nonparametric predictive nature. We consider several problems in process control in terms of uncertainties about future observable random quantities, and we develop inferences for these random quantities based on

  10. A nonparametric statistical method for determination of a confidence interval for the mean of a set of results obtained in a laboratory intercomparison

    International Nuclear Information System (INIS)

    Veglia, A.

    1981-08-01

    In cases where sets of data are obviously not normally distributed, the application of a nonparametric method for the estimation of a confidence interval for the mean seems to be more suitable than some other methods because such a method requires few assumptions about the population of data. A two-step statistical method is proposed which can be applied to any set of analytical results: elimination of outliers by a nonparametric method based on Tchebycheff's inequality, and determination of a confidence interval for the mean by a nonparametric method based on the binomial distribution. The method is appropriate only for samples of size n>=10
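
    The abstract does not spell out its binomial-based interval. As a stand-in, the sketch below shows the classical order-statistic interval for the median, which relies on the Binomial(n, 1/2) distribution and agrees with an interval for the mean only when the population is symmetric. It is an assumption-laden illustration, not Veglia's procedure, and it omits the Tchebycheff-based outlier step.

```python
from math import comb


def median_confidence_interval(data, confidence=0.95):
    """Distribution-free interval [x_(k), x_(n-k+1)] for the median.

    k is the largest rank with P(Binomial(n, 1/2) <= k - 1) <= alpha / 2.
    Returns (lower, upper, achieved_confidence).
    """
    x = sorted(data)
    n = len(x)
    alpha = 1.0 - confidence

    def cdf(j):
        # P(Binomial(n, 1/2) <= j)
        return sum(comb(n, i) for i in range(j + 1)) / 2 ** n

    k = 0
    while k < n // 2 and cdf(k) <= alpha / 2:
        k += 1
    if k == 0:
        raise ValueError("sample too small for the requested confidence")
    achieved = 1.0 - 2.0 * cdf(k - 1)
    return x[k - 1], x[n - k], achieved


if __name__ == "__main__":
    print(median_confidence_interval([7, 3, 9, 1, 5, 10, 2, 8, 4, 6]))
```

    For ten observations the interval runs from the second smallest to the second largest value, with an achieved confidence a little above the nominal 95%.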

  11. Nonparametric statistics with applications to science and engineering

    CERN Document Server

    Kvam, Paul H

    2007-01-01

    A thorough and definitive book that fully addresses traditional and modern-day topics of nonparametric statistics. This book presents a practical approach to nonparametric statistical analysis and provides comprehensive coverage of both established and newly developed methods. With the use of MATLAB, the authors present information on theorems and rank tests in an applied fashion, with an emphasis on modern methods in regression and curve fitting, bootstrap confidence intervals, splines, wavelets, empirical likelihood, and goodness-of-fit testing. Nonparametric Statistics with Applications to Science and Engineering begins with succinct coverage of basic results for order statistics, methods of categorical data analysis, nonparametric regression, and curve fitting methods. The authors then focus on nonparametric procedures that are becoming more relevant to engineering researchers and practitioners. The important fundamental materials needed to effectively learn and apply the discussed methods are also provide...

  12. Teaching Nonparametric Statistics Using Student Instrumental Values.

    Science.gov (United States)

    Anderson, Jonathan W.; Diddams, Margaret

    Nonparametric statistics are often difficult to teach in introduction to statistics courses because of the lack of real-world examples. This study demonstrated how teachers can use differences in the rankings and ratings of undergraduate and graduate values to discuss: (1) ipsative and normative scaling; (2) uses of the Mann-Whitney U-test; and…

  13. Nonparametric predictive inference in statistical process control

    NARCIS (Netherlands)

    Arts, G.R.J.; Coolen, F.P.A.; Laan, van der P.

    2004-01-01

    Statistical process control (SPC) is used to decide when to stop a process as confidence in the quality of the next item(s) is low. Information to specify a parametric model is not always available, and as SPC is of a predictive nature, we present a control chart developed using nonparametric

  14. A nonparametric spatial scan statistic for continuous data.

    Science.gov (United States)

    Jung, Inkyung; Cho, Ho Jin

    2015-10-20

    Spatial scan statistics are widely used for spatial cluster detection, and several parametric models exist. For continuous data, a normal-based scan statistic can be used. However, the performance of the model has not been fully evaluated for non-normal data. We propose a nonparametric spatial scan statistic based on the Wilcoxon rank-sum test statistic and compared the performance of the method with parametric models via a simulation study under various scenarios. The nonparametric method outperforms the normal-based scan statistic in terms of power and accuracy in almost all cases under consideration in the simulation study. The proposed nonparametric spatial scan statistic is therefore an excellent alternative to the normal model for continuous data and is especially useful for data following skewed or heavy-tailed distributions.
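
    To make the rank-based ingredient concrete, the sketch below computes a Wilcoxon rank-sum statistic and its normal-approximation z-score for values inside a candidate window versus values outside it. The split into `inside` and `outside`, the function names, and the omission of a tie correction in the variance are assumptions made for illustration; the scan over candidate windows and the Monte Carlo inference used in scan statistics are not shown.

```python
import math


def midranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for t in range(i, j + 1):
            ranks[order[t]] = average_rank
        i = j + 1
    return ranks


def rank_sum_z(inside, outside):
    """Rank sum W of `inside` in the pooled sample and its standardized value."""
    n1, n2 = len(inside), len(outside)
    ranks = midranks(list(inside) + list(outside))
    w = sum(ranks[:n1])
    mean_w = n1 * (n1 + n2 + 1) / 2.0
    var_w = n1 * n2 * (n1 + n2 + 1) / 12.0  # no tie correction in this sketch
    return w, (w - mean_w) / math.sqrt(var_w)


if __name__ == "__main__":
    print(rank_sum_z([5, 6, 7], [1, 2, 3, 4]))
```

    Because only ranks enter the statistic, a single extreme observation cannot dominate it, which is consistent with the abstract's point about skewed or heavy-tailed data.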

  15. Recent Advances and Trends in Nonparametric Statistics

    CERN Document Server

    Akritas, MG

    2003-01-01

    The advent of high-speed, affordable computers in the last two decades has given a new boost to the nonparametric way of thinking. Classical nonparametric procedures, such as function smoothing, suddenly lost their abstract flavour as they became practically implementable. In addition, many previously unthinkable possibilities became mainstream; prime examples include the bootstrap and resampling methods, wavelets and nonlinear smoothers, graphical methods, data mining, bioinformatics, as well as the more recent algorithmic approaches such as bagging and boosting. This volume is a collection o

  16. Introduction to nonparametric statistics for the biological sciences using R

    CERN Document Server

    MacFarland, Thomas W

    2016-01-01

    This book contains a rich set of tools for nonparametric analyses, and the purpose of this supplemental text is to provide guidance to students and professional researchers on how R is used for nonparametric data analysis in the biological sciences: to introduce when nonparametric approaches to data analysis are appropriate; to introduce the leading nonparametric tests commonly used in biostatistics and how R is used to generate appropriate statistics for each test; and to introduce common figures typically associated with nonparametric data analysis and how R is used to generate appropriate figures in support of each data set. The book focuses on how R is used to distinguish between data that could be classified as nonparametric as opposed to data that could be classified as parametric, with both approaches to data classification covered extensively. Following an introductory lesson on nonparametric statistics for the biological sciences, the book is organized into eight self-contained lessons on various analyses a...

  17. STATCAT, Statistical Analysis of Parametric and Non-Parametric Data

    International Nuclear Information System (INIS)

    David, Hugh

    1990-01-01

    1 - Description of program or function: A suite of 26 programs designed to facilitate the appropriate statistical analysis and data handling of parametric and non-parametric data, using classical and modern univariate and multivariate methods. 2 - Method of solution: Data is read entry by entry, using a choice of input formats, and the resultant data bank is checked for out-of-range, rare, extreme or missing data. The completed STATCAT data bank can be treated by a variety of descriptive and inferential statistical methods, and modified, using other standard programs as required

  18. 2nd Conference of the International Society for Nonparametric Statistics

    CERN Document Server

    Manteiga, Wenceslao; Romo, Juan

    2016-01-01

    This volume collects selected, peer-reviewed contributions from the 2nd Conference of the International Society for Nonparametric Statistics (ISNPS), held in Cádiz (Spain) between June 11–16 2014, and sponsored by the American Statistical Association, the Institute of Mathematical Statistics, the Bernoulli Society for Mathematical Statistics and Probability, the Journal of Nonparametric Statistics and Universidad Carlos III de Madrid. The 15 articles are a representative sample of the 336 contributed papers presented at the conference. They cover topics such as high-dimensional data modelling, inference for stochastic processes and for dependent data, nonparametric and goodness-of-fit testing, nonparametric curve estimation, object-oriented data analysis, and semiparametric inference. The aim of the ISNPS 2014 conference was to bring together recent advances and trends in several areas of nonparametric statistics in order to facilitate the exchange of research ideas, promote collaboration among researchers...

  19. Application of nonparametric statistics to material strength/reliability assessment

    International Nuclear Information System (INIS)

    Arai, Taketoshi

    1992-01-01

    Advanced material technology requires a database on a wide variety of material behavior, which needs to be established experimentally. Experiments are often practically limited in terms of reproducibility or the range of test parameters. Statistical methods can be applied to quantify the resulting uncertainties as required from the reliability point of view. Statistical assessment involves determining a most probable value and the maximum and/or minimum value as a one-sided or two-sided confidence limit. A scatter of test data can be approximated by a theoretical distribution only if the goodness of fit satisfies a test criterion. Alternatively, nonparametric statistics (NPS) or distribution-free statistics can be applied. Mathematical procedures by NPS are well established for dealing with most reliability problems. They handle only order statistics of a sample. Mathematical formulas and some applications to engineering assessments are described. They include confidence limits of the median, population coverage of a sample, the required minimum sample size, and confidence limits of fracture probability. These applications demonstrate that nonparametric statistical estimation is useful in logical decision making when a large uncertainty exists. (author)

  20. Nonparametric methods for volatility density estimation

    NARCIS (Netherlands)

    Es, van Bert; Spreij, P.J.C.; Zanten, van J.H.

    2009-01-01

    Stochastic volatility modelling of financial processes has become increasingly popular. The proposed models usually contain a stationary volatility process. We will motivate and review several nonparametric methods for estimation of the density of the volatility process. Both models based on

  1. Speaker Linking and Applications using Non-Parametric Hashing Methods

    Science.gov (United States)

    2016-09-08

    nonparametric estimate of a multivariate density function,” The Annals of Mathematical Statistics, vol. 36, no. 3, pp. 1049–1051, 1965. [9] E. A. Patrick...Speaker Linking and Applications using Non-Parametric Hashing Methods† Douglas Sturim and William M. Campbell MIT Lincoln Laboratory, Lexington, MA...with many approaches [1, 2]. For this paper, we focus on using i-vectors [2], but the methods apply to any embedding. For the task of speaker QBE and

  2. Methodology in robust and nonparametric statistics

    CERN Document Server

    Jurecková, Jana; Picek, Jan

    2012-01-01

    Introduction and Synopsis; Introduction; Synopsis; Preliminaries; Introduction; Inference in Linear Models; Robustness Concepts; Robust and Minimax Estimation of Location; Clippings from Probability and Asymptotic Theory; Problems; Robust Estimation of Location and Regression; Introduction; M-Estimators; L-Estimators; R-Estimators; Minimum Distance and Pitman Estimators; Differentiable Statistical Functions; Problems; Asymptotic Representations for L-Estimators

  3. Statistical decisions under nonparametric a priori information

    International Nuclear Information System (INIS)

    Chilingaryan, A.A.

    1985-01-01

    The basic module of an applied program package for statistical analysis of the ANI experiment data is described. This module is used to choose the theoretical model that most adequately fits the experimental data, to select events of a definite type, and to identify elementary particles. To solve these problems, Bayesian rules, the leave-one-out test, and KNN (K Nearest Neighbour) adaptive density estimation are used
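
    As a rough illustration of the KNN density estimation mentioned above, the sketch below gives a one-dimensional version: the density at a point is estimated as k divided by the sample size times the length of the smallest interval around the point that contains its k nearest samples. The one-dimensional setting and all names are assumptions; the ANI package itself is not described in enough detail here to reproduce.

```python
def knn_density_1d(x, sample, k):
    """k-nearest-neighbour density estimate at x for one-dimensional data.

    Uses f(x) ~ k / (n * 2 * r_k), where r_k is the distance from x to its
    k-th nearest sample point, so the window adapts to the local density.
    """
    n = len(sample)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the sample size")
    distances = sorted(abs(x - s) for s in sample)
    r_k = distances[k - 1]
    if r_k == 0.0:
        raise ValueError("k-th neighbour coincides with x; increase k")
    return k / (n * 2.0 * r_k)


if __name__ == "__main__":
    print(knn_density_1d(2.0, [0.0, 1.0, 2.0, 3.0, 4.0], k=3))
```

    The estimate is adaptive in the sense that the window shrinks where samples are dense and widens where they are sparse.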

  4. Nonparametric statistics a step-by-step approach

    CERN Document Server

    Corder, Gregory W

    2014-01-01

    "…a very useful resource for courses in nonparametric statistics in which the emphasis is on applications rather than on theory. It also deserves a place in libraries of all institutions where introductory statistics courses are taught." -CHOICE This Second Edition presents a practical and understandable approach that enhances and expands the statistical toolset for readers. This book includes: New coverage of the sign test and the Kolmogorov-Smirnov two-sample test in an effort to offer a logical and natural progression to statistical power; SPSS® (Version 21) software and updated screen ca

  5. Portfolio optimization based on nonparametric estimation methods

    Directory of Open Access Journals (Sweden)

    Mahsa Ghandehari

    2017-03-01

    One of the major issues investors face in capital markets is deciding which stock exchange to invest in and selecting an optimal portfolio. This process is carried out by assessing risk and expected return. In the portfolio selection problem, if the assets' expected returns are normally distributed, variance and standard deviation are used as risk measures. However, expected returns on assets are not necessarily normal and sometimes differ dramatically from the normal distribution. This paper introduces conditional value at risk (CVaR) as a measure of risk in a nonparametric framework, derives the optimal portfolio for a given expected return, and compares this method with the linear programming method. The data used in this study consist of monthly returns of 15 companies selected from the top 50 companies on the Tehran Stock Exchange during the winter of 1392, covering April 1388 to June 1393. The results show that the nonparametric method is superior to the linear programming method and is much faster.
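
    The abstract does not state which nonparametric CVaR estimator is used. One common sample-based choice, shown here purely as an assumption, takes the empirical value at risk as the smallest loss in the worst (1 - alpha) fraction of observations and CVaR as the mean of that tail. The function name and the example returns are hypothetical.

```python
import math


def empirical_var_cvar(returns, alpha=0.95):
    """Sample-based value at risk and conditional value at risk of losses.

    Losses are the negated returns. The tail holds the worst (1 - alpha)
    fraction of observations, rounded up to at least one observation.
    """
    if not returns:
        raise ValueError("returns must not be empty")
    losses = sorted((-r for r in returns), reverse=True)  # largest loss first
    # round() absorbs floating-point noise such as (1 - 0.8) * 10 = 1.9999999999999996
    n_tail = max(1, math.ceil(round((1.0 - alpha) * len(losses), 9)))
    tail = losses[:n_tail]
    value_at_risk = tail[-1]
    conditional_value_at_risk = sum(tail) / n_tail
    return value_at_risk, conditional_value_at_risk


if __name__ == "__main__":
    monthly_returns = [-0.10, -0.05, 0.0, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08]
    print(empirical_var_cvar(monthly_returns, alpha=0.8))
```

    No normality assumption enters the estimate, which is the property the abstract appeals to; the portfolio optimization step itself is not sketched here.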

  6. Categorical and nonparametric data analysis choosing the best statistical technique

    CERN Document Server

    Nussbaum, E Michael

    2014-01-01

    Featuring in-depth coverage of categorical and nonparametric statistics, this book provides a conceptual framework for choosing the most appropriate type of test in various research scenarios. Class tested at the University of Nevada, the book's clear explanations of the underlying assumptions, computer simulations, and Exploring the Concept boxes help reduce reader anxiety. Problems inspired by actual studies provide meaningful illustrations of the techniques. The underlying assumptions of each test and the factors that impact validity and statistical power are reviewed so readers can explain

  7. 1st Conference of the International Society for Nonparametric Statistics

    CERN Document Server

    Lahiri, S; Politis, Dimitris

    2014-01-01

    This volume is composed of peer-reviewed papers that have developed from the First Conference of the International Society for NonParametric Statistics (ISNPS). This inaugural conference took place in Chalkidiki, Greece, June 15-19, 2012. It was organized with the co-sponsorship of the IMS, the ISI, and other organizations. M.G. Akritas, S.N. Lahiri, and D.N. Politis are the first executive committee members of ISNPS, and the editors of this volume. ISNPS has a distinguished Advisory Committee that includes Professors R.Beran, P.Bickel, R. Carroll, D. Cook, P. Hall, R. Johnson, B. Lindsay, E. Parzen, P. Robinson, M. Rosenblatt, G. Roussas, T. SubbaRao, and G. Wahba. The Charting Committee of ISNPS consists of more than 50 prominent researchers from all over the world.   The chapters in this volume bring forth recent advances and trends in several areas of nonparametric statistics. In this way, the volume facilitates the exchange of research ideas, promotes collaboration among researchers from all over the wo...

  8. Nonparametric methods in actigraphy: An update

    Directory of Open Access Journals (Sweden)

    Bruno S.B. Gonçalves

    2014-09-01

    Circadian rhythmicity in humans has been well studied using actigraphy, a method of measuring gross motor movement. As actigraphic technology continues to evolve, it is important for data analysis to keep pace with new variables and features. Our objective is to study the behavior of two variables, interdaily stability (IS) and intradaily variability (IV), to describe rest-activity rhythm. Simulated data and actigraphy data of humans, rats, and marmosets were used in this study. We modified the method of calculation for IV and IS by modifying the time intervals of analysis. For each variable, we calculated the average value (IVm and ISm) of the results for each time interval. Simulated data showed that (1) synchronization analysis depends on sample size, and (2) fragmentation is independent of the amplitude of the generated noise. We were able to obtain a significant difference in the fragmentation patterns of stroke patients using an IVm variable, while the variable IV60 was not identified. Rhythmic synchronization of activity and rest was significantly higher in young than adults with Parkinson's when using the ISm variable; however, this difference was not seen using IS60. We propose an updated format to calculate rhythmic fragmentation, including two additional optional variables. These alternative methods of nonparametric analysis aim to more precisely detect sleep–wake cycle fragmentation and synchronization.
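
    For orientation, the sketch below computes the two quantities in their commonly cited forms: IS as the variance of the average 24-hour profile relative to the total variance, and IV as the mean squared successive difference relative to the total variance. These definitions, the `period` argument (number of samples per 24 hours), and the function names are assumptions; the abstract's IVm and ISm average such values over several time-interval widths, which is not reproduced here.

```python
def interdaily_stability(x, period):
    """IS = variance of the mean daily profile / total variance.

    `x` holds equally spaced activity counts covering whole days and
    `period` is the number of samples per 24 hours.
    """
    n = len(x)
    if n == 0 or n % period != 0:
        raise ValueError("series must cover a whole number of days")
    days = n // period
    mean = sum(x) / n
    profile = [sum(x[h::period]) / days for h in range(period)]
    between = sum((m - mean) ** 2 for m in profile) / period
    total = sum((v - mean) ** 2 for v in x) / n
    return between / total


def intradaily_variability(x):
    """IV = mean squared successive difference / total variance."""
    n = len(x)
    mean = sum(x) / n
    successive = sum((x[i] - x[i - 1]) ** 2 for i in range(1, n)) / (n - 1)
    total = sum((v - mean) ** 2 for v in x) / n
    return successive / total


if __name__ == "__main__":
    series = [0, 0, 1, 1] * 3  # three identical "days" of four samples each
    print(interdaily_stability(series, period=4), intradaily_variability(series))
```

    A series that repeats exactly from day to day gives IS equal to 1, while rapid alternation between rest and activity pushes IV up.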

  9. Nonparametric Change Point Diagnosis Method of Concrete Dam Crack Behavior Abnormality

    Directory of Open Access Journals (Sweden)

    Zhanchao Li

    2013-01-01

    The diagnosis of abnormal crack behavior in concrete has long been both a focus and a difficulty in the safety monitoring of hydraulic structures. Based on how abnormal concrete dam crack behavior manifests in parametric and nonparametric statistical models, the internal relation between crack behavior abnormality and statistical change point theory is analyzed in depth, from the structural instability of the parametric statistical model and the change in the sequence distribution law of the nonparametric statistical model. On this basis, through reduction of the change point problem, establishment of a basic nonparametric change point model, and asymptotic analysis of the test method for the basic change point problem, a nonparametric change point diagnosis method for concrete dam crack behavior abnormality is developed that accounts for the fact that, in practice, crack behavior may have several abnormality points. The method is applied to an actual project, demonstrating its effectiveness and scientific soundness. The method also has a complete theoretical basis and strong practicality, with broad application prospects in actual projects.

  10. Modern nonparametric, robust and multivariate methods festschrift in honour of Hannu Oja

    CERN Document Server

    Taskinen, Sara

    2015-01-01

    Written by leading experts in the field, this edited volume brings together the latest findings in the area of nonparametric, robust and multivariate statistical methods. The individual contributions cover a wide variety of topics ranging from univariate nonparametric methods to robust methods for complex data structures. Some examples from statistical signal processing are also given. The volume is dedicated to Hannu Oja on the occasion of his 65th birthday and is intended for researchers as well as PhD students with a good knowledge of statistics.

  11. Statistical methods

    CERN Document Server

    Szulc, Stefan

    1965-01-01

    Statistical Methods provides a discussion of the principles of the organization and technique of research, with emphasis on its application to the problems in social statistics. This book discusses branch statistics, which aims to develop practical ways of collecting and processing numerical data and to adapt general statistical methods to the objectives in a given field. Organized into five parts encompassing 22 chapters, this book begins with an overview of how to organize the collection of such information on individual units, primarily as accomplished by government agencies. This text then

  12. Nonparametric Change Point Diagnosis Method of Concrete Dam Crack Behavior Abnormality

    OpenAIRE

    Li, Zhanchao; Gu, Chongshi; Wu, Zhongru

    2013-01-01

    The diagnosis of abnormal crack behavior in concrete has long been both a focus and a difficulty in the safety monitoring of hydraulic structures. Based on how abnormal concrete dam crack behavior manifests in parametric and nonparametric statistical models, the internal relation between crack behavior abnormality and statistical change point theory is analyzed in depth, from the structural instability of the parametric statistical model ...

  13. Comparing parametric and nonparametric regression methods for panel data

    DEFF Research Database (Denmark)

    Czekaj, Tomasz Gerard; Henningsen, Arne

    We investigate and compare the suitability of parametric and non-parametric stochastic regression methods for analysing production technologies and the optimal firm size. Our theoretical analysis shows that the most commonly used functional forms in empirical production analysis, Cobb-Douglas and Translog, are unsuitable for analysing the optimal firm size. We show that the Translog functional form implies an implausible linear relationship between the (logarithmic) firm size and the elasticity of scale, where the slope is artificially related to the substitutability between the inputs. The practical applicability of the parametric and non-parametric regression methods is scrutinised and compared by an empirical example: we analyse the production technology and investigate the optimal size of Polish crop farms based on a firm-level balanced panel data set. A nonparametric specification test...

  14. Examples of the Application of Nonparametric Information Geometry to Statistical Physics

    Directory of Open Access Journals (Sweden)

    Giovanni Pistone

    2013-09-01

    We review a nonparametric version of Amari’s information geometry in which the set of positive probability densities on a given sample space is endowed with an atlas of charts to form a differentiable manifold modeled on Orlicz Banach spaces. This nonparametric setting is used to discuss the setting of typical problems in machine learning and statistical physics, such as black-box optimization, Kullback-Leibler divergence, Boltzmann-Gibbs entropy and the Boltzmann equation.

  15. A robust nonparametric method for quantifying undetected extinctions.

    Science.gov (United States)

    Chisholm, Ryan A; Giam, Xingli; Sadanandan, Keren R; Fung, Tak; Rheindt, Frank E

    2016-06-01

    How many species have gone extinct in modern times before being described by science? To answer this question, and thereby get a full assessment of humanity's impact on biodiversity, statistical methods that quantify undetected extinctions are required. Such methods have been developed recently, but they are limited by their reliance on parametric assumptions; specifically, they assume the pools of extant and undetected species decay exponentially, whereas real detection rates vary temporally with survey effort and real extinction rates vary with the waxing and waning of threatening processes. We devised a new, nonparametric method for estimating undetected extinctions. As inputs, the method requires only the first and last date at which each species in an ensemble was recorded. As outputs, the method provides estimates of the proportion of species that have gone extinct, detected, or undetected and, in the special case where the number of undetected extant species in the present day is assumed close to zero, of the absolute number of undetected extinct species. The main assumption of the method is that the per-species extinction rate is independent of whether a species has been detected or not. We applied the method to the resident native bird fauna of Singapore. Of 195 recorded species, 58 (29.7%) have gone extinct in the last 200 years. Our method projected that an additional 9.6 species (95% CI 3.4, 19.8) have gone extinct without first being recorded, implying a true extinction rate of 33.0% (95% CI 31.0%, 36.2%). We provide R code for implementing our method. Because our method does not depend on strong assumptions, we expect it to be broadly useful for quantifying undetected extinctions. © 2016 Society for Conservation Biology.

  16. A non-parametric method for correction of global radiation observations

    DEFF Research Database (Denmark)

    Bacher, Peder; Madsen, Henrik; Perers, Bengt

    2013-01-01

    in the observations are corrected. These are errors such as: tilt in the leveling of the sensor, shadowing from surrounding objects, clipping and saturation in the signal processing, and errors from dirt and wear. The method is based on a statistical non-parametric clear-sky model which is applied to both...

  17. Investigation of MLE in nonparametric estimation methods of reliability function

    International Nuclear Information System (INIS)

    Ahn, Kwang Won; Kim, Yoon Ik; Chung, Chang Hyun; Kim, Kil Yoo

    2001-01-01

    There have been many attempts to estimate a reliability function. At the 20th ESReDA seminar, a new nonparametric method was proposed. The major point of that paper is how to use censored data efficiently. Generally, there are three approaches to estimating a reliability function nonparametrically: the Reduced Sample Method, the Actuarial Method and the Product-Limit (PL) Method. These three methods have some limitations, so we suggest an advanced method that reflects censored information more efficiently. In many instances there will be a unique maximum likelihood estimator (MLE) of an unknown parameter, and often it may be obtained by differentiation. It is well known that the three methods generally used to estimate a reliability function nonparametrically have maximum likelihood estimators that exist uniquely. The MLE of the new method is therefore derived in this study. The procedure for calculating the MLE is similar to that of the PL estimator. The difference between the two is that in the new method the mass (or weight) of each is influenced by the others, whereas in the PL estimator it is not.
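
For orientation, the Product-Limit (Kaplan-Meier) estimator mentioned in this abstract can be sketched as follows. This is a minimal illustration of the classical PL estimator only, not of the new method the authors propose; the function name `product_limit` and the sample data are hypothetical.

```python
def product_limit(times, events):
    """Classical Product-Limit (Kaplan-Meier) estimate of the reliability function.

    times  : observed times (failure or censoring)
    events : True/1 if the time is a failure, False/0 if it is censored
    Returns a list of (t, S(t)) pairs, one per distinct failure time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        failures = 0
        removed = 0
        # Group all observations that share the same time.
        while i < len(data) and data[i][0] == t:
            failures += 1 if data[i][1] else 0
            removed += 1
            i += 1
        if failures:
            # Each failure time multiplies S by the conditional survival fraction.
            surv *= 1.0 - failures / n_at_risk
            curve.append((t, surv))
        # Censored units leave the risk set without changing S.
        n_at_risk -= removed
    return curve


# Hypothetical data: five units, two of them censored (event = 0).
print(product_limit([1, 2, 3, 4, 5], [1, 0, 1, 1, 0]))
```

In this sketch a censored unit only shrinks the risk set, which is how the PL estimator uses censored information.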

  18. Nonparametric method for failures diagnosis in the actuating subsystem of aircraft control system

    Science.gov (United States)

    Terentev, M. N.; Karpenko, S. S.; Zybin, E. Yu; Kosyanchuk, V. V.

    2018-02-01

    In this paper we design a nonparametric method for failures diagnosis in the aircraft control system that uses the measurements of the control signals and the aircraft states only. It doesn’t require a priori information of the aircraft model parameters, training or statistical calculations, and is based on analytical nonparametric one-step-ahead state prediction approach. This makes it possible to predict the behavior of unidentified and failure dynamic systems, to weaken the requirements to control signals, and to reduce the diagnostic time and problem complexity.

  19. Nonparametric Bayesian predictive distributions for future order statistics

    Science.gov (United States)

    Richard A. Johnson; James W. Evans; David W. Green

    1999-01-01

    We derive the predictive distribution for a specified order statistic, determined from a future random sample, under a Dirichlet process prior. Two variants of the approach are treated and some limiting cases studied. A practical application to monitoring the strength of lumber is discussed including choices of prior expectation and comparisons made to a Bayesian...

  20. International Conference on Robust Rank-Based and Nonparametric Methods

    CERN Document Server

    McKean, Joseph

    2016-01-01

    The contributors to this volume include many of the distinguished researchers in this area. Many of these scholars have collaborated with Joseph McKean to develop underlying theory for these methods, obtain small sample corrections, and develop efficient algorithms for their computation. The papers cover the scope of the area, including robust nonparametric rank-based procedures through Bayesian and big data rank-based analyses. Areas of application include biostatistics and spatial areas. Over the last 30 years, robust rank-based and nonparametric methods have developed considerably. These procedures generalize traditional Wilcoxon-type methods for one- and two-sample location problems. Research into these procedures has culminated in complete analyses for many of the models used in practice including linear, generalized linear, mixed, and nonlinear models. Settings are both multivariate and univariate. With the development of R packages in these areas, computation of these procedures is easily shared with r...

  1. Bayesian Nonparametric Statistical Inference for Shock Models and Wear Processes.

    Science.gov (United States)

    1979-12-01

    also note that the results in Section 2 do not depend on the support of F .) This shock model has been studied by Esary, Marshall and Proschan (1973...Barlow and Proschan (1975), among others. The analogy of the shock model in risk and actuarial analysis has been given by Bühlmann (1970, Chapter 2... Mathematical Statistics, Vol. 4, pp. 894-906. Billingsley, P. (1968), CONVERGENCE OF PROBABILITY MEASURES, John Wiley, New York. Bühlmann, H. (1970

  2. Using non-parametric methods in econometric production analysis

    DEFF Research Database (Denmark)

    Czekaj, Tomasz Gerard; Henningsen, Arne

    2012-01-01

    by investigating the relationship between the elasticity of scale and the farm size. We use a balanced panel data set of 371 specialised crop farms for the years 2004-2007. A non-parametric specification test shows that neither the Cobb-Douglas function nor the Translog function are consistent with the "true......Econometric estimation of production functions is one of the most common methods in applied economic production analysis. These studies usually apply parametric estimation techniques, which obligate the researcher to specify a functional form of the production function of which the Cobb...... parameter estimates, but also in biased measures which are derived from the parameters, such as elasticities. Therefore, we propose to use non-parametric econometric methods. First, these can be applied to verify the functional form used in parametric production analysis. Second, they can be directly used...

  3. THE GROWTH POINTS OF STATISTICAL METHODS

    OpenAIRE

    Orlov A. I.

    2014-01-01

    On the basis of a new paradigm of applied mathematical statistics, data analysis and economic-mathematical methods are identified; we have also discussed five topical areas in which modern applied statistics is developing as well as the other statistical methods, i.e. five "growth points" – nonparametric statistics, robustness, computer-statistical methods, statistics of interval data, statistics of non-numeric data

  4. Statistical methods for ranking data

    CERN Document Server

    Alvo, Mayer

    2014-01-01

    This book introduces advanced undergraduate, graduate students and practitioners to statistical methods for ranking data. An important aspect of nonparametric statistics is oriented towards the use of ranking data. Rank correlation is defined through the notion of distance functions and the notion of compatibility is introduced to deal with incomplete data. Ranking data are also modeled using a variety of modern tools such as CART, MCMC, EM algorithm and factor analysis. This book deals with statistical methods used for analyzing such data and provides a novel and unifying approach for hypotheses testing. The techniques described in the book are illustrated with examples and the statistical software is provided on the authors’ website.

  5. Notes on the Implementation of Non-Parametric Statistics within the Westinghouse Realistic Large Break LOCA Evaluation Model (ASTRUM)

    International Nuclear Information System (INIS)

    Frepoli, Cesare; Oriani, Luca

    2006-01-01

    In recent years, non-parametric or order statistics methods have been widely used to assess the impact of the uncertainties within Best-Estimate LOCA evaluation models. The bounding of the uncertainties is achieved with a direct Monte Carlo sampling of the uncertainty attributes, with the minimum trial number selected to 'stabilize' the estimation of the critical output values (peak cladding temperature (PCT), local maximum oxidation (LMO), and core-wide oxidation (CWO)). A non-parametric order statistics uncertainty analysis was recently implemented within the Westinghouse Realistic Large Break LOCA evaluation model, also referred to as 'Automated Statistical Treatment of Uncertainty Method' (ASTRUM). The implementation or interpretation of order statistics in safety analysis is not fully consistent within the industry. This has led to an extensive public debate among regulators and researchers which can be found in the open literature. The USNRC-approved Westinghouse method follows a rigorous implementation of the order statistics theory, which leads to the execution of 124 simulations within a Large Break LOCA analysis. This is a solid approach which guarantees that a bounding value (at 95% probability) of the 95th percentile for each of the three 10 CFR 50.46 ECCS design acceptance criteria (PCT, LMO and CWO) is obtained. The objective of this paper is to provide additional insights on the ASTRUM statistical approach, with a more in-depth analysis of pros and cons of the order statistics and of the Westinghouse approach in the implementation of this statistical methodology. (authors)
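
The run count quoted above can be reproduced with a short calculation. The sketch below assumes the commonly cited multi-output extension of Wilks' one-sided formula, in which the confidence is the probability that at least `n_outputs` of the runs fall above the target percentile; the abstract does not state that this is exactly the derivation used in ASTRUM, and the function names are hypothetical.

```python
from math import comb


def wilks_confidence(n_runs, coverage=0.95, n_outputs=1):
    """Confidence that the order-statistics bound covers the `coverage` quantile.

    Assumption: one-sided limits and the multi-output generalisation of Wilks'
    formula, i.e. confidence = P(at least n_outputs of the runs exceed the quantile).
    """
    exceed = 1.0 - coverage
    # Probability that fewer than n_outputs runs exceed the quantile.
    tail = sum(
        comb(n_runs, k) * exceed**k * coverage ** (n_runs - k)
        for k in range(n_outputs)
    )
    return 1.0 - tail


def min_runs(coverage=0.95, confidence=0.95, n_outputs=1):
    """Smallest number of Monte Carlo runs reaching the requested confidence."""
    n = n_outputs
    while wilks_confidence(n, coverage, n_outputs) < confidence:
        n += 1
    return n


for p in (1, 2, 3):
    print(p, min_runs(0.95, 0.95, p))  # 59, 93 and 124 runs respectively
```

Under these assumptions, three jointly bounded outputs at the 95%/95% level give 124 runs, matching the figure in the abstract.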

  6. Using non-parametric methods in econometric production analysis

    DEFF Research Database (Denmark)

    Czekaj, Tomasz Gerard; Henningsen, Arne

    Econometric estimation of production functions is one of the most common methods in applied economic production analysis. These studies usually apply parametric estimation techniques, which obligate the researcher to specify the functional form of the production function. Most often, the Cobb...... results—including measures that are of interest of applied economists, such as elasticities. Therefore, we propose to use nonparametric econometric methods. First, they can be applied to verify the functional form used in parametric estimations of production functions. Second, they can be directly used...

  7. Digital spectral analysis parametric, non-parametric and advanced methods

    CERN Document Server

    Castanié, Francis

    2013-01-01

    Digital Spectral Analysis provides a single source that offers complete coverage of the spectral analysis domain. This self-contained work includes details on advanced topics that are usually presented in scattered sources throughout the literature. The theoretical principles necessary for the understanding of spectral analysis are discussed in the first four chapters: fundamentals, digital signal processing, estimation in spectral analysis, and time-series models. An entire chapter is devoted to the non-parametric methods most widely used in industry. High resolution methods a

  8. A nonparametric approach to calculate critical micelle concentrations: the local polynomial regression method

    Energy Technology Data Exchange (ETDEWEB)

    Lopez Fontan, J.L.; Costa, J.; Ruso, J.M.; Prieto, G. [Dept. of Applied Physics, Univ. of Santiago de Compostela, Santiago de Compostela (Spain); Sarmiento, F. [Dept. of Mathematics, Faculty of Informatics, Univ. of A Coruna, A Coruna (Spain)

    2004-02-01

    The application of a statistical method, the local polynomial regression method (LPRM), based on a nonparametric estimation of the regression function, to determine the critical micelle concentration (cmc) is presented. The method is extremely flexible because it does not impose any parametric model on the underlying structure of the data but rather allows the data to speak for themselves. Good concordance of cmc values with those obtained by other methods was found for systems in which the variation of a measured physical property with concentration showed an abrupt change. When this variation was slow, discrepancies between the values obtained by LPRM and other methods were found. (orig.)
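
The core of local polynomial regression can be illustrated with the degree-one (local linear) case. This is only a sketch of the general estimator, assuming a Gaussian kernel and a user-chosen bandwidth; the abstract does not specify the kernel, polynomial degree or bandwidth the authors used, nor how the cmc is read off the fitted curve, and the function name `local_linear` is hypothetical.

```python
import math


def local_linear(x, y, x0, bandwidth):
    """Local linear regression estimate of E[y | x = x0].

    Fits a straight line by kernel-weighted least squares around x0 and returns
    its intercept. Gaussian kernel and bandwidth are illustrative assumptions.
    """
    w = [math.exp(-0.5 * ((xi - x0) / bandwidth) ** 2) for xi in x]
    d = [xi - x0 for xi in x]
    s0 = sum(w)
    s1 = sum(wi * di for wi, di in zip(w, d))
    s2 = sum(wi * di * di for wi, di in zip(w, d))
    t0 = sum(wi * yi for wi, yi in zip(w, y))
    t1 = sum(wi * di * yi for wi, di, yi in zip(w, d, y))
    # Closed-form intercept of the weighted least-squares line centred at x0.
    return (s2 * t0 - s1 * t1) / (s0 * s2 - s1 * s1)


# Hypothetical data: a property that changes slope at concentration 5.
conc = [0.5 * i for i in range(21)]
prop = [c if c < 5 else 5 + 0.2 * (c - 5) for c in conc]
fitted = [local_linear(conc, prop, c, bandwidth=0.75) for c in conc]
print([round(v, 2) for v in fitted])
```

Because no global functional form is imposed, the fitted curve follows the change in slope wherever the data place it, which is the flexibility the abstract refers to.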

  9. On the use of permutation in and the performance of a class of nonparametric methods to detect differential gene expression.

    Science.gov (United States)

    Pan, Wei

    2003-07-22

    Recently a class of nonparametric statistical methods, including the empirical Bayes (EB) method, the significance analysis of microarray (SAM) method and the mixture model method (MMM), have been proposed to detect differential gene expression for replicated microarray experiments conducted under two conditions. All the methods depend on constructing a test statistic Z and a so-called null statistic z. The null statistic z is used to provide some reference distribution for Z such that statistical inference can be accomplished. A common way of constructing z is to apply Z to randomly permuted data. Here we point out that the distribution of z may not approximate the null distribution of Z well, leading to possibly too conservative inference. This observation may apply to other permutation-based nonparametric methods. We propose a new method of constructing a null statistic that aims to estimate the null distribution of a test statistic directly. Using simulated data and real data, we assess and compare the performance of the existing method and our new method when applied in EB, SAM and MMM. Some interesting findings on operating characteristics of EB, SAM and MMM are also reported. Finally, by combining the idea of SAM and MMM, we outline a simple nonparametric method based on the direct use of a test statistic and a null statistic.

  10. Nonparametric method for failures detection and localization in the actuating subsystem of aircraft control system

    Science.gov (United States)

    Karpenko, S. S.; Zybin, E. Yu; Kosyanchuk, V. V.

    2018-02-01

    In this paper we design a nonparametric method for failure detection and localization in the aircraft control system that uses the measurements of the control signals and the aircraft states only. It doesn’t require a priori information of the aircraft model parameters, training or statistical calculations, and is based on algebraic solvability conditions for the aircraft model identification problem. This makes it possible to significantly increase the efficiency of solving the detection and localization problem by completely eliminating errors associated with aircraft model uncertainties.

  11. Impulse response identification with deterministic inputs using non-parametric methods

    International Nuclear Information System (INIS)

    Bhargava, U.K.; Kashyap, R.L.; Goodman, D.M.

    1985-01-01

    This paper addresses the problem of impulse response identification using non-parametric methods. Although the techniques developed herein apply to the truncated, untruncated, and the circulant models, we focus on the truncated model which is useful in certain applications. Two methods of impulse response identification will be presented. The first is based on the minimization of the $C_L$ statistic, which is an estimate of the mean-square prediction error; the second is a Bayesian approach. For both of these methods, we consider the effects of using both the identity matrix and the Laplacian matrix as weights on the energy in the impulse response. In addition, we present a method for estimating the effective length of the impulse response. Estimating the length is particularly important in the truncated case. Finally, we develop a method for estimating the noise variance at the output. Often, prior information on the noise variance is not available, and a good estimate is crucial to the success of estimating the impulse response with a nonparametric technique.

  12. Analysis of small sample size studies using nonparametric bootstrap test with pooled resampling method.

    Science.gov (United States)

    Dwivedi, Alok Kumar; Mallawaarachchi, Indika; Alvarado, Luis A

    2017-06-30

    Experimental studies in biomedical research frequently pose analytical problems related to small sample size. In such studies, there are conflicting findings regarding the choice of parametric and nonparametric analysis, especially with non-normal data. In such instances, some methodologists questioned the validity of parametric tests and suggested nonparametric tests. In contrast, other methodologists found nonparametric tests to be too conservative and less powerful and thus preferred using parametric tests. Some researchers have recommended using a bootstrap test; however, this method also has limitations with small sample sizes. We used a pooled method in the nonparametric bootstrap test that may overcome the problems related to small samples in hypothesis testing. The present study compared the nonparametric bootstrap test with pooled resampling method against the corresponding parametric, nonparametric, and permutation tests through extensive simulations under various conditions and using real data examples. The nonparametric pooled bootstrap t-test provided equal or greater power for comparing two means as compared with the unpaired t-test, Welch t-test, Wilcoxon rank sum test, and permutation test while maintaining type I error probability for any conditions except for Cauchy and extreme variable lognormal distributions. In such cases, we suggest using an exact Wilcoxon rank sum test. The nonparametric bootstrap paired t-test also provided better performance than other alternatives. The nonparametric bootstrap test provided benefit over the exact Kruskal-Wallis test. We suggest using the nonparametric bootstrap test with pooled resampling method for comparing paired or unpaired means and for validating the one way analysis of variance test results for non-normal data in small sample size studies. Copyright © 2017 John Wiley & Sons, Ltd.
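
One plausible reading of "pooled resampling" for the unpaired two-sample case is sketched below: both samples are pooled, bootstrap samples of the original group sizes are drawn with replacement from the pool, and the observed t statistic is compared with the bootstrap distribution. This is an illustrative assumption, not the authors' published algorithm; the Welch-type statistic, the `(hits + 1) / (n_boot + 1)` p-value convention and all names are hypothetical choices.

```python
import math
import random
import statistics


def t_stat(a, b):
    """Welch-type two-sample t statistic (assumed form, for illustration)."""
    diff = statistics.fmean(a) - statistics.fmean(b)
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    if se == 0.0:
        # Degenerate resample: both groups constant.
        return 0.0 if diff == 0.0 else math.copysign(math.inf, diff)
    return diff / se


def pooled_bootstrap_pvalue(a, b, n_boot=2000, seed=0):
    """Two-sided bootstrap p-value with resampling from the pooled data."""
    rng = random.Random(seed)
    pooled = list(a) + list(b)
    t_obs = abs(t_stat(a, b))
    hits = 0
    for _ in range(n_boot):
        # Under the null hypothesis both groups come from the same pool.
        a_star = rng.choices(pooled, k=len(a))
        b_star = rng.choices(pooled, k=len(b))
        if abs(t_stat(a_star, b_star)) >= t_obs:
            hits += 1
    return (hits + 1) / (n_boot + 1)


# Hypothetical small samples.
group_a = [1.1, 0.9, 1.3, 1.0, 1.2]
group_b = [3.0, 3.2, 2.9, 3.1, 3.3]
print(pooled_bootstrap_pvalue(group_a, group_b))
```

Drawing from the pool rather than from each group separately gives every bootstrap sample more distinct values to choose from, which is the intuition behind using it with very small groups.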

  13. Theory of nonparametric tests

    CERN Document Server

    Dickhaus, Thorsten

    2018-01-01

    This textbook provides a self-contained presentation of the main concepts and methods of nonparametric statistical testing, with a particular focus on the theoretical foundations of goodness-of-fit tests, rank tests, resampling tests, and projection tests. The substitution principle is employed as a unified approach to the nonparametric test problems discussed. In addition to mathematical theory, it also includes numerous examples and computer implementations. The book is intended for advanced undergraduate, graduate, and postdoc students as well as young researchers. Readers should be familiar with the basic concepts of mathematical statistics typically covered in introductory statistics courses.

  14. The Use of Nonparametric Kernel Regression Methods in Econometric Production Analysis

    DEFF Research Database (Denmark)

    Czekaj, Tomasz Gerard

    and nonparametric estimations of production functions in order to evaluate the optimal firm size. The second paper discusses the use of parametric and nonparametric regression methods to estimate panel data regression models. The third paper analyses production risk, price uncertainty, and farmers' risk preferences...... within a nonparametric panel data regression framework. The fourth paper analyses the technical efficiency of dairy farms with environmental output using nonparametric kernel regression in a semiparametric stochastic frontier analysis. The results provided in this PhD thesis show that nonparametric......This PhD thesis addresses one of the fundamental problems in applied econometric analysis, namely the econometric estimation of regression functions. The conventional approach to regression analysis is the parametric approach, which requires the researcher to specify the form of the regression...

  15. Nonparametric Methods in Astronomy: Think, Regress, Observe—Pick Any Three

    Science.gov (United States)

    Steinhardt, Charles L.; Jermyn, Adam S.

    2018-02-01

    Telescopes are much more expensive than astronomers, so it is essential to minimize required sample sizes by using the most data-efficient statistical methods possible. However, the most commonly used model-independent techniques for finding the relationship between two variables in astronomy are flawed. In the worst case they can lead without warning to subtly yet catastrophically wrong results, and even in the best case they require more data than necessary. Unfortunately, there is no single best technique for nonparametric regression. Instead, we provide a guide for how astronomers can choose the best method for their specific problem and provide a python library with both wrappers for the most useful existing algorithms and implementations of two new algorithms developed here.

  16. Nonparametric identification of nonlinear dynamic systems using a synchronisation-based method

    Science.gov (United States)

    Kenderi, Gábor; Fidlin, Alexander

    2014-12-01

    The present study proposes an identification method for highly nonlinear mechanical systems that does not require a priori knowledge of the underlying nonlinearities to reconstruct arbitrary restoring force surfaces between degrees of freedom. This approach is based on the master-slave synchronisation between a dynamic model of the system as the slave and the real system as the master using measurements of the latter. As the model synchronises to the measurements, it becomes an observer of the real system. The optimal observer algorithm in a least-squares sense is given by the Kalman filter. Using the well-known state augmentation technique, the Kalman filter can be turned into a dual state and parameter estimator to identify parameters of a priori characterised nonlinearities. The paper proposes an extension of this technique towards nonparametric identification. A general system model is introduced by describing the restoring forces as bilateral spring-dampers with time-variant coefficients, which are estimated as augmented states. The estimation procedure is followed by an a posteriori statistical analysis to reconstruct noise-free restoring force characteristics using the estimated states and their estimated variances. Observability is provided using only one measured mechanical quantity per degree of freedom, which makes this approach less demanding in the number of necessary measurement signals compared with truly nonparametric solutions, which typically require displacement, velocity and acceleration signals. Additionally, due to the statistical rigour of the procedure, it successfully addresses signals corrupted by significant measurement noise. In the present paper, the method is described in detail, which is followed by numerical examples of one degree of freedom (1DoF) and 2DoF mechanical systems with strong nonlinearities of vibro-impact type to demonstrate the effectiveness of the proposed technique.

  17. Transition redshift: new constraints from parametric and nonparametric methods

    Energy Technology Data Exchange (ETDEWEB)

    Rani, Nisha; Mahajan, Shobhit; Mukherjee, Amitabha [Department of Physics and Astrophysics, University of Delhi, New Delhi 110007 (India); Jain, Deepak [Deen Dayal Upadhyaya College, University of Delhi, New Delhi 110015 (India); Pires, Nilza, E-mail: nrani@physics.du.ac.in, E-mail: djain@ddu.du.ac.in, E-mail: shobhit.mahajan@gmail.com, E-mail: amimukh@gmail.com, E-mail: npires@dfte.ufrn.br [Departamento de Física Teórica e Experimental, UFRN, Campus Universitário, Natal, RN 59072-970 (Brazil)

    2015-12-01

    In this paper, we use the cosmokinematics approach to study the accelerated expansion of the Universe. This is a model independent approach and depends only on the assumption that the Universe is homogeneous and isotropic and is described by the FRW metric. We parametrize the deceleration parameter, $q(z)$, to constrain the transition redshift ($z_t$) at which the expansion of the Universe goes from a decelerating to an accelerating phase. We use three different parametrizations of $q(z)$, namely $q_{I}(z) = q_1 + q_2 z$, $q_{II}(z) = q_3 + q_4 \ln(1+z)$ and $q_{III}(z) = \tfrac{1}{2} + q_5/(1+z)^2$. A joint analysis of the age of galaxies, strong lensing and supernovae Ia data indicates that the transition redshift is less than unity, i.e. $z_t < 1$. We also use a nonparametric approach (LOESS+SIMEX) to constrain $z_t$. This too gives $z_t < 1$, which is consistent with the value obtained by the parametric approach.

  18. Statistical analysis of water-quality data containing multiple detection limits II: S-language software for nonparametric distribution modeling and hypothesis testing

    Science.gov (United States)

    Lee, L.; Helsel, D.

    2007-01-01

    Analysis of low concentrations of trace contaminants in environmental media often results in left-censored data that are below some limit of analytical precision. Interpretation of values becomes complicated when there are multiple detection limits in the data, perhaps as a result of changing analytical precision over time. Parametric and semi-parametric methods, such as maximum likelihood estimation and robust regression on order statistics, can be employed to model distributions of multiply censored data and provide estimates of summary statistics. However, these methods are based on assumptions about the underlying distribution of data. Nonparametric methods provide an alternative that does not require such assumptions. A standard nonparametric method for estimating summary statistics of multiply-censored data is the Kaplan-Meier (K-M) method. This method has seen widespread usage in the medical sciences within a general framework termed "survival analysis" where it is employed with right-censored time-to-failure data. However, K-M methods are equally valid for the left-censored data common in the geosciences. Our S-language software provides an analytical framework based on K-M methods that is tailored to the needs of the earth and environmental sciences community. This includes routines for the generation of empirical cumulative distribution functions, prediction or exceedance probabilities, and related confidence limits computation. Additionally, our software contains K-M-based routines for nonparametric hypothesis testing among an unlimited number of grouping variables. A primary characteristic of K-M methods is that they do not perform extrapolation and interpolation. Thus, these routines cannot be used to model statistics beyond the observed data range or when linear interpolation is desired. For such applications, the aforementioned parametric and semi-parametric methods must be used.

  19. Performances of non-parametric statistics in sensitivity analysis and parameter ranking

    International Nuclear Information System (INIS)

    Saltelli, A.

    1987-01-01

    Twelve parametric and non-parametric sensitivity analysis techniques are compared in the case of non-linear model responses. The test models used are taken from the long-term risk analysis for the disposal of high level radioactive waste in a geological formation. They describe the transport of radionuclides through a set of engineered and natural barriers from the repository to the biosphere and to man. The output data from these models are the dose rates affecting the maximum exposed individual of a critical group at a given point in time. All the techniques are applied to the output from the same Monte Carlo simulations, where a modified version of the Latin Hypercube method is used for the sample selection. Hypothesis testing is systematically applied to quantify the degree of confidence in the results given by the various sensitivity estimators. The estimators are ranked according to their robustness and stability, on the basis of two test cases. The conclusions are that no estimator can be considered the best from all points of view, and that more than one estimator should be used in sensitivity analysis.

  20. Speeding Up Non-Parametric Bootstrap Computations for Statistics Based on Sample Moments in Small/Moderate Sample Size Applications.

    Directory of Open Access Journals (Sweden)

    Elias Chaibub Neto

    In this paper we propose a vectorized implementation of the non-parametric bootstrap for statistics based on sample moments. Basically, we adopt the multinomial sampling formulation of the non-parametric bootstrap, and compute bootstrap replications of sample moment statistics by simply weighting the observed data according to multinomial counts instead of evaluating the statistic on a resampled version of the observed data. Using this formulation we can generate a matrix of bootstrap weights and compute the entire vector of bootstrap replications with a few matrix multiplications. Vectorization is particularly important for matrix-oriented programming languages such as R, where matrix/vector calculations tend to be faster than scalar operations implemented in a loop. We illustrate the application of the vectorized implementation in real and simulated data sets, when bootstrapping Pearson's sample correlation coefficient, and compared its performance against two state-of-the-art R implementations of the non-parametric bootstrap, as well as a straightforward one based on a for loop. Our investigations spanned varying sample sizes and number of bootstrap replications. The vectorized bootstrap compared favorably against the state-of-the-art implementations in all cases tested, and was remarkably/considerably faster for small/moderate sample sizes. The same results were observed in the comparison with the straightforward implementation, except for large sample sizes, where the vectorized bootstrap was slightly slower than the straightforward implementation due to increased time expenditures in the generation of weight matrices via multinomial sampling.
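
The multinomial-weight formulation described in this abstract can be sketched for Pearson's correlation as follows. The paper's own implementation is in R; this NumPy version is an illustrative translation under that assumption, and the function name `vectorized_boot_corr` and the simulated data are hypothetical.

```python
import numpy as np


def vectorized_boot_corr(x, y, n_boot, rng):
    """Bootstrap replications of Pearson's r via multinomial weights.

    Each row of `counts` says how many times every observation appears in one
    bootstrap sample, so weighted moments replace explicit resampling.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    counts = rng.multinomial(n, np.full(n, 1.0 / n), size=n_boot)  # (n_boot, n)
    w = counts / n
    # Weighted sample moments for all replications at once.
    mx, my = w @ x, w @ y
    mxx, myy, mxy = w @ (x * x), w @ (y * y), w @ (x * y)
    r = (mxy - mx * my) / np.sqrt((mxx - mx**2) * (myy - my**2))
    return r, counts


rng = np.random.default_rng(0)
x = rng.normal(size=30)
y = x + rng.normal(size=30)
r_boot, counts = vectorized_boot_corr(x, y, n_boot=1000, rng=rng)
print(r_boot.mean(), r_boot.std())
```

Each replication computed this way equals the correlation evaluated on the corresponding explicitly resampled data, since the weighted moments are exactly the moments of that resample.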

  1. Robust variable selection method for nonparametric differential equation models with application to nonlinear dynamic gene regulatory network analysis.

    Science.gov (United States)

    Lu, Tao

    2016-01-01

    Gene regulation network (GRN) analysis evaluates the interactions between genes and looks for models to describe the gene expression behavior. These models have many applications; for instance, by characterizing the gene expression mechanisms that cause certain disorders, it would be possible to target those genes to block the progress of the disease. Many biological processes are driven by nonlinear dynamic GRN. In this article, we propose a nonparametric differential equation (ODE) to model the nonlinear dynamic GRN. Specifically, we address the following questions simultaneously: (i) extract information from noisy time course gene expression data; (ii) model the nonlinear ODE through a nonparametric smoothing function; (iii) identify the important regulatory gene(s) through a group smoothly clipped absolute deviation (SCAD) approach; (iv) test the robustness of the model against possible shortening of experimental duration. We illustrate the usefulness of the model and associated statistical methods through a simulation and a real application example.

  2. Parametric and non-parametric approach for sensory RATA (Rate-All-That-Apply) method of ledre profile attributes

    Science.gov (United States)

    Hastuti, S.; Harijono; Murtini, E. S.; Fibrianto, K.

    2018-03-01

    This study investigates the use of parametric and non-parametric approaches for the sensory RATA (Rate-All-That-Apply) method. Ledre, a local food product unique to Bojonegoro, was used as the point of interest, and 319 panelists were involved in the study. The results showed that ledre is characterized by an easily crushed texture, stickiness in the mouth, a stingy sensation and ease of swallowing. It also has a strong banana flavour and a brown colour. Compared to eggroll and semprong, ledre has more variance in terms of taste as well as roll length. As the RATA questionnaire is designed to collect categorical data, a non-parametric approach is the common statistical procedure. However, similar results were also obtained with the parametric approach, despite the data being non-normally distributed. This suggests that the parametric approach can be applicable for consumer studies with a large number of respondents, even though the data may not satisfy the assumptions of ANOVA (Analysis of Variance).

  3. Experimental Sentinel-2 LAI estimation using parametric, non-parametric and physical retrieval methods - A comparison

    NARCIS (Netherlands)

    Verrelst, Jochem; Rivera, Juan Pablo; Veroustraete, Frank; Muñoz-Marí, Jordi; Clevers, J.G.P.W.; Camps-Valls, Gustau; Moreno, José

    2015-01-01

    Given the forthcoming availability of Sentinel-2 (S2) images, this paper provides a systematic comparison of retrieval accuracy and processing speed of a multitude of parametric, non-parametric and physically-based retrieval methods using simulated S2 data. An experimental field dataset (SPARC),

  4. Statistical trend analysis methods for temporal phenomena

    International Nuclear Information System (INIS)

    Lehtinen, E.; Pulkkinen, U.; Poern, K.

    1997-04-01

    We consider point events occurring in a random way in time. In many applications the pattern of occurrence is of intrinsic interest as indicating a trend or some other systematic feature in the rate of occurrence. The purpose of this report is to survey briefly different statistical trend analysis methods and illustrate their applicability to temporal phenomena in particular. The trend testing of point events is usually seen as the testing of the hypotheses concerning the intensity of the occurrence of events. When the intensity function is parametrized, the testing of trend is a typical parametric testing problem. In industrial applications the operational experience generally does not suggest any specified model and method in advance. Therefore, and particularly, if the Poisson process assumption is very questionable, it is desirable to apply tests that are valid for a wide variety of possible processes. The alternative approach for trend testing is to use some non-parametric procedure. In this report we have presented four non-parametric tests: The Cox-Stuart test, the Wilcoxon signed ranks test, the Mann test, and the exponential ordered scores test. In addition to the classical parametric and non-parametric approaches we have also considered the Bayesian trend analysis. First we discuss a Bayesian model, which is based on a power law intensity model. The Bayesian statistical inferences are based on the analysis of the posterior distribution of the trend parameters, and the probability of trend is immediately seen from these distributions. We applied some of the methods discussed in an example case. It should be noted, that this report is a feasibility study rather than a scientific evaluation of statistical methods, and the examples can only be seen as demonstrations of the methods
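
    Of the four non-parametric tests named in this report, the Cox-Stuart test is the simplest to write down. The following is a minimal sketch under stated assumptions, not the report's own code: the function name is hypothetical, and the input is taken to be a time-ordered sequence of values (for example, successive times between events). The first half of the series is paired with the second half, and the number of increases is compared with a Binomial(m, 1/2) distribution.

```python
from math import comb


def cox_stuart(values):
    """Cox-Stuart sign test for trend on a time-ordered sequence.

    Returns (number of increases, number of informative pairs, two-sided p-value).
    """
    n = len(values)
    half = n // 2
    first = values[:half]
    second = values[n - half:]  # for odd n the middle observation is dropped
    # Tied pairs carry no information about direction and are discarded.
    diffs = [b - a for a, b in zip(first, second) if b != a]
    m = len(diffs)
    increases = sum(1 for d in diffs if d > 0)
    if m == 0:
        return increases, m, 1.0
    k = min(increases, m - increases)
    tail = sum(comb(m, i) for i in range(k + 1)) / 2 ** m
    return increases, m, min(1.0, 2 * tail)
```

    For a strictly increasing series of ten values, all five pairs are increases and the two-sided p-value is 2/32; with so few pairs the test cannot reach the 5% level, which is one reason the report treats its examples as demonstrations only.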

  5. Statistical trend analysis methods for temporal phenomena

    Energy Technology Data Exchange (ETDEWEB)

    Lehtinen, E.; Pulkkinen, U. [VTT Automation, (Finland); Poern, K. [Poern Consulting, Nykoeping (Sweden)

    1997-04-01

    We consider point events occurring in a random way in time. In many applications the pattern of occurrence is of intrinsic interest as indicating a trend or some other systematic feature in the rate of occurrence. The purpose of this report is to survey briefly different statistical trend analysis methods and illustrate their applicability to temporal phenomena in particular. The trend testing of point events is usually seen as the testing of the hypotheses concerning the intensity of the occurrence of events. When the intensity function is parametrized, the testing of trend is a typical parametric testing problem. In industrial applications the operational experience generally does not suggest any specified model and method in advance. Therefore, and particularly, if the Poisson process assumption is very questionable, it is desirable to apply tests that are valid for a wide variety of possible processes. The alternative approach for trend testing is to use some non-parametric procedure. In this report we have presented four non-parametric tests: The Cox-Stuart test, the Wilcoxon signed ranks test, the Mann test, and the exponential ordered scores test. In addition to the classical parametric and non-parametric approaches we have also considered the Bayesian trend analysis. First we discuss a Bayesian model, which is based on a power law intensity model. The Bayesian statistical inferences are based on the analysis of the posterior distribution of the trend parameters, and the probability of trend is immediately seen from these distributions. We applied some of the methods discussed in an example case. It should be noted, that this report is a feasibility study rather than a scientific evaluation of statistical methods, and the examples can only be seen as demonstrations of the methods. 14 refs, 10 figs.

  6. Methods of statistical physics

    CERN Document Server

    Akhiezer, Aleksandr I

    1981-01-01

    Methods of Statistical Physics is an exposition of the tools of statistical mechanics, which evaluates the kinetic equations of classical and quantized systems. The book also analyzes the equations of macroscopic physics, such as the equations of hydrodynamics for normal and superfluid liquids and macroscopic electrodynamics. The text gives particular attention to the study of quantum systems. This study begins with a discussion of problems of quantum statistics with a detailed description of the basics of quantum mechanics along with the theory of measurement. An analysis of the asymptotic be

  7. Using continuous time stochastic modelling and nonparametric statistics to improve the quality of first principles models

    DEFF Research Database (Denmark)

    A methodology is presented that combines modelling based on first principles and data based modelling into a modelling cycle that facilitates fast decision-making based on statistical methods. A strong feature of this methodology is that given a first principles model along with process data......, the corresponding modelling cycle model of the given system for a given purpose. A computer-aided tool, which integrates the elements of the modelling cycle, is also presented, and an example is given of modelling a fed-batch bioreactor....

  8. [Do we always correctly interpret the results of statistical nonparametric tests].

    Science.gov (United States)

    Moczko, Jerzy A

    2014-01-01

    The Mann-Whitney, Wilcoxon, Kruskal-Wallis and Friedman tests form a group of tests commonly used to analyze clinical and laboratory data. These tests are considered to be extremely flexible and their asymptotic relative efficiency exceeds 95 percent. Unlike the corresponding parametric tests, they do not require checking conditions such as normality of the data distribution, homogeneity of variance, the lack of correlation between means and standard deviations, etc. They can be used on both interval and ordinal scales. The article presents an example, based on the Mann-Whitney test, showing that choosing one of these four nonparametric tests as a kind of gold standard does not in every case lead to correct inference.
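
    For readers unfamiliar with the Mann-Whitney statistic discussed in this abstract, a minimal sketch of how U is computed may help. The function name is hypothetical and the pairwise-counting form is chosen for clarity, not speed; it is not taken from the article.

```python
def mann_whitney_u(x, y):
    """Return (U_x, U_y): U_x counts pairs with x > y, ties counted as one half."""
    u_x = 0.0
    for a in x:
        for b in y:
            if a > b:
                u_x += 1.0
            elif a == b:
                u_x += 0.5
    u_y = len(x) * len(y) - u_x
    return u_x, u_y
```

    Dividing U_x by len(x) * len(y) estimates the probability that a value from the first sample exceeds a value from the second (counting ties as one half). The statistic therefore compares the two samples through this pairwise ordering, which is worth keeping in mind when interpreting a significant result.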

  9. Hadron Energy Reconstruction for ATLAS Barrel Combined Calorimeter Using Non-Parametrical Method

    CERN Document Server

    Kulchitskii, Yu A

    2000-01-01

    Hadron energy reconstruction for the ATLAS barrel prototype combined calorimeter in the framework of the non-parametrical method is discussed. The non-parametrical method utilizes only the known e/h ratios and the electron calibration constants and does not require the determination of any parameters by a minimization technique. Thus, this technique lends itself to fast energy reconstruction in a first level trigger. The reconstructed mean values of the hadron energies are within ±1% of the true values and the fractional energy resolution is [(58 ± 3)% √GeV/√E + (2.5 ± 0.3)%] ⊕ (1.7 ± 0.2) GeV/E. The value of the e/h ratio obtained for the electromagnetic compartment of the combined calorimeter is 1.74 ± 0.04. Results of a study of the longitudinal hadronic shower development are also presented.
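
    The resolution formula quoted in this abstract can be evaluated numerically. The sketch below uses the central values from the abstract (58% √GeV, 2.5%, 1.7 GeV) and ignores their uncertainties; it assumes the usual calorimetry convention that the ⊕ symbol denotes a sum in quadrature. The function name is hypothetical.

```python
import math


def fractional_resolution(energy_gev, a=0.58, b=0.025, c=1.7):
    """sigma/E = [a / sqrt(E) + b] (+) c / E, where (+) is a sum in quadrature.

    a is in sqrt(GeV), b is dimensionless, c is in GeV; energy_gev must be positive.
    """
    if energy_gev <= 0:
        raise ValueError("energy must be positive")
    stochastic_plus_constant = a / math.sqrt(energy_gev) + b
    noise = c / energy_gev
    return math.hypot(stochastic_plus_constant, noise)
```

    At E = 100 GeV the bracketed term is 0.058 + 0.025 = 0.083 and the last term is 0.017, so the combined fractional resolution is about 8.5%.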

  10. The Kernel Mixture Network: A Nonparametric Method for Conditional Density Estimation of Continuous Random Variables

    OpenAIRE

    Ambrogioni, Luca; Güçlü, Umut; van Gerven, Marcel A. J.; Maris, Eric

    2017-01-01

    This paper introduces the kernel mixture network, a new method for nonparametric estimation of conditional probability densities using neural networks. We model arbitrarily complex conditional densities as linear combinations of a family of kernel functions centered at a subset of training points. The weights are determined by the outer layer of a deep neural network, trained by minimizing the negative log likelihood. This generalizes the popular quantized softmax approach, which can be seen ...

  11. Bootstrapping the economy -- a non-parametric method of generating consistent future scenarios

    OpenAIRE

    Müller, Ulrich A; Bürgi, Roland; Dacorogna, Michel M

    2004-01-01

    The fortune and the risk of a business venture depends on the future course of the economy. There is a strong demand for economic forecasts and scenarios that can be applied to planning and modeling. While there is an ongoing debate on modeling economic scenarios, the bootstrapping (or resampling) approach presented here has several advantages. As a non-parametric method, it directly relies on past market behaviors rather than debatable assumptions on models and parameters. Simultaneous dep...

  12. Statistical Methods for Environmental Pollution Monitoring

    Energy Technology Data Exchange (ETDEWEB)

    Gilbert, Richard O. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States)

    1987-01-01

    The application of statistics to environmental pollution monitoring studies requires a knowledge of statistical analysis methods particularly well suited to pollution data. This book fills that need by providing sampling plans, statistical tests, parameter estimation procedures, and references to pertinent publications. Most of the statistical techniques are relatively simple, and examples, exercises, and case studies are provided to illustrate procedures. The book is logically divided into three parts. Chapters 1, 2, and 3 are introductory chapters. Chapters 4 through 10 discuss field sampling designs and Chapters 11 through 18 deal with a broad range of statistical analysis procedures. Some statistical techniques given here are not commonly seen in statistics books. For example, see methods for handling correlated data (Sections 4.5 and 11.12), for detecting hot spots (Chapter 10), and for estimating a confidence interval for the mean of a lognormal distribution (Section 13.2). Also, Appendix B lists a computer code that estimates and tests for trends over time at one or more monitoring stations using nonparametric methods (Chapters 16 and 17). Unfortunately, some important topics could not be included because of their complexity and the need to limit the length of the book. For example, only brief mention could be made of time series analysis using Box-Jenkins methods and of kriging techniques for estimating spatial and spatial-time patterns of pollution, although multiple references on these topics are provided. Also, no discussion of methods for assessing risks from environmental pollution could be included.

  13. Nonparametric Information Geometry: From Divergence Function to Referential-Representational Biduality on Statistical Manifolds

    Directory of Open Access Journals (Sweden)

    Jun Zhang

    2013-12-01

    Divergence functions are the non-symmetric “distance” on the manifold, Μθ, of parametric probability density functions over a measure space, (Χ,μ). Classical information geometry prescribes, on Μθ: (i) a Riemannian metric given by the Fisher information; (ii) a pair of dual connections (giving rise to the family of α-connections) that preserve the metric under parallel transport by their joint actions; and (iii) a family of divergence functions (α-divergence) defined on Μθ × Μθ, which induce the metric and the dual connections. Here, we construct an extension of this differential geometric structure from Μθ (that of parametric probability density functions) to the manifold, Μ, of non-parametric functions on Χ, removing the positivity and normalization constraints. The generalized Fisher information and α-connections on Μ are induced by an α-parameterized family of divergence functions, reflecting the fundamental convex inequality associated with any smooth and strictly convex function. The infinite-dimensional manifold, Μ, has zero curvature for all these α-connections; hence, the generally non-zero curvature of Μθ can be interpreted as arising from an embedding of Μθ into Μ. Furthermore, when a parametric model (after a monotonic scaling) forms an affine submanifold, its natural and expectation parameters form biorthogonal coordinates, and such a submanifold is dually flat for α = ± 1, generalizing the results of Amari’s α-embedding. The present analysis illuminates two different types of duality in information geometry, one concerning the referential status of a point (measurable function) expressed in the divergence function (“referential duality”) and the other concerning its representation under an arbitrary monotone scaling (“representational duality”).

  14. Tremor Detection Using Parametric and Non-Parametric Spectral Estimation Methods: A Comparison with Clinical Assessment

    Science.gov (United States)

    Martinez Manzanera, Octavio; Elting, Jan Willem; van der Hoeven, Johannes H.; Maurits, Natasha M.

    2016-01-01

    In the clinic, tremor is diagnosed during a time-limited process in which patients are observed and the characteristics of tremor are visually assessed. For some tremor disorders, a more detailed analysis of these characteristics is needed. Accelerometry and electromyography can be used to obtain a better insight into tremor. Typically, routine clinical assessment of accelerometry and electromyography data involves visual inspection by clinicians and occasionally computational analysis to obtain objective characteristics of tremor. However, for some tremor disorders these characteristics may be different during daily activity. This variability in presentation between the clinic and daily life makes a differential diagnosis more difficult. A long-term recording of tremor by accelerometry and/or electromyography in the home environment could help to give a better insight into the tremor disorder. However, an evaluation of such recordings using routine clinical standards would take too much time. We evaluated a range of techniques that automatically detect tremor segments in accelerometer data, as accelerometer data is more easily obtained in the home environment than electromyography data. Time can be saved if clinicians only have to evaluate the tremor characteristics of segments that have been automatically detected in longer daily activity recordings. We tested four non-parametric methods and five parametric methods on clinical accelerometer data from 14 patients with different tremor disorders. The consensus between two clinicians regarding the presence or absence of tremor on 3943 segments of accelerometer data was employed as reference. The nine methods were tested against this reference to identify their optimal parameters. Non-parametric methods generally performed better than parametric methods on our dataset when optimal parameters were used. However, one parametric method, employing the high frequency content of the tremor bandwidth under consideration

  15. Inferential, non-parametric statistics to assess the quality of probabilistic forecast systems

    NARCIS (Netherlands)

    Maia, A.H.N.; Meinke, H.B.; Lennox, S.; Stone, R.C.

    2007-01-01

    Many statistical forecast systems are available to interested users. To be useful for decision making, these systems must be based on evidence of underlying mechanisms. Once causal connections between the mechanism and its statistical manifestation have been firmly established, the forecasts must

  16. Post-fire debris flow prediction in Western United States: Advancements based on a nonparametric statistical technique

    Science.gov (United States)

    Nikolopoulos, E. I.; Destro, E.; Bhuiyan, M. A. E.; Borga, M., Sr.; Anagnostou, E. N.

    2017-12-01

    Fire disasters affect modern societies at global scale, inducing significant economic losses and human casualties. In addition to their direct impacts, they have various adverse effects on the hydrologic and geomorphologic processes of a region due to the tremendous alteration of the landscape characteristics (vegetation, soil properties etc). As a consequence, wildfires often initiate a cascade of hazards such as flash floods and debris flows that usually follow the occurrence of a wildfire, thus magnifying the overall impact in a region. Post-fire debris flows (PFDF) are one such type of hazard, frequently occurring in the Western United States, where wildfires are a common natural disaster. Prediction of PFDF is therefore of high importance in this region, and over the last years a number of efforts from the United States Geological Survey (USGS) and the National Weather Service (NWS) have focused on the development of early warning systems that will help mitigate PFDF risk. This work proposes a prediction framework based on a nonparametric statistical technique (random forests) that allows predicting the occurrence of PFDF at regional scale with a higher degree of accuracy than the commonly used approaches based on power-law thresholds and logistic regression procedures. The work presented is based on a recently released database from USGS that reports a total of 1500 storms that triggered and did not trigger PFDF in a number of fire-affected catchments in the Western United States. The database includes information on storm characteristics (duration, accumulation, max intensity etc) and other auxiliary information on land surface properties (soil erodibility index, local slope etc). Results show that the proposed model is able to achieve a satisfactory prediction accuracy (threat score > 0.6), superior to that of previously published prediction frameworks, highlighting the potential of nonparametric statistical techniques for the development of PFDF prediction systems.
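
    The threat score used as the accuracy measure in this abstract is the standard critical success index: hits divided by the sum of hits, misses and false alarms. A minimal sketch follows; the function names and the example labels are hypothetical and are not taken from the USGS database.

```python
def threat_score(hits, misses, false_alarms):
    """Threat score (critical success index): hits / (hits + misses + false alarms)."""
    total = hits + misses + false_alarms
    if total == 0:
        raise ValueError("no observed or predicted events")
    return hits / total


def threat_score_from_labels(observed, predicted):
    """Compute the threat score from two equal-length sequences of booleans."""
    hits = sum(1 for o, p in zip(observed, predicted) if o and p)
    misses = sum(1 for o, p in zip(observed, predicted) if o and not p)
    false_alarms = sum(1 for o, p in zip(observed, predicted) if p and not o)
    return threat_score(hits, misses, false_alarms)
```

    Correct predictions of non-events do not enter the score, which makes it a reasonable choice when debris-flow-triggering storms are much rarer than non-triggering ones.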

  17. Longitudinal data analysis a handbook of modern statistical methods

    CERN Document Server

    Fitzmaurice, Garrett; Verbeke, Geert; Molenberghs, Geert

    2008-01-01

    Although many books currently available describe statistical models and methods for analyzing longitudinal data, they do not highlight connections between various research threads in the statistical literature. Responding to this void, Longitudinal Data Analysis provides a clear, comprehensive, and unified overview of state-of-the-art theory and applications. It also focuses on the assorted challenges that arise in analyzing longitudinal data. After discussing historical aspects, leading researchers explore four broad themes: parametric modeling, nonparametric and semiparametric methods, joint

  18. Incorporating Nonparametric Statistics into Delphi Studies in Library and Information Science

    Science.gov (United States)

    Ju, Boryung; Jin, Tao

    2013-01-01

    Introduction: The Delphi technique is widely used in library and information science research. However, many researchers in the field fail to employ standard statistical tests when using this technique. This makes the technique vulnerable to criticisms of its reliability and validity. The general goal of this article is to explore how…

  19. Nonparametric and group-based person-fit statistics : a validity study and an empirical example

    NARCIS (Netherlands)

    Meijer, R.R.

    1994-01-01

    In person-fit analysis, the object is to investigate whether an item score pattern is improbable given the item score patterns of the other persons in the group or given what is expected on the basis of a test model. In this study, several existing group-based statistics to detect such improbable

  20. Statistical methods for forecasting

    CERN Document Server

    Abraham, Bovas

    2009-01-01

    The Wiley-Interscience Paperback Series consists of selected books that have been made more accessible to consumers in an effort to increase global appeal and general circulation. With these new unabridged softcover volumes, Wiley hopes to extend the lives of these works by making them available to future generations of statisticians, mathematicians, and scientists. "This book, it must be said, lives up to the words on its advertising cover: 'Bridging the gap between introductory, descriptive approaches and highly advanced theoretical treatises, it provides a practical, intermediate level discussion of a variety of forecasting tools, and explains how they relate to one another, both in theory and practice.' It does just that!" -Journal of the Royal Statistical Society "A well-written work that deals with statistical methods and models that can be used to produce short-term forecasts, this book has wide-ranging applications. It could be used in the context of a study of regression, forecasting, and time series ...

  1. Comparison of Parametric and Nonparametric Methods for Analyzing the Bias of a Numerical Model

    Directory of Open Access Journals (Sweden)

    Isaac Mugume

    2016-01-01

    Numerical models are presently applied in many fields for simulation and prediction, operation, or research. The output from these models normally has both systematic and random errors. The study compared January 2015 temperature data for Uganda as simulated using the Weather Research and Forecast model with actual observed station temperature data to analyze the bias using parametric (the root mean square error (RMSE), the mean absolute error (MAE), mean error (ME), skewness, and the bias easy estimate (BES)) and nonparametric (the sign test method, STM) methods. The RMSE normally overestimates the error compared to MAE. The RMSE and MAE are not sensitive to direction of bias. The ME gives both direction and magnitude of bias but can be distorted by extreme values while the BES is insensitive to extreme values. The STM is robust for giving the direction of bias; it is not sensitive to extreme values but it does not give the magnitude of bias. The graphical tools (such as time series and cumulative curves) show the performance of the model with time. It is recommended to integrate parametric and nonparametric methods along with graphical methods for a comprehensive analysis of bias of a numerical model.
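
    Several of the measures compared in this abstract are short enough to state in code. The sketch below covers ME, MAE, RMSE and a two-sided sign test; the function names and the sample temperatures are hypothetical, and the BES is left out because the abstract does not define it.

```python
from math import comb, sqrt


def bias_summary(simulated, observed):
    """Return (ME, MAE, RMSE) of the errors simulated - observed."""
    errors = [s - o for s, o in zip(simulated, observed)]
    n = len(errors)
    me = sum(errors) / n                        # signed: gives direction and magnitude
    mae = sum(abs(e) for e in errors) / n       # unsigned: magnitude only
    rmse = sqrt(sum(e * e for e in errors) / n) # unsigned, weights large errors more
    return me, mae, rmse


def sign_test(simulated, observed):
    """Two-sided sign test on the direction of simulated - observed.

    Returns (overestimates, underestimates, p-value); exact ties are discarded.
    """
    over = sum(1 for s, o in zip(simulated, observed) if s > o)
    under = sum(1 for s, o in zip(simulated, observed) if s < o)
    m = over + under
    if m == 0:
        return over, under, 1.0
    k = min(over, under)
    tail = sum(comb(m, i) for i in range(k + 1)) / 2 ** m
    return over, under, min(1.0, 2 * tail)
```

    With simulated values [21, 23, 25, 30] and observed values [20, 22, 24, 26], the errors are 1, 1, 1 and 4: ME and MAE are both 1.75 while RMSE is about 2.18, showing how a single large error raises RMSE above MAE. The sign test uses only the signs of the errors, so it is unaffected by the size of that one large error.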

  2. Hadron energy reconstruction for the ATLAS calorimetry in the framework of the nonparametrical method

    CERN Document Server

    Akhmadaliev, S Z; Ambrosini, G; Amorim, A; Anderson, K; Andrieux, M L; Aubert, Bernard; Augé, E; Badaud, F; Baisin, L; Barreiro, F; Battistoni, G; Bazan, A; Bazizi, K; Belymam, A; Benchekroun, D; Berglund, S R; Berset, J C; Blanchot, G; Bogush, A A; Bohm, C; Boldea, V; Bonivento, W; Bosman, M; Bouhemaid, N; Breton, D; Brette, P; Bromberg, C; Budagov, Yu A; Burdin, S V; Calôba, L P; Camarena, F; Camin, D V; Canton, B; Caprini, M; Carvalho, J; Casado, M P; Castillo, M V; Cavalli, D; Cavalli-Sforza, M; Cavasinni, V; Chadelas, R; Chalifour, M; Chekhtman, A; Chevalley, J L; Chirikov-Zorin, I E; Chlachidze, G; Citterio, M; Cleland, W E; Clément, C; Cobal, M; Cogswell, F; Colas, Jacques; Collot, J; Cologna, S; Constantinescu, S; Costa, G; Costanzo, D; Crouau, M; Daudon, F; David, J; David, M; Davidek, T; Dawson, J; De, K; de La Taille, C; Del Peso, J; Del Prete, T; de Saintignon, P; Di Girolamo, B; Dinkespiler, B; Dita, S; Dodd, J; Dolejsi, J; Dolezal, Z; Downing, R; Dugne, J J; Dzahini, D; Efthymiopoulos, I; Errede, D; Errede, S; Evans, H; Eynard, G; Fassi, F; Fassnacht, P; Ferrari, A; Ferrer, A; Flaminio, Vincenzo; Fournier, D; Fumagalli, G; Gallas, E; Gaspar, M; Giakoumopoulou, V; Gianotti, F; Gildemeister, O; Giokaris, N; Glagolev, V; Glebov, V Yu; Gomes, A; González, V; González de la Hoz, S; Grabskii, V; Graugès-Pous, E; Grenier, P; Hakopian, H H; Haney, M; Hébrard, C; Henriques, A; Hervás, L; Higón, E; Holmgren, Sven Olof; Hostachy, J Y; Hoummada, A; Huston, J; Imbault, D; Ivanyushenkov, Yu M; Jézéquel, S; Johansson, E K; Jon-And, K; Jones, R; Juste, A; Kakurin, S; Karyukhin, A N; Khokhlov, Yu A; Khubua, J I; Klioukhine, V I; Kolachev, G M; Kopikov, S V; Kostrikov, M E; Kozlov, V; Krivkova, P; Kukhtin, V V; Kulagin, M; Kulchitskii, Yu A; Kuzmin, M V; Labarga, L; Laborie, G; Lacour, D; Laforge, B; Lami, S; Lapin, V; Le Dortz, O; Lefebvre, M; Le Flour, T; Leitner, R; Leltchouk, M; Li, J; Liablin, M V; Linossier, O; Lissauer, D; Lobkowicz, F; Lokajícek, M; 
Lomakin, Yu F; López-Amengual, J M; Lund-Jensen, B; Maio, A; Makowiecki, D S; Malyukov, S N; Mandelli, L; Mansoulié, B; Mapelli, Livio P; Marin, C P; Marrocchesi, P S; Marroquim, F; Martin, P; Maslennikov, A L; Massol, N; Mataix, L; Mazzanti, M; Mazzoni, E; Merritt, F S; Michel, B; Miller, R; Minashvili, I A; Miralles, L; Mnatzakanian, E A; Monnier, E; Montarou, G; Mornacchi, Giuseppe; Moynot, M; Muanza, G S; Nayman, P; Némécek, S; Nessi, Marzio; Nicoleau, S; Niculescu, M; Noppe, J M; Onofre, A; Pallin, D; Pantea, D; Paoletti, R; Park, I C; Parrour, G; Parsons, J; Pereira, A; Perini, L; Perlas, J A; Perrodo, P; Pilcher, J E; Pinhão, J; Plothow-Besch, Hartmute; Poggioli, Luc; Poirot, S; Price, L; Protopopov, Yu; Proudfoot, J; Puzo, P; Radeka, V; Rahm, David Charles; Reinmuth, G; Renzoni, G; Rescia, S; Resconi, S; Richards, R; Richer, J P; Roda, C; Rodier, S; Roldán, J; Romance, J B; Romanov, V; Romero, P; Rossel, F; Rusakovitch, N A; Sala, P; Sanchis, E; Sanders, H; Santoni, C; Santos, J; Sauvage, D; Sauvage, G; Sawyer, L; Says, L P; Schaffer, A C; Schwemling, P; Schwindling, J; Seguin-Moreau, N; Seidl, W; Seixas, J M; Selldén, B; Seman, M; Semenov, A; Serin, L; Shaldaev, E; Shochet, M J; Sidorov, V; Silva, J; Simaitis, V J; Simion, S; Sissakian, A N; Snopkov, R; Söderqvist, J; Solodkov, A A; Soloviev, A; Soloviev, I V; Sonderegger, P; Soustruznik, K; Spanó, F; Spiwoks, R; Stanek, R; Starchenko, E A; Stavina, P; Stephens, R; Suk, M; Surkov, A; Sykora, I; Takai, H; Tang, F; Tardell, S; Tartarelli, F; Tas, P; Teiger, J; Thaler, J; Thion, J; Tikhonov, Yu A; Tisserant, S; Tokar, S; Topilin, N D; Trka, Z; Turcotte, M; Valkár, S; Varanda, M J; Vartapetian, A H; Vazeille, F; Vichou, I; Vinogradov, V; Vorozhtsov, S B; Vuillemin, V; White, A; Wielers, M; Wingerter-Seez, I; Wolters, H; Yamdagni, N; Yosef, C; Zaitsev, A; Zitoun, R; Zolnierowski, Y

    2002-01-01

    This paper discusses hadron energy reconstruction for the ATLAS barrel prototype combined calorimeter (consisting of a lead-liquid argon electromagnetic part and an iron-scintillator hadronic part) in the framework of the nonparametrical method. The nonparametrical method utilizes only the known e/h ratios and the electron calibration constants and does not require the determination of any parameters by a minimization technique. Thus, this technique lends itself to an easy use in a first level trigger. The reconstructed mean values of the hadron energies are within ±1% of the true values and the fractional energy resolution is [(58 ± 3)%/√E + (2.5 ± 0.3)%] ⊕ (1.7 ± 0.2)/E. The value of the e/h ratio obtained for the electromagnetic compartment of the combined calorimeter is 1.74 ± 0.04 and agrees with the prediction that e/h > 1.66 for this electromagnetic calorimeter. Results of a study of the longitudinal hadronic shower development are also presented. The data have been taken in the H8 beam...

  3. Nonparametric Inference of Doubly Stochastic Poisson Process Data via the Kernel Method.

    Science.gov (United States)

    Zhang, Tingting; Kou, S C

    2010-01-01

    Doubly stochastic Poisson processes, also known as the Cox processes, frequently occur in various scientific fields. In this article, motivated primarily by analyzing Cox process data in biophysics, we propose a nonparametric kernel-based inference method. We conduct a detailed study, including an asymptotic analysis, of the proposed method, and provide guidelines for its practical use, introducing a fast and stable regression method for bandwidth selection. We apply our method to real photon arrival data from recent single-molecule biophysical experiments, investigating proteins' conformational dynamics. Our result shows that conformational fluctuation is widely present in protein systems, and that the fluctuation covers a broad range of time scales, highlighting the dynamic and complex nature of proteins' structure.

  4. On Cooper's Nonparametric Test.

    Science.gov (United States)

    Schmeidler, James

    1978-01-01

    The basic assumption of Cooper's nonparametric test for trend (EJ 125 069) is questioned. It is contended that the proper assumption alters the distribution of the statistic and reduces its usefulness. (JKS)

  5. Understanding advanced statistical methods

    CERN Document Server

    Westfall, Peter

    2013-01-01

    Introduction: Probability, Statistics, and Science; Reality, Nature, Science, and Models; Statistical Processes: Nature, Design and Measurement, and Data; Models; Deterministic Models; Variability; Parameters; Purely Probabilistic Statistical Models; Statistical Models with Both Deterministic and Probabilistic Components; Statistical Inference; Good and Bad Models; Uses of Probability Models; Random Variables and Their Probability Distributions; Introduction; Types of Random Variables: Nominal, Ordinal, and Continuous; Discrete Probability Distribution Functions; Continuous Probability Distribution Functions; Some Calculus-Derivatives and Least Squares; More Calculus-Integrals and Cumulative Distribution Functions; Probability Calculation and Simulation; Introduction; Analytic Calculations, Discrete and Continuous Cases; Simulation-Based Approximation; Generating Random Numbers; Identifying Distributions; Introduction; Identifying Distributions from Theory Alone; Using Data: Estimating Distributions via the Histogram; Quantiles: Theoretical and Data-Based Estimate...

  6. Efficient nonparametric and asymptotic Bayesian model selection methods for attributed graph clustering

    KAUST Repository

    Xu, Zhiqiang

    2017-02-16

    Attributed graph clustering, also known as community detection on attributed graphs, has attracted much interest recently due to the ubiquity of attributed graphs in real life. Many algorithms have been proposed for this problem, which are either distance based or model based. However, model selection in attributed graph clustering has not been well addressed, that is, most existing algorithms assume the cluster number to be known a priori. In this paper, we propose two efficient approaches for attributed graph clustering with automatic model selection. The first approach is a popular Bayesian nonparametric method, while the second approach is an asymptotic method based on a recently proposed model selection criterion, factorized information criterion. Experimental results on both synthetic and real datasets demonstrate that our approaches for attributed graph clustering with automatic model selection significantly outperform the state-of-the-art algorithm.

  7. Efficient nonparametric and asymptotic Bayesian model selection methods for attributed graph clustering

    KAUST Repository

    Xu, Zhiqiang; Cheng, James; Xiao, Xiaokui; Fujimaki, Ryohei; Muraoka, Yusuke

    2017-01-01

    Attributed graph clustering, also known as community detection on attributed graphs, has attracted much interest recently due to the ubiquity of attributed graphs in real life. Many existing algorithms have been proposed for this problem, which are either distance based or model based. However, model selection in attributed graph clustering has not been well addressed; that is, most existing algorithms assume the cluster number to be known a priori. In this paper, we propose two efficient approaches for attributed graph clustering with automatic model selection. The first approach is a popular Bayesian nonparametric method, while the second approach is an asymptotic method based on a recently proposed model selection criterion, the factorized information criterion. Experimental results on both synthetic and real datasets demonstrate that our approaches for attributed graph clustering with automatic model selection significantly outperform the state-of-the-art algorithm.

  8. Nonparametric statistical techniques used in dose estimation for beagles exposed to inhaled plutonium nitrate

    International Nuclear Information System (INIS)

    Stevens, D.L.; Dagle, G.E.

    1986-01-01

    Retention and translocation of inhaled radionuclides are often estimated from the sacrifice of multiple animals at different time points. The data for each time point can be averaged and a smooth curve fitted to the mean values, or a smooth curve may be fitted to the entire data set. However, an analysis based on means may not be the most appropriate if there is substantial variation in the initial amount of the radionuclide inhaled or if the data are subject to outliers. A method has been developed that takes account of these problems. The body burden is viewed as a compartmental system, with the compartments identified with body organs. A median polish is applied to the multiple logistic transform of the compartmental fractions (compartment burden/total burden) at each time point. A smooth function is fitted to the results of the median polish. This technique was applied to data from beagles exposed to an aerosol of 239Pu(NO3)4. Models of retention and translocation for lungs, skeleton, liver, kidneys, and tracheobronchial lymph nodes were developed and used to estimate dose. 4 refs., 3 figs., 4 tabs.
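
    The median polish mentioned in this abstract can be illustrated with a minimal sketch of Tukey's general procedure. It is not the authors' implementation: the function name `median_polish`, the iteration limits, and the small example table are assumptions made for illustration, and the logistic transform of the compartmental fractions is not reproduced here.

```python
from statistics import median

def median_polish(table, max_iter=10, tol=1e-9):
    """Decompose table[i][j] into overall + row[i] + col[j] + resid[i][j]."""
    resid = [list(r) for r in table]
    n_rows, n_cols = len(resid), len(resid[0])
    overall = 0.0
    row = [0.0] * n_rows
    col = [0.0] * n_cols
    for _ in range(max_iter):
        # Sweep the row medians out of the residuals into the row effects.
        row_med = [median(r) for r in resid]
        for i in range(n_rows):
            row[i] += row_med[i]
            for j in range(n_cols):
                resid[i][j] -= row_med[i]
        # Move the median of the column effects into the overall term.
        shift = median(col)
        overall += shift
        col = [c - shift for c in col]
        # Sweep the column medians out of the residuals into the column effects.
        col_med = [median(resid[i][j] for i in range(n_rows)) for j in range(n_cols)]
        for j in range(n_cols):
            col[j] += col_med[j]
            for i in range(n_rows):
                resid[i][j] -= col_med[j]
        # Move the median of the row effects into the overall term.
        shift = median(row)
        overall += shift
        row = [r - shift for r in row]
        # Stop once a full sweep no longer changes anything.
        if max(abs(m) for m in row_med + col_med) < tol:
            break
    return overall, row, col, resid

# Hypothetical table: rows and columns stand for any two-way layout.
example = [[7, 9, 11], [8, 10, 12], [9, 11, 13]]
overall, row, col, resid = median_polish(example)
```

    Because medians rather than means are swept out, a single outlying cell ends up in the residuals instead of distorting the row and column effects, which is the robustness property the abstract relies on.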

  9. A Nonparametric, Multiple Imputation-Based Method for the Retrospective Integration of Data Sets

    Science.gov (United States)

    Carrig, Madeline M.; Manrique-Vallier, Daniel; Ranby, Krista W.; Reiter, Jerome P.; Hoyle, Rick H.

    2015-01-01

    Complex research questions often cannot be addressed adequately with a single data set. One sensible alternative to the high cost and effort associated with the creation of large new data sets is to combine existing data sets containing variables related to the constructs of interest. The goal of the present research was to develop a flexible, broadly applicable approach to the integration of disparate data sets that is based on nonparametric multiple imputation and the collection of data from a convenient, de novo calibration sample. We demonstrate proof of concept for the approach by integrating three existing data sets containing items related to the extent of problematic alcohol use and associations with deviant peers. We discuss both necessary conditions for the approach to work well and potential strengths and weaknesses of the method compared to other data set integration approaches. PMID:26257437

  10. Two non-parametric methods for derivation of constraints from radiotherapy dose–histogram data

    International Nuclear Information System (INIS)

    Ebert, M A; Kennedy, A; Joseph, D J; Gulliford, S L; Buettner, F; Foo, K; Haworth, A; Denham, J W

    2014-01-01

    Dose constraints based on histograms provide a convenient and widely-used method for informing and guiding radiotherapy treatment planning. Methods of derivation of such constraints are often poorly described. Two non-parametric methods for derivation of constraints are described and investigated in the context of determination of dose-specific cut-points—values of the free parameter (e.g., percentage volume of the irradiated organ) which best reflect resulting changes in complication incidence. A method based on receiver operating characteristic (ROC) analysis and one based on a maximally-selected standardized rank sum are described and compared using rectal toxicity data from a prostate radiotherapy trial. Multiple test corrections are applied using a free step-down resampling algorithm, which accounts for the large number of tests undertaken to search for optimal cut-points and the inherent correlation between dose–histogram points. Both methods provide consistent significant cut-point values, with the rank sum method displaying some sensitivity to the underlying data. The ROC method is simple to implement and can utilize a complication atlas, though an advantage of the rank sum method is the ability to incorporate all complication grades without the need for grade dichotomization. (note)

  11. [Nonparametric method of estimating survival functions containing right-censored and interval-censored data].

    Science.gov (United States)

    Xu, Yonghong; Gao, Xiaohuan; Wang, Zhengxi

    2014-04-01

    Missing data are a general problem in many scientific fields, especially in medical survival analysis. Interpolation is one of the important methods for dealing with censored data. However, most interpolation methods replace the censored data with exact values, which distorts the real distribution of the censored data and reduces the probability that the real data fall within the interpolated data. To solve this problem, we propose a nonparametric method for estimating the survival function from right-censored and interval-censored data and compare its performance with the SC (self-consistent) algorithm. In contrast to average interpolation and nearest-neighbor interpolation, the proposed method replaces right-censored data with interval-censored data, which greatly improves the probability that the real data fall within the imputation interval. It then estimates the survival function of the right-censored and interval-censored data on the basis of empirical distribution theory. Numerical examples and a real breast cancer data set demonstrated that the proposed method had higher accuracy and better robustness across different proportions of censored data. The method offers a good way to compare the performance of clinical treatments by estimating patients' survival data, and should be of some help in medical survival data analysis.

  12. A comparison of parametric and nonparametric methods for normalising cDNA microarray data.

    Science.gov (United States)

    Khondoker, Mizanur R; Glasbey, Chris A; Worton, Bruce J

    2007-12-01

    Normalisation is an essential first step in the analysis of most cDNA microarray data, to correct for effects arising from imperfections in the technology. Loess smoothing is commonly used to correct for trends in log-ratio data. However, parametric models, such as the additive plus multiplicative variance model, have been preferred for scale normalisation, though the variance structure of microarray data may be of a more complex nature than can be accommodated by a parametric model. We propose a new nonparametric approach that incorporates location and scale normalisation simultaneously using a Generalised Additive Model for Location, Scale and Shape (GAMLSS, Rigby and Stasinopoulos, 2005, Applied Statistics, 54, 507-554). We compare its performance in inferring differential expression with Huber et al.'s (2002, Bioinformatics, 18, 96-104) arsinh variance stabilising transformation (AVST) using real and simulated data. We show GAMLSS to be as powerful as AVST when the parametric model is correct, and more powerful when the model is wrong. (c) 2007 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim

  13. Statistical methods and materials characterisation

    International Nuclear Information System (INIS)

    Wallin, K.R.W.

    2010-01-01

    Statistics is a wide mathematical area, which covers a myriad of analysis and estimation options, some of which suit special cases better than others. A comprehensive coverage of the whole area of statistics would be an enormous effort and would also be outside the capabilities of this author. Therefore, this does not intend to be a textbook on statistical methods available for general data analysis and decision making. Instead it will highlight a certain special statistical case applicable to mechanical materials characterization. The methods presented here do not in any way rule out other statistical methods by which to analyze mechanical property material data. (orig.)

  14. Statistical methods in quality assurance

    International Nuclear Information System (INIS)

    Eckhard, W.

    1980-01-01

    During the different phases of a production process (planning, development and design, manufacturing, assembling, etc.), most decisions rest on statistics: the collection, analysis and interpretation of data. Statistical methods can be thought of as a kit of tools that helps to solve problems in the quality functions of the quality loop, with the aim of producing quality products and reducing quality costs. Various statistical methods are presented, and typical examples of their practical application are demonstrated. (RW)

  15. Statistical methods for quality improvement

    National Research Council Canada - National Science Library

    Ryan, Thomas P

    2011-01-01

    ...."-Technometrics This new edition continues to provide the most current, proven statistical methods for quality control and quality improvement. The use of quantitative methods offers numerous benefits...

  16. Nonparametric Inference for Periodic Sequences

    KAUST Repository

    Sun, Ying

    2012-02-01

    This article proposes a nonparametric method for estimating the period and values of a periodic sequence when the data are evenly spaced in time. The period is estimated by a "leave-out-one-cycle" version of cross-validation (CV) and complements the periodogram, a widely used tool for period estimation. The CV method is computationally simple and implicitly penalizes multiples of the smallest period, leading to a "virtually" consistent estimator of integer periods. This estimator is investigated both theoretically and by simulation. We also propose a nonparametric test of the null hypothesis that the data have constant mean against the alternative that the sequence of means is periodic. Finally, our methodology is demonstrated on three well-known time series: the sunspots and lynx trapping data, and the El Niño series of sea surface temperatures. © 2012 American Statistical Association and the American Society for Quality.
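
    A simplified sketch can make the "leave-out-one-cycle" idea concrete. It is an illustration under stated assumptions, not the article's estimator: the names `cv_score` and `estimate_period`, the squared-error criterion, and the synthetic series are all hypothetical. For a candidate period p, each observation is predicted by the mean of the observations at the same phase in all other cycles.

```python
def cv_score(x, p):
    """Mean squared error when each cycle of length p is predicted from the others."""
    n = len(x)
    total = 0.0
    for i, xi in enumerate(x):
        # Observations at the same phase (i mod p) but in a different cycle.
        others = [x[k] for k in range(i % p, n, p) if k != i]
        pred = sum(others) / len(others)
        total += (xi - pred) ** 2
    return total / n

def estimate_period(x, candidates):
    """Return the candidate period with the smallest cross-validation score.

    Candidates should not exceed len(x) // 2, so every phase has another cycle.
    """
    return min(candidates, key=lambda p: cv_score(x, p))

# Synthetic series: a period-3 pattern plus a small deterministic disturbance.
pattern = [1.0, 5.0, 2.0]
disturbance = [-0.2, 0.0, 0.2, -0.1, 0.1]
x = [pattern[i % 3] + disturbance[i % 5] for i in range(30)]
period = estimate_period(x, range(2, 7))
```

    In this example the multiple 6 also matches the pattern, but each of its phase means is estimated from fewer cycles, so its prediction error is larger. This is one way to see the implicit penalty on multiples of the smallest period that the abstract describes.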

  17. Statistical Methods in Integrative Genomics

    Science.gov (United States)

    Richardson, Sylvia; Tseng, George C.; Sun, Wei

    2016-01-01

    Statistical methods in integrative genomics aim to answer important biology questions by jointly analyzing multiple types of genomic data (vertical integration) or aggregating the same type of data across multiple studies (horizontal integration). In this article, we introduce different types of genomic data and data resources, and then review statistical methods of integrative genomics, with emphasis on the motivation and rationale of these methods. We conclude with some summary points and future research directions. PMID:27482531

  18. Statistical methods in nonlinear dynamics

    Indian Academy of Sciences (India)

    Sensitivity to initial conditions in nonlinear dynamical systems leads to exponential divergence of trajectories that are initially arbitrarily close, and hence to unpredictability. Statistical methods have been found to be helpful in extracting useful information about such systems. In this paper, we review briefly some statistical ...

  19. Statistical Methods in Psychology Journals.

    Science.gov (United States)

    Wilkinson, Leland

    1999-01-01

    Proposes guidelines for revising the American Psychological Association (APA) publication manual or other APA materials to clarify the application of statistics in research reports. The guidelines are intended to induce authors and editors to recognize the thoughtless application of statistical methods. Contains 54 references. (SLD)

  20. Non-parametric Estimation of Diffusion-Paths Using Wavelet Scaling Methods

    DEFF Research Database (Denmark)

    Høg, Esben

    In continuous time, diffusion processes have been used for modelling financial dynamics for a long time. For example the Ornstein-Uhlenbeck process (the simplest mean-reverting process) has been used to model non-speculative price processes. We discuss non-parametric estimation of these processes...

  1. Non-Parametric Estimation of Diffusion-Paths Using Wavelet Scaling Methods

    DEFF Research Database (Denmark)

    Høg, Esben

    2003-01-01

    In continuous time, diffusion processes have been used for modelling financial dynamics for a long time. For example the Ornstein-Uhlenbeck process (the simplest mean-reverting process) has been used to model non-speculative price processes. We discuss non-parametric estimation of these processes...
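
    For reference, the Ornstein-Uhlenbeck process named in this abstract is commonly written as the stochastic differential equation below. The symbols are assumptions chosen for illustration and are not taken from the record: \theta > 0 is the speed of mean reversion, \mu the long-run level, \sigma the volatility, and W_t a standard Brownian motion.

```latex
dX_t = \theta\,(\mu - X_t)\,dt + \sigma\,dW_t , \qquad \theta > 0
```

    The drift term pulls X_t back towards \mu whenever it moves away, which is what "mean-reverting" refers to.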

  2. Trends in three decades of HIV/AIDS epidemic in Thailand by nonparametric backcalculation method.

    Science.gov (United States)

    Punyacharoensin, Narat; Viwatwongkasem, Chukiat

    2009-06-01

    To reconstruct the past HIV incidence and prevalence in Thailand from 1980 to 2008 and predict the country's AIDS incidence from 2009 to 2011. Nonparametric backcalculation was adopted utilizing 100 quarterly observed new AIDS counts excluding pediatric cases. The accuracy of data was enhanced through a series of data adjustments using the weight method to account for several surveillance reporting issues. The mixture of time-dependent distributions allowed the effects of age at seroconversion and antiretroviral therapy to be incorporated simultaneously. Sensitivity analyses were conducted to assess model variations that were subject to major uncertainties. Future AIDS incidence was projected for various predetermined HIV incidence patterns. HIV incidence in Thailand reached its peak in 1992 with approximately 115,000 cases. A steep decline thereafter discontinued in 1997 and was followed by another strike of 42,000 cases in 1999. The second surge, which happened concurrently with the major economic crisis, brought on 60,000 new infections. As of December 2008, more than 1 million individuals had been infected and around 430,000 adults were living with HIV corresponding to a prevalence rate of 1.2%. The incidence rate had become less than 0.1% since 2002. The backcalculated estimates were dominated by postulated median AIDS progression time and adjustments to surveillance data. Our analysis indicated that, thus far, the 1990s was the most severe era of HIV/AIDS epidemic in Thailand with two HIV incidence peaks. A drop in new infections led to a decrease in recent AIDS incidence, and this tendency is likely to remain unchanged until 2011, if not further.

  3. Nonparametric tests for censored data

    CERN Document Server

    Bagdonavicus, Vilijandas; Nikulin, Mikhail

    2013-01-01

    This book concerns testing hypotheses in non-parametric models. Generalizations of many non-parametric tests to the case of censored and truncated data are considered. Most of the test results are proved and real applications are illustrated using examples. Theories and exercises are provided. The incorrect use of many tests applying most statistical software is highlighted and discussed.

  4. Statistical methods for physical science

    CERN Document Server

    Stanford, John L

    1994-01-01

    This volume of Methods of Experimental Physics provides an extensive introduction to probability and statistics in many areas of the physical sciences, with an emphasis on the emerging area of spatial statistics. The scope of topics covered is wide-ranging: the text discusses a variety of the most commonly used classical methods and addresses newer methods that are applicable or potentially important. The chapter authors motivate readers with their insightful discussions, augmenting their material with Key Features * Examines basic probability, including coverage of standard distributions, time s

  5. Statistical methods in nuclear theory

    International Nuclear Information System (INIS)

    Shubin, Yu.N.

    1974-01-01

    The paper outlines statistical methods which are widely used for describing properties of excited states of nuclei and nuclear reactions. It discusses the physical assumptions underlying the known distributions of level spacings (Wigner, Poisson distributions) and of widths of highly excited states (Porter-Thomas distribution), as well as assumptions used in the statistical theory of nuclear reactions and in the fluctuation analysis. The author considers the random matrix method, which consists in replacing the matrix elements of a residual interaction by random variables with a simple statistical distribution. Experimental data are compared with results of calculations using the statistical model. The superfluid nucleus model is considered with regard to superconducting-type pair correlations.

  6. Robust statistical methods with R

    CERN Document Server

    Jureckova, Jana

    2005-01-01

    Robust statistical methods were developed to supplement the classical procedures when the data violate classical assumptions. They are ideally suited to applied research across a broad spectrum of study, yet most books on the subject are narrowly focused, overly theoretical, or simply outdated. Robust Statistical Methods with R provides a systematic treatment of robust procedures with an emphasis on practical application. The authors work from underlying mathematical tools to implementation, paying special attention to the computational aspects. They cover the whole range of robust methods, including differentiable statistical functions, distance of measures, influence functions, and asymptotic distributions, in a rigorous yet approachable manner. Highlighting hands-on problem solving, many examples and computational algorithms using the R software supplement the discussion. The book examines the characteristics of robustness, estimators of real parameter, large sample properties, and goodness-of-fit tests. It...

  7. Statistical Methods for Fuzzy Data

    CERN Document Server

    Viertl, Reinhard

    2011-01-01

    Statistical data are not always precise numbers, or vectors, or categories. Real data are frequently what is called fuzzy. Examples where this fuzziness is obvious are quality of life data, environmental, biological, medical, sociological and economics data. Also the results of measurements can be best described by using fuzzy numbers and fuzzy vectors respectively. Statistical analysis methods have to be adapted for the analysis of fuzzy data. In this book, the foundations of the description of fuzzy data are explained, including methods on how to obtain the characterizing function of fuzzy m

  8. Bayesian nonparametric data analysis

    CERN Document Server

    Müller, Peter; Jara, Alejandro; Hanson, Tim

    2015-01-01

    This book reviews nonparametric Bayesian methods and models that have proven useful in the context of data analysis. Rather than providing an encyclopedic review of probability models, the book’s structure follows a data analysis perspective. As such, the chapters are organized by traditional data analysis problems. In selecting specific nonparametric models, simpler and more traditional models are favored over specialized ones. The discussed methods are illustrated with a wealth of examples, including applications ranging from stylized examples to case studies from recent literature. The book also includes an extensive discussion of computational methods and details on their implementation. R code for many examples is included in on-line software pages.

  9. Application of nonparametric regression methods to study the relationship between NO2 concentrations and local wind direction and speed at background sites.

    Science.gov (United States)

    Donnelly, Aoife; Misstear, Bruce; Broderick, Brian

    2011-02-15

    Background concentrations of nitrogen dioxide (NO2) are not constant but vary temporally and spatially. The current paper presents a powerful tool for the quantification of the effects of wind direction and wind speed on background NO2 concentrations, particularly in cases where monitoring data are limited. In contrast to previous studies which applied similar methods to sites directly affected by local pollution sources, the current study focuses on background sites with the aim of improving methods for predicting background concentrations adopted in air quality modelling studies. The relationship between measured NO2 concentration in air at three such sites in Ireland and locally measured wind direction has been quantified using nonparametric regression methods. The major aim was to analyse a method for quantifying the effects of local wind direction on background levels of NO2 in Ireland. The method was expanded to include wind speed as an added predictor variable. A Gaussian kernel function is used in the analysis and circular statistics employed for the wind direction variable. Wind direction and wind speed were both found to have a statistically significant effect on background levels of NO2 at all three sites. Frequently, environmental impact assessments are based on short-term baseline monitoring, producing a limited dataset. The presented non-parametric regression methods, in contrast to frequently used methods such as binning of the data, allow concentrations for missing data pairs to be estimated and distinction between spurious and true peaks in concentrations to be made. The methods were found to provide a realistic estimation of long-term concentration variation with wind direction and speed, even for cases where the data set is limited. Accurate identification of the actual variation at each location and causative factors could be made, thus supporting the improved definition of background concentrations for use in air quality modelling.
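
    The combination of a Gaussian kernel with a circular predictor can be sketched as a Nadaraya-Watson style weighted average. This is an illustration only: the abstract does not give the estimator's exact form, so the wrapped angular difference, the function names `angular_diff` and `kernel_regress`, the 20-degree bandwidth, and the sample values are all assumptions.

```python
import math

def angular_diff(a, b):
    """Smallest signed difference a - b in degrees, wrapped to [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0

def kernel_regress(theta0, directions, concentrations, bandwidth=20.0):
    """Gaussian-kernel weighted mean concentration at wind direction theta0."""
    weights = [
        math.exp(-0.5 * (angular_diff(d, theta0) / bandwidth) ** 2)
        for d in directions
    ]
    return sum(w * c for w, c in zip(weights, concentrations)) / sum(weights)

# Hypothetical observations on either side of north (0 degrees).
directions = [350.0, 10.0]
concentrations = [10.0, 20.0]
estimate_at_north = kernel_regress(0.0, directions, concentrations)
```

    Wrapping the difference means that 350 degrees and 10 degrees are both treated as 10 degrees from north, so the estimate varies smoothly across the 0/360 boundary. Binning by raw direction value would not guarantee this.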

  10. Parametric methods outperformed non-parametric methods in comparisons of discrete numerical variables

    Directory of Open Access Journals (Sweden)

    Sandvik Leiv

    2011-04-01

    Background: The number of events per individual is a widely reported variable in medical research papers. Such variables are the most common representation of the general variable type called discrete numerical. There is currently no consensus on how to compare and present such variables, and recommendations are lacking. The objective of this paper is to present recommendations for analysis and presentation of results for discrete numerical variables. Methods: Two simulation studies were used to investigate the performance of hypothesis tests and confidence interval methods for variables with outcomes {0, 1, 2}, {0, 1, 2, 3}, {0, 1, 2, 3, 4}, and {0, 1, 2, 3, 4, 5}, using the difference between the means as an effect measure. Results: The Welch U test (the T test with adjustment for unequal variances) and its associated confidence interval performed well for almost all situations considered. The Brunner-Munzel test also performed well, except for small sample sizes (10 in each group). The ordinary T test, the Wilcoxon-Mann-Whitney test, the percentile bootstrap interval, and the bootstrap-t interval did not perform satisfactorily. Conclusions: The difference between the means is an appropriate effect measure for comparing two independent discrete numerical variables that have both lower and upper bounds. To analyze this problem, we encourage more frequent use of parametric hypothesis tests and confidence intervals.
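
    The unequal-variance T test recommended in this abstract can be computed directly from its textbook definition. The sketch below is not the paper's simulation code: the function name `welch_t` and the two small samples of counts are hypothetical. It returns the test statistic and the Welch-Satterthwaite degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    # Squared standard error contributed by each sample.
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Hypothetical event counts per individual in two independent groups.
group_a = [0, 1, 2, 1, 0, 2]
group_b = [1, 2, 3, 2, 3, 1]
t_stat, dof = welch_t(group_a, group_b)
```

    With equal group sizes and equal sample variances, as in this example, the degrees of freedom reduce to those of the ordinary T test (here 10). The two tests differ when the variances or group sizes are unequal.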

  11. Statistical methods in spatial genetics

    DEFF Research Database (Denmark)

    Guillot, Gilles; Leblois, Raphael; Coulon, Aurelie

    2009-01-01

    The joint analysis of spatial and genetic data is rapidly becoming the norm in population genetics. More and more studies explicitly describe and quantify the spatial organization of genetic variation and try to relate it to underlying ecological processes. As it has become increasingly difficult to keep abreast with the latest methodological developments, we review the statistical toolbox available to analyse population genetic data in a spatially explicit framework. We mostly focus on statistical concepts but also discuss practical aspects of the analytical methods, highlighting not only...

  12. A nonparametric mean-variance smoothing method to assess Arabidopsis cold stress transcriptional regulator CBF2 overexpression microarray data.

    Science.gov (United States)

    Hu, Pingsha; Maiti, Tapabrata

    2011-01-01

    Microarray is a powerful tool for genome-wide gene expression analysis. In microarray expression data, often mean and variance have certain relationships. We present a non-parametric mean-variance smoothing method (NPMVS) to analyze differentially expressed genes. In this method, a nonlinear smoothing curve is fitted to estimate the relationship between mean and variance. Inference is then made upon shrinkage estimation of posterior means assuming variances are known. Different methods have been applied to simulated datasets, in which a variety of mean and variance relationships were imposed. The simulation study showed that NPMVS outperformed the other two popular shrinkage estimation methods in some mean-variance relationships; and NPMVS was competitive with the two methods in other relationships. A real biological dataset, in which a cold stress transcription factor gene, CBF2, was overexpressed, has also been analyzed with the three methods. Gene ontology and cis-element analysis showed that NPMVS identified more cold and stress responsive genes than the other two methods did. The good performance of NPMVS is mainly due to its shrinkage estimation for both means and variances. In addition, NPMVS exploits a non-parametric regression between mean and variance, instead of assuming a specific parametric relationship between mean and variance. The source code written in R is available from the authors on request.

  13. Nonparametric identification of copula structures

    KAUST Repository

    Li, Bo

    2013-06-01

    We propose a unified framework for testing a variety of assumptions commonly made about the structure of copulas, including symmetry, radial symmetry, joint symmetry, associativity and Archimedeanity, and max-stability. Our test is nonparametric and based on the asymptotic distribution of the empirical copula process. We perform simulation experiments to evaluate our test and conclude that our method is reliable and powerful for assessing common assumptions on the structure of copulas, particularly when the sample size is moderately large. We illustrate our testing approach on two datasets. © 2013 American Statistical Association.

  14. Nonparametric method for genomics-based prediction of performance of quantitative traits involving epistasis in plant breeding.

    Directory of Open Access Journals (Sweden)

    Xiaochun Sun

    Genomic selection (GS) procedures have proven useful in estimating breeding value and predicting phenotype with genome-wide molecular marker information. However, issues of high dimensionality, multicollinearity, and the inability to deal effectively with epistasis can jeopardize accuracy and predictive ability. We, therefore, propose a new nonparametric method, pRKHS, which combines the features of supervised principal component analysis (SPCA) and reproducing kernel Hilbert spaces (RKHS) regression, with versions for traits with no/low epistasis, pRKHS-NE, to high epistasis, pRKHS-E. Instead of assigning a specific relationship to represent the underlying epistasis, the method maps genotype to phenotype in a nonparametric way, thus requiring fewer genetic assumptions. SPCA decreases the number of markers needed for prediction by filtering out low-signal markers with the optimal marker set determined by cross-validation. Principal components are computed from reduced marker matrix (called supervised principal components, SPC) and included in the smoothing spline ANOVA model as independent variables to fit the data. The new method was evaluated in comparison with current popular methods for practicing GS, specifically RR-BLUP, BayesA, BayesB, as well as a newer method by Crossa et al., RKHS-M, using both simulated and real data. Results demonstrate that pRKHS generally delivers greater predictive ability, particularly when epistasis impacts trait expression. Beyond prediction, the new method also facilitates inferences about the extent to which epistasis influences trait expression.

  15. Nonparametric method for genomics-based prediction of performance of quantitative traits involving epistasis in plant breeding.

    Science.gov (United States)

    Sun, Xiaochun; Ma, Ping; Mumm, Rita H

    2012-01-01

    Genomic selection (GS) procedures have proven useful in estimating breeding value and predicting phenotype with genome-wide molecular marker information. However, issues of high dimensionality, multicollinearity, and the inability to deal effectively with epistasis can jeopardize accuracy and predictive ability. We, therefore, propose a new nonparametric method, pRKHS, which combines the features of supervised principal component analysis (SPCA) and reproducing kernel Hilbert spaces (RKHS) regression, with versions for traits with no/low epistasis, pRKHS-NE, to high epistasis, pRKHS-E. Instead of assigning a specific relationship to represent the underlying epistasis, the method maps genotype to phenotype in a nonparametric way, thus requiring fewer genetic assumptions. SPCA decreases the number of markers needed for prediction by filtering out low-signal markers with the optimal marker set determined by cross-validation. Principal components are computed from reduced marker matrix (called supervised principal components, SPC) and included in the smoothing spline ANOVA model as independent variables to fit the data. The new method was evaluated in comparison with current popular methods for practicing GS, specifically RR-BLUP, BayesA, BayesB, as well as a newer method by Crossa et al., RKHS-M, using both simulated and real data. Results demonstrate that pRKHS generally delivers greater predictive ability, particularly when epistasis impacts trait expression. Beyond prediction, the new method also facilitates inferences about the extent to which epistasis influences trait expression.

  16. Nonequilibrium statistical mechanics ensemble method

    CERN Document Server

    Eu, Byung Chan

    1998-01-01

    In this monograph, nonequilibrium statistical mechanics is developed by means of ensemble methods on the basis of the Boltzmann equation, the generic Boltzmann equations for classical and quantum dilute gases, and a generalised Boltzmann equation for dense simple fluids. The theories are developed in forms parallel with the equilibrium Gibbs ensemble theory in a way fully consistent with the laws of thermodynamics. The generalised hydrodynamics equations are the integral part of the theory and describe the evolution of macroscopic processes in accordance with the laws of thermodynamics of systems far removed from equilibrium. Audience: This book will be of interest to researchers in the fields of statistical mechanics, condensed matter physics, gas dynamics, fluid dynamics, rheology, irreversible thermodynamics and nonequilibrium phenomena.

  17. Statistical methods for quality assurance

    International Nuclear Information System (INIS)

    Rinne, H.; Mittag, H.J.

    1989-01-01

    This is the first German-language textbook on quality assurance and the fundamental statistical methods that is suitable for private study. The material for this book has been developed from a course of Hagen Open University and is characterized by a particularly careful didactical design which is achieved and supported by numerous illustrations and photographs, more than 100 exercises with complete problem solutions, many fully displayed calculation examples, surveys fostering a comprehensive approach, bibliography with comments. The textbook has an eye to practice and applications, and great care has been taken by the authors to avoid abstraction wherever appropriate, to explain the proper conditions of application of the testing methods described, and to give guidance for suitable interpretation of results. The testing methods explained also include latest developments and research results in order to foster their adoption in practice. (orig.) [de

  18. Order statistics & inference estimation methods

    CERN Document Server

    Balakrishnan, N

    1991-01-01

    The literature on order statistics and inference is quite extensive and covers a large number of fields, but most of it is dispersed throughout numerous publications. This volume is the consolidation of the most important results and places an emphasis on estimation. Both theoretical and computational procedures are presented to meet the needs of researchers, professionals, and students. The methods of estimation discussed are well-illustrated with numerous practical examples from both the physical and life sciences, including sociology, psychology, and electrical and chemical engineering. A co

  19. Vortex methods and vortex statistics

    International Nuclear Information System (INIS)

    Chorin, A.J.

    1993-05-01

    Vortex methods originated from the observation that in incompressible, inviscid, isentropic flow vorticity (or, more accurately, circulation) is a conserved quantity, as can be readily deduced from the absence of tangential stresses. Thus if the vorticity is known at time t = 0, one can deduce the flow at a later time by simply following it around. In this narrow context, a vortex method is a numerical method that makes use of this observation. Even more generally, the analysis of vortex methods leads to problems that are closely related to problems in quantum physics and field theory, as well as in harmonic analysis. A broad enough definition of vortex methods ends up by encompassing much of science. Even the purely computational aspects of vortex methods encompass a range of ideas for which vorticity may not be the best unifying theme. The author restricts himself in these lectures to a special class of numerical vortex methods, those that are based on a Lagrangian transport of vorticity in hydrodynamics by smoothed particles ("blobs") and those whose understanding contributes to the understanding of blob methods. Vortex methods for inviscid flow lead to systems of ordinary differential equations that can be readily clothed in Hamiltonian form, both in three and two space dimensions, and they can preserve exactly a number of invariants of the Euler equations, including topological invariants. Their viscous versions resemble Langevin equations. As a result, they provide a very useful cartoon of statistical hydrodynamics, i.e., of turbulence, one that can to some extent be analyzed analytically and, more importantly, explored numerically, with important implications also for superfluids, superconductors, and even polymers. In the author's view, vortex "blob" methods provide the most promising path to the understanding of these phenomena.

  20. Bayes linear statistics, theory & methods

    CERN Document Server

    Goldstein, Michael

    2007-01-01

    Bayesian methods combine information available from data with any prior information available from expert knowledge. The Bayes linear approach follows this path, offering a quantitative structure for expressing beliefs, and systematic methods for adjusting these beliefs, given observational data. The methodology differs from the full Bayesian methodology in that it establishes simpler approaches to belief specification and analysis based around expectation judgements. Bayes Linear Statistics presents an authoritative account of this approach, explaining the foundations, theory, methodology, and practicalities of this important field. The text provides a thorough coverage of Bayes linear analysis, from the development of the basic language to the collection of algebraic results needed for efficient implementation, with detailed practical examples. The book covers: The importance of partial prior specifications for complex problems where it is difficult to supply a meaningful full prior probability specification...

  1. Statistical approach for selection of regression model during validation of bioanalytical method

    Directory of Open Access Journals (Sweden)

    Natalija Nakov

    2014-06-01

    The selection of an adequate regression model is the basis for obtaining accurate and reproducible results during bioanalytical method validation. Given the wide concentration range frequently present in bioanalytical assays, heteroscedasticity of the data may be expected. Several weighted linear and quadratic regression models were evaluated during the selection of the adequate curve fit using nonparametric statistical tests: the one-sample rank test and the Wilcoxon signed rank test for two independent groups of samples. The results obtained with the one-sample rank test could not give statistical justification for the selection of linear vs. quadratic regression models because only slight differences between the errors (presented through the relative residuals, RR) were obtained. Estimation of the significance of the differences in the RR was achieved using the Wilcoxon signed rank test, where linear and quadratic regression models were treated as two independent groups. The application of this simple non-parametric statistical test provides statistical confirmation of the choice of an adequate regression model.

  2. Bayesian nonparametric hierarchical modeling.

    Science.gov (United States)

    Dunson, David B

    2009-04-01

    In biomedical research, hierarchical models are very widely used to accommodate dependence in multivariate and longitudinal data and for borrowing of information across data from different sources. A primary concern in hierarchical modeling is sensitivity to parametric assumptions, such as linearity and normality of the random effects. Parametric assumptions on latent variable distributions can be challenging to check and are typically unwarranted, given available prior knowledge. This article reviews some recent developments in Bayesian nonparametric methods motivated by complex, multivariate and functional data collected in biomedical studies. The author provides a brief review of flexible parametric approaches relying on finite mixtures and latent class modeling. Dirichlet process mixture models are motivated by the need to generalize these approaches to avoid assuming a fixed finite number of classes. Focusing on an epidemiology application, the author illustrates the practical utility and potential of nonparametric Bayes methods.

  3. Uncertainty in decision models analyzing cost-effectiveness : The joint distribution of incremental costs and effectiveness evaluated with a nonparametric bootstrap method

    NARCIS (Netherlands)

    Hunink, Maria; Bult, J.R.; De Vries, J; Weinstein, MC

    1998-01-01

    Purpose. To illustrate the use of a nonparametric bootstrap method in the evaluation of uncertainty in decision models analyzing cost-effectiveness. Methods. The authors reevaluated a previously published cost-effectiveness analysis that used a Markov model comparing initial percutaneous
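
    The bootstrap idea named in this record can be sketched in a few lines. The example below is a minimal illustration, not the authors' procedure: it assumes patient-level cost and effectiveness values for two hypothetical strategies A and B, whereas the record describes a Markov decision model. The function names `bootstrap_incremental` and `percentile_interval` are illustrative only.

```python
import random
from statistics import mean


def bootstrap_incremental(costs_a, effects_a, costs_b, effects_b,
                          n_resamples=2000, seed=0):
    """Nonparametric bootstrap of incremental cost and effect (B minus A).

    Patients are resampled with replacement within each arm. The cost and
    effect of one patient are drawn together, so their correlation is kept.
    Returns a list of (incremental cost, incremental effect) pairs.
    """
    rng = random.Random(seed)
    arm_a = list(zip(costs_a, effects_a))
    arm_b = list(zip(costs_b, effects_b))
    draws = []
    for _ in range(n_resamples):
        sample_a = rng.choices(arm_a, k=len(arm_a))
        sample_b = rng.choices(arm_b, k=len(arm_b))
        d_cost = mean(c for c, _ in sample_b) - mean(c for c, _ in sample_a)
        d_effect = mean(e for _, e in sample_b) - mean(e for _, e in sample_a)
        draws.append((d_cost, d_effect))
    return draws


def percentile_interval(values, level=0.95):
    """Simple percentile interval from a list of bootstrap replicates."""
    ordered = sorted(values)
    lo = int((1 - level) / 2 * (len(ordered) - 1))
    hi = int((1 + level) / 2 * (len(ordered) - 1))
    return ordered[lo], ordered[hi]


if __name__ == "__main__":
    # Hypothetical data, invented purely for illustration.
    costs_a, effects_a = [100, 120, 90, 110], [1.0, 1.2, 0.9, 1.1]
    costs_b, effects_b = [200, 210, 190, 220], [1.4, 1.3, 1.6, 1.5]
    draws = bootstrap_incremental(costs_a, effects_a, costs_b, effects_b)
    print("95% interval, incremental cost:",
          percentile_interval([dc for dc, _ in draws]))
```

    Keeping the (cost, effect) pairs together in each replicate is what yields a joint distribution rather than two separate marginal ones.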

  4. Statistical methods in radiation physics

    CERN Document Server

    Turner, James E; Bogard, James S

    2012-01-01

    This statistics textbook, with particular emphasis on radiation protection and dosimetry, deals with statistical solutions to problems inherent in health physics measurements and decision making. The authors begin with a description of our current understanding of the statistical nature of physical processes at the atomic level, including radioactive decay and interactions of radiation with matter. Examples are taken from problems encountered in health physics, and the material is presented such that health physicists and most other nuclear professionals will more readily understand the application of statistical principles in the familiar context of the examples. Problems are presented at the end of each chapter, with solutions to selected problems provided online. In addition, numerous worked examples are included throughout the text.

  5. Statistical inference via fiducial methods

    OpenAIRE

    Salomé, Diemer

    1998-01-01

    In this thesis the attention is restricted to inductive reasoning using a mathematical probability model. A statistical procedure prescribes, for every theoretically possible set of data, the inference about the unknown of interest. ... See: Summary

  6. Comparing non-parametric methods for ungrouping coarsely aggregated age-specific distributions

    DEFF Research Database (Denmark)

    Rizzi, Silvia; Thinggaard, Mikael; Vaupel, James W.

    2016-01-01

    Demographers often have access to vital statistics that are less than ideal for the purpose of their research. In many instances demographic data are reported in coarse histograms, where the values given are only the summation of true latent values, thereby making detailed analysis troublesome. O...

  7. Register-based statistics statistical methods for administrative data

    CERN Document Server

    Wallgren, Anders

    2014-01-01

    This book provides a comprehensive and up-to-date treatment of theory and practical implementation in register-based statistics. It begins by defining the area, before explaining how to structure such systems, as well as detailing alternative approaches. It explains how to create statistical registers, how to implement quality assurance, and the use of IT systems for register-based statistics. Further to this, clear details are given about the practicalities of implementing such statistical methods, such as protection of privacy and the coordination and coherence of such an undertaking. Thi

  8. Nonparametric Statistics Test Software Package.

    Science.gov (United States)

    1983-09-01

    ...the user's entries. Its purpose is to write two types of files needed by the program Crunch: the data file and the option file. ...data file and communicate the choice of test and test parameters to Crunch. After a data file is written, Lochinvar prompts the writing of the

  9. Permutation statistical methods an integrated approach

    CERN Document Server

    Berry, Kenneth J; Johnston, Janis E

    2016-01-01

    This research monograph provides a synthesis of a number of statistical tests and measures, which, at first consideration, appear disjoint and unrelated. Numerous comparisons of permutation and classical statistical methods are presented, and the two methods are compared via probability values and, where appropriate, measures of effect size. Permutation statistical methods, compared to classical statistical methods, do not rely on theoretical distributions, avoid the usual assumptions of normality and homogeneity of variance, and depend only on the data at hand. This text takes a unique approach to explaining statistics by integrating a large variety of statistical methods, and establishing the rigor of a topic that to many may seem to be a nascent field in statistics. This topic is new in that it took modern computing power to make permutation methods available to people working in the mainstream of research. This research monograph addresses a statistically-informed audience, and can also easily serve as a ...
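
    The claim that permutation methods "depend only on the data at hand" can be made concrete with a short sketch. The example below is a minimal Monte Carlo permutation test for a difference in means between two groups; it is a generic textbook construction, not code from the monograph, and the function name `permutation_test` and the sample values are assumptions made for illustration.

```python
import random
from statistics import mean


def permutation_test(x, y, n_resamples=10_000, seed=0):
    """Two-sided Monte Carlo permutation test for a difference in means.

    Group labels are shuffled repeatedly; the p-value is the share of
    shuffles whose absolute mean difference is at least the observed one.
    Returns (observed difference, estimated p-value).
    """
    rng = random.Random(seed)
    observed = mean(x) - mean(y)
    pooled = list(x) + list(y)
    n_x = len(x)
    extreme = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        diff = mean(pooled[:n_x]) - mean(pooled[n_x:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add one to numerator and denominator so the estimate is never exactly 0.
    return observed, (extreme + 1) / (n_resamples + 1)


if __name__ == "__main__":
    # Hypothetical measurements, invented for illustration.
    treated = [12.1, 14.3, 13.8, 15.2, 14.9, 13.5]
    control = [10.2, 11.1, 9.8, 11.5, 10.9, 10.4]
    diff, p = permutation_test(treated, control)
    print(f"difference in means = {diff:.3f}, p = {p:.4f}")
```

    No normality or equal-variance assumption enters the calculation; the reference distribution is built entirely from relabelings of the observed values.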

  10. Non-parametric method for separating domestic hot water heating spikes and space heating

    DEFF Research Database (Denmark)

    Bacher, Peder; de Saint-Aubain, Philip Anton; Christiansen, Lasse Engbo

    2016-01-01

    In this paper a method for separating spikes from a noisy data series, where the data change and evolve over time, is presented. The method is applied on measurements of the total heat load for a single family house. It relies on the fact that the domestic hot water heating is a process generating...

  11. A powerful nonparametric method for detecting differentially co-expressed genes: distance correlation screening and edge-count test.

    Science.gov (United States)

    Zhang, Qingyang

    2018-05-16

    Differential co-expression analysis, as a complement of differential expression analysis, offers significant insights into the changes in molecular mechanism of different phenotypes. A prevailing approach to detecting differentially co-expressed genes is to compare Pearson's correlation coefficients in two phenotypes. However, due to the limitations of Pearson's correlation measure, this approach lacks the power to detect nonlinear changes in gene co-expression, which are common in gene regulatory networks. In this work, a new nonparametric procedure is proposed to search for differentially co-expressed gene pairs in different phenotypes from large-scale data. Our computational pipeline consists of two main steps, a screening step and a testing step. The screening step reduces the search space by filtering out all the independent gene pairs using the distance correlation measure. In the testing step, we compare the gene co-expression patterns in different phenotypes by a recently developed edge-count test. Both steps are distribution-free and target nonlinear relations. We illustrate the promise of the new approach by analyzing the Cancer Genome Atlas data and the METABRIC data for breast cancer subtypes. Compared with some existing methods, the new method is more powerful in detecting nonlinear types of differential co-expression. The distance correlation screening can greatly improve computational efficiency, facilitating its application to large data sets.
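
    To show why distance correlation can pick up dependence that Pearson's coefficient misses, the sketch below computes the standard sample distance correlation for two one-dimensional samples. It is a plain illustration of the measure itself, assuming univariate inputs; it is not the authors' pipeline, and the function names are illustrative.

```python
import math


def _centered_distances(values):
    """Pairwise |a - b| distance matrix, double-centered."""
    n = len(values)
    d = [[abs(a - b) for b in values] for a in values]
    row_means = [sum(row) / n for row in d]
    grand_mean = sum(row_means) / n
    # The matrix is symmetric, so column means equal row means.
    return [
        [d[i][j] - row_means[i] - row_means[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def distance_correlation(x, y):
    """Sample distance correlation of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    a = _centered_distances(x)
    b = _centered_distances(y)
    dcov2_xy = sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / n**2
    dvar2_x = sum(v * v for row in a for v in row) / n**2
    dvar2_y = sum(v * v for row in b for v in row) / n**2
    denom = math.sqrt(dvar2_x * dvar2_y)
    if denom == 0.0:
        return 0.0  # at least one sample is constant
    return math.sqrt(max(dcov2_xy, 0.0) / denom)


if __name__ == "__main__":
    x = [-3, -2, -1, 0, 1, 2, 3]
    y = [v * v for v in x]  # purely nonlinear relation; Pearson's r is 0 here
    print("distance correlation:", round(distance_correlation(x, y), 3))
```

    A screening step in this spirit would keep only gene pairs whose distance correlation exceeds some threshold; how that threshold is chosen is not stated in the record.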

  12. Comparison of non-parametric methods for ungrouping coarsely aggregated data

    DEFF Research Database (Denmark)

    Rizzi, Silvia; Thinggaard, Mikael; Engholm, Gerda

    2016-01-01

    group at the highest ages. When histogram intervals are too coarse, information is lost and comparison between histograms with different boundaries is arduous. In these cases it is useful to estimate detailed distributions from grouped data. Methods From an extensive literature search we identify five...

  13. A contingency table approach to nonparametric testing

    CERN Document Server

    Rayner, JCW

    2000-01-01

    Most texts on nonparametric techniques concentrate on location and linear-linear (correlation) tests, with less emphasis on dispersion effects and linear-quadratic tests. Tests for higher moment effects are virtually ignored. Using a fresh approach, A Contingency Table Approach to Nonparametric Testing unifies and extends the popular, standard tests by linking them to tests based on models for data that can be presented in contingency tables. This approach unifies popular nonparametric statistical inference and makes the traditional, most commonly performed nonparametric analyses much more comp

  14. A Non-parametric Method for Calculating Conditional Stressed Value at Risk

    Directory of Open Access Journals (Sweden)

    Kohei Marumo

    2017-01-01

    We consider the Value at Risk (VaR) of a portfolio under stressed conditions. In practice, the stressed VaR (sVaR) is commonly calculated using the data set that includes the stressed period. It tells us how much the risk amount increases if we use the stressed data set. In this paper, we consider the VaR under stress scenarios. Technically, this can be done by deriving the distribution of profit or loss conditioned on the value of risk factors. We use two methods: one that uses the linear model and one that uses the Hermite expansion discussed by Marumo and Wolff (2013, 2016). Numerical examples show that the method using the Hermite expansion is capable of capturing non-linear effects such as correlation collapse and volatility clustering, which are often observed in the markets.

  15. Nonparametric regression using the concept of minimum energy

    International Nuclear Information System (INIS)

    Williams, Mike

    2011-01-01

    It has recently been shown that an unbinned distance-based statistic, the energy, can be used to construct an extremely powerful nonparametric multivariate two-sample goodness-of-fit test. An extension to this method that makes it possible to perform nonparametric regression using multiple multivariate data sets is presented in this paper. The technique, which is based on the concept of minimizing the energy of the system, permits determination of parameters of interest without the need for parametric expressions of the parent distributions of the data sets. The application and performance of this new method are discussed in the context of some simple example analyses.
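
    The idea of estimating a parameter by minimizing an energy statistic can be sketched as follows. This is an illustration under stated assumptions, not the paper's method: it uses the plain absolute-distance form of the two-sample energy distance, 2E|X-Y| - E|X-X'| - E|Y-Y'|, which may differ from the distance weighting used in the paper, and it fits a single hypothetical shift parameter by grid search. The names `energy_distance` and `fit_shift` are illustrative.

```python
import random


def energy_distance(xs, ys):
    """V-statistic estimate of 2E|X-Y| - E|X-X'| - E|Y-Y'| for 1-D samples."""
    n, m = len(xs), len(ys)
    cross = sum(abs(x - y) for x in xs for y in ys) / (n * m)
    within_x = sum(abs(a - b) for a in xs for b in xs) / (n * n)
    within_y = sum(abs(a - b) for a in ys for b in ys) / (m * m)
    return 2.0 * cross - within_x - within_y


def fit_shift(data, template, candidates):
    """Return the candidate shift of `template` with the smallest energy distance to `data`."""
    return min(
        candidates,
        key=lambda s: energy_distance(data, [t + s for t in template]),
    )


if __name__ == "__main__":
    rng = random.Random(1)
    # Hypothetical samples: the template is centred at 0, the data at 1.5.
    template = [rng.gauss(0.0, 1.0) for _ in range(200)]
    data = [rng.gauss(1.5, 1.0) for _ in range(200)]
    grid = [i / 10 for i in range(0, 31)]
    print("estimated shift:", fit_shift(data, template, grid))
```

    No functional form for either parent distribution is written down; the template sample stands in for it, which is the sense in which the fit is nonparametric.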

  16. Statistical methods in physical mapping

    International Nuclear Information System (INIS)

    Nelson, D.O.

    1995-05-01

    One of the great success stories of modern molecular genetics has been the ability of biologists to isolate and characterize the genes responsible for serious inherited diseases like fragile X syndrome, cystic fibrosis and myotonic muscular dystrophy. This dissertation concentrates on constructing high-resolution physical maps. It demonstrates how probabilistic modeling and statistical analysis can aid molecular geneticists in the tasks of planning, execution, and evaluation of physical maps of chromosomes and large chromosomal regions. The dissertation is divided into six chapters. Chapter 1 provides an introduction to the field of physical mapping, describing the role of physical mapping in gene isolation and in past efforts at mapping chromosomal regions. The next two chapters review and extend known results on predicting progress in large mapping projects. Such predictions help project planners decide between various approaches and tactics for mapping large regions of the human genome. Chapter 2 shows how probability models have been used in the past to predict progress in mapping projects. Chapter 3 presents new results, based on stationary point process theory, for progress measures for mapping projects based on directed mapping strategies. Chapter 4 describes in detail the construction of an initial high-resolution physical map for human chromosome 19. This chapter introduces the probability and statistical models involved in map construction in the context of a large, ongoing physical mapping project. Chapter 5 concentrates on one such model, the trinomial model. This chapter contains new results on the large-sample behavior of this model, including distributional results, asymptotic moments, and detection error rates. In addition, it contains an optimality result concerning experimental procedures based on the trinomial model. The last chapter explores unsolved problems and describes future work.

  17. Statistical methods in physical mapping

    Energy Technology Data Exchange (ETDEWEB)

    Nelson, David O. [Univ. of California, Berkeley, CA (United States)]

    1995-05-01

    One of the great success stories of modern molecular genetics has been the ability of biologists to isolate and characterize the genes responsible for serious inherited diseases like fragile X syndrome, cystic fibrosis and myotonic muscular dystrophy. This dissertation concentrates on constructing high-resolution physical maps. It demonstrates how probabilistic modeling and statistical analysis can aid molecular geneticists in the tasks of planning, execution, and evaluation of physical maps of chromosomes and large chromosomal regions. The dissertation is divided into six chapters. Chapter 1 provides an introduction to the field of physical mapping, describing the role of physical mapping in gene isolation and in past efforts at mapping chromosomal regions. The next two chapters review and extend known results on predicting progress in large mapping projects. Such predictions help project planners decide between various approaches and tactics for mapping large regions of the human genome. Chapter 2 shows how probability models have been used in the past to predict progress in mapping projects. Chapter 3 presents new results, based on stationary point process theory, for progress measures for mapping projects based on directed mapping strategies. Chapter 4 describes in detail the construction of an initial high-resolution physical map for human chromosome 19. This chapter introduces the probability and statistical models involved in map construction in the context of a large, ongoing physical mapping project. Chapter 5 concentrates on one such model, the trinomial model. This chapter contains new results on the large-sample behavior of this model, including distributional results, asymptotic moments, and detection error rates. In addition, it contains an optimality result concerning experimental procedures based on the trinomial model. The last chapter explores unsolved problems and describes future work.

  18. Statistical learning methods: Basics, control and performance

    Energy Technology Data Exchange (ETDEWEB)

    Zimmermann, J. [Max-Planck-Institut fuer Physik, Foehringer Ring 6, 80805 Munich (Germany)]. E-mail: zimmerm@mppmu.mpg.de

    2006-04-01

    The basics of statistical learning are reviewed with a special emphasis on general principles and problems for all different types of learning methods. Different aspects of controlling these methods in a physically adequate way will be discussed. All principles and guidelines will be exercised on examples for statistical learning methods in high energy and astrophysics. These examples prove in addition that statistical learning methods very often lead to a remarkable performance gain compared to the competing classical algorithms.

  19. Statistical learning methods: Basics, control and performance

    International Nuclear Information System (INIS)

    Zimmermann, J.

    2006-01-01

    The basics of statistical learning are reviewed with a special emphasis on general principles and problems for all different types of learning methods. Different aspects of controlling these methods in a physically adequate way will be discussed. All principles and guidelines will be exercised on examples for statistical learning methods in high energy and astrophysics. These examples prove in addition that statistical learning methods very often lead to a remarkable performance gain compared to the competing classical algorithms.

  20. Multivariate statistical methods a primer

    CERN Document Server

    Manly, Bryan FJ

    2004-01-01

    THE MATERIAL OF MULTIVARIATE ANALYSIS: Examples of Multivariate Data; Preview of Multivariate Methods; The Multivariate Normal Distribution; Computer Programs; Graphical Methods; Chapter Summary; References. MATRIX ALGEBRA: The Need for Matrix Algebra; Matrices and Vectors; Operations on Matrices; Matrix Inversion; Quadratic Forms; Eigenvalues and Eigenvectors; Vectors of Means and Covariance Matrices; Further Reading; Chapter Summary; References. DISPLAYING MULTIVARIATE DATA: The Problem of Displaying Many Variables in Two Dimensions; Plotting Index Variables; The Draftsman's Plot; The Representation of Individual Data Points; Profiles o

  1. Statistical data analysis using SAS intermediate statistical methods

    CERN Document Server

    Marasinghe, Mervyn G

    2018-01-01

    The aim of this textbook (previously titled SAS for Data Analytics) is to teach the use of SAS for statistical analysis of data for advanced undergraduate and graduate students in statistics, data science, and disciplines involving analyzing data. The book begins with an introduction beyond the basics of SAS, illustrated with non-trivial, real-world, worked examples. It proceeds to SAS programming and applications, SAS graphics, statistical analysis of regression models, analysis of variance models, analysis of variance with random and mixed effects models, and then takes the discussion beyond regression and analysis of variance to conclude. Pedagogically, the authors introduce theory and methodological basis topic by topic, present a problem as an application, followed by a SAS analysis of the data provided and a discussion of results. The text focuses on applied statistical problems and methods. Key features include: end of chapter exercises, downloadable SAS code and data sets, and advanced material suitab...

  2. Statistical methods for nuclear material management

    Energy Technology Data Exchange (ETDEWEB)

    Bowen, W.M.; Bennett, C.A. (eds.)

    1988-12-01

    This book is intended as a reference manual of statistical methodology for nuclear material management practitioners. It describes statistical methods currently or potentially important in nuclear material management, explains the choice of methods for specific applications, and provides examples of practical applications to nuclear material management problems. Together with the accompanying training manual, which contains fully worked out problems keyed to each chapter, this book can also be used as a textbook for courses in statistical methods for nuclear material management. It should provide increased understanding and guidance to help improve the application of statistical methods to nuclear material management problems.

  3. Statistical methods for nuclear material management

    International Nuclear Information System (INIS)

    Bowen, W.M.; Bennett, C.A.

    1988-12-01

    This book is intended as a reference manual of statistical methodology for nuclear material management practitioners. It describes statistical methods currently or potentially important in nuclear material management, explains the choice of methods for specific applications, and provides examples of practical applications to nuclear material management problems. Together with the accompanying training manual, which contains fully worked out problems keyed to each chapter, this book can also be used as a textbook for courses in statistical methods for nuclear material management. It should provide increased understanding and guidance to help improve the application of statistical methods to nuclear material management problems.

  4. Statistical Models and Methods for Lifetime Data

    CERN Document Server

    Lawless, Jerald F

    2011-01-01

    Praise for the First Edition: "An indispensable addition to any serious collection on lifetime data analysis and . . . a valuable contribution to the statistical literature. Highly recommended . . ." -Choice. "This is an important book, which will appeal to statisticians working on survival analysis problems." -Biometrics. "A thorough, unified treatment of statistical models and methods used in the analysis of lifetime data . . . this is a highly competent and agreeable statistical textbook." -Statistics in Medicine. The statistical analysis of lifetime or response time data is a key tool in engineering,

  5. Stochastic semi-nonparametric frontier estimation of electricity distribution networks: Application of the StoNED method in the Finnish regulatory model

    International Nuclear Information System (INIS)

    Kuosmanen, Timo

    2012-01-01

    The electricity distribution network is a prime example of a natural local monopoly. In many countries, electricity distribution is regulated by the government. Many regulators apply frontier estimation techniques such as data envelopment analysis (DEA) or stochastic frontier analysis (SFA) as an integral part of their regulatory framework. While more advanced methods that combine a nonparametric frontier with a stochastic error term are known in the literature, in practice, regulators continue to apply simplistic methods. This paper reports the main results of the project commissioned by the Finnish regulator for further development of the cost frontier estimation in their regulatory framework. The key objectives of the project were to integrate a stochastic SFA-style noise term to the nonparametric, axiomatic DEA-style cost frontier, and to take the heterogeneity of firms and their operating environments better into account. To achieve these objectives, a new method called stochastic nonparametric envelopment of data (StoNED) was examined. Based on the insights and experiences gained in the empirical analysis using the real data of the regulated networks, the Finnish regulator adopted the StoNED method for use from 2012 onwards.

  6. Bayesian Nonparametric Longitudinal Data Analysis.

    Science.gov (United States)

    Quintana, Fernando A; Johnson, Wesley O; Waetjen, Elaine; Gold, Ellen

    2016-01-01

    Practical Bayesian nonparametric methods have been developed across a wide variety of contexts. Here, we develop a novel statistical model that generalizes standard mixed models for longitudinal data that include flexible mean functions as well as combined compound symmetry (CS) and autoregressive (AR) covariance structures. AR structure is often specified through the use of a Gaussian process (GP) with covariance functions that allow longitudinal data to be more correlated if they are observed closer in time than if they are observed farther apart. We allow for AR structure by considering a broader class of models that incorporates a Dirichlet Process Mixture (DPM) over the covariance parameters of the GP. We are able to take advantage of modern Bayesian statistical methods in making full predictive inferences and inferences about characteristics of longitudinal profiles and their differences across covariate combinations. We also take advantage of the generality of our model, which provides for estimation of a variety of covariance structures. We observe that models that fail to incorporate CS or AR structure can result in very poor estimation of a covariance or correlation matrix. In our illustration using hormone data observed on women through the menopausal transition, biology dictates the use of a generalized family of sigmoid functions as a model for time trends across subpopulation categories.

  7. Multivariate statistical methods a first course

    CERN Document Server

    Marcoulides, George A

    2014-01-01

    Multivariate statistics refer to an assortment of statistical methods that have been developed to handle situations in which multiple variables or measures are involved. Any analysis of more than two variables or measures can loosely be considered a multivariate statistical analysis. An introductory text for students learning multivariate statistical methods for the first time, this book keeps mathematical details to a minimum while conveying the basic principles. One of the principal strategies used throughout the book--in addition to the presentation of actual data analyses--is poin

  8. Advanced statistical methods in data science

    CERN Document Server

    Chen, Jiahua; Lu, Xuewen; Yi, Grace; Yu, Hao

    2016-01-01

    This book gathers invited presentations from the 2nd Symposium of the ICSA-CANADA Chapter held at the University of Calgary from August 4-6, 2015. The aim of this Symposium was to promote advanced statistical methods in big-data sciences, to allow researchers to exchange ideas on statistics and data science, and to embrace the challenges and opportunities of statistics and data science in the modern world. It addresses diverse themes in advanced statistical analysis in big-data sciences, including methods for administrative data analysis, survival data analysis, missing data analysis, high-dimensional and genetic data analysis, longitudinal and functional data analysis, the design and analysis of studies with response-dependent and multi-phase designs, time series and robust statistics, statistical inference based on likelihood, empirical likelihood and estimating functions. The editorial group selected 14 high-quality presentations from this successful symposium and invited the presenters to prepare a fu...

  9. Statistical methods and challenges in connectome genetics

    KAUST Repository

    Pluta, Dustin; Yu, Zhaoxia; Shen, Tong; Chen, Chuansheng; Xue, Gui; Ombao, Hernando

    2018-01-01

    The study of genetic influences on brain connectivity, known as connectome genetics, is an exciting new direction of research in imaging genetics. We here review recent results and current statistical methods in this area, and discuss some

  10. Statistical methods in personality assessment research.

    Science.gov (United States)

    Schinka, J A; LaLone, L; Broeckel, J A

    1997-06-01

    Emerging models of personality structure and advances in the measurement of personality and psychopathology suggest that research in personality and personality assessment has entered a stage of advanced development. In this article we examine whether researchers in these areas have taken advantage of new and evolving statistical procedures. We conducted a review of articles published in the Journal of Personality Assessment during the past 5 years. Of the 449 articles that included some form of data analysis, 12.7% used only descriptive statistics, most employed only univariate statistics, and fewer than 10% used multivariate methods of data analysis. We discuss the cost of using limited statistical methods, the possible reasons for the apparent reluctance to employ advanced statistical procedures, and potential solutions to this technical shortcoming.

  11. Estimation from PET data of transient changes in dopamine concentration induced by alcohol: support for a non-parametric signal estimation method

    Energy Technology Data Exchange (ETDEWEB)

    Constantinescu, C C; Yoder, K K; Normandin, M D; Morris, E D [Department of Radiology, Indiana University School of Medicine, Indianapolis, IN (United States); Kareken, D A [Department of Neurology, Indiana University School of Medicine, Indianapolis, IN (United States); Bouman, C A [Weldon School of Biomedical Engineering, Purdue University, West Lafayette, IN (United States); O'Connor, S J [Department of Psychiatry, Indiana University School of Medicine, Indianapolis, IN (United States)], E-mail: emorris@iupui.edu

    2008-03-07

    We previously developed a model-independent technique (non-parametric ntPET) for extracting the transient changes in neurotransmitter concentration from paired (rest and activation) PET studies with a receptor ligand. To provide support for our method, we introduced three hypotheses of validation based on work by Endres and Carson (1998 J. Cereb. Blood Flow Metab. 18 1196-210) and Yoder et al (2004 J. Nucl. Med. 45 903-11), and tested them on experimental data. All three hypotheses describe relationships between the estimated free (synaptic) dopamine curves (F^DA(t)) and the change in binding potential (ΔBP). The veracity of the F^DA(t) curves recovered by nonparametric ntPET is supported when the data adhere to the following hypothesized behaviors: (1) ΔBP should decline with increasing DA peak time, (2) ΔBP should increase as the strength of the temporal correlation between F^DA(t) and the free raclopride (F^RAC(t)) curve increases, (3) ΔBP should decline linearly with the effective weighted availability of the receptor sites. We analyzed regional brain data from 8 healthy subjects who received two [¹¹C]raclopride scans: one at rest, and one during which unanticipated IV alcohol was administered to stimulate dopamine release. For several striatal regions, nonparametric ntPET was applied to recover F^DA(t), and binding potential values were determined. Kendall rank-correlation analysis confirmed that the F^DA(t) data followed the expected trends for all three validation hypotheses. Our findings lend credence to our model-independent estimates of F^DA(t). Application of nonparametric ntPET may yield important insights into how alterations in timing of dopaminergic neurotransmission are involved in the pathologies of addiction and other psychiatric disorders.
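
    As an illustration of the kind of Kendall rank-correlation check mentioned in this abstract, the sketch below computes Kendall's tau-a for a monotone trend such as hypothesis (1). The numbers are invented for the example and are not data from the study; the variable names are hypothetical, and the abstract does not say which variant of tau the authors used.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        s = (x1 - x2) * (y1 - y2)
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
        # Tied pairs (s == 0) count toward neither total.
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs

# Invented values: dopamine peak time (min) and change in binding potential.
peak_time = [1, 2, 3, 4, 5]
delta_bp = [0.30, 0.25, 0.27, 0.15, 0.10]

tau = kendall_tau_a(peak_time, delta_bp)
print(tau)  # a negative tau indicates a declining trend
```

    A tau near -1 would be consistent with a change in binding potential that declines as peak time increases; a significance test would still be needed before drawing conclusions.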

  12. Testing discontinuities in nonparametric regression

    KAUST Repository

    Dai, Wenlin

    2017-01-19

    In nonparametric regression, it is often needed to detect whether there are jump discontinuities in the mean function. In this paper, we revisit the difference-based method in [13 H.-G. Müller and U. Stadtmüller, Discontinuous versus smooth regression, Ann. Stat. 27 (1999), pp. 299–337. doi: 10.1214/aos/1018031100

  13. Testing discontinuities in nonparametric regression

    KAUST Repository

    Dai, Wenlin; Zhou, Yuejin; Tong, Tiejun

    2017-01-01

    In nonparametric regression, it is often needed to detect whether there are jump discontinuities in the mean function. In this paper, we revisit the difference-based method in [13 H.-G. Müller and U. Stadtmüller, Discontinuous versus smooth regression, Ann. Stat. 27 (1999), pp. 299–337. doi: 10.1214/aos/1018031100

  14. Nonparametric predictive inference in reliability

    International Nuclear Information System (INIS)

    Coolen, F.P.A.; Coolen-Schrijner, P.; Yan, K.J.

    2002-01-01

    We introduce a recently developed statistical approach, called nonparametric predictive inference (NPI), to reliability. Bounds for the survival function for a future observation are presented. We illustrate how NPI can deal with right-censored data, and discuss aspects of competing risks. We present possible applications of NPI for Bernoulli data, and we briefly outline applications of NPI for replacement decisions. The emphasis is on introduction and illustration of NPI in reliability contexts; detailed mathematical justifications are presented elsewhere.
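
    The bounds on the survival function mentioned above can be sketched for the simplest case. The sketch assumes complete (uncensored) data without ties and relies on Hill's A(n) assumption, under which the next observation falls in each of the n+1 intervals between ordered observations with probability 1/(n+1). The function name is hypothetical, and the right-censored case treated in the paper is not covered.

```python
def npi_survival_bounds(observations, t):
    """Lower and upper probability that the next observation exceeds t.

    Assumes no censoring, no ties, and that t does not equal any observation.
    """
    n = len(observations)
    exceed = sum(1 for x in observations if x > t)
    # Intervals lying entirely above t carry the lower probability; the
    # interval containing t may or may not contribute, giving the upper bound.
    lower = exceed / (n + 1)
    upper = (exceed + 1) / (n + 1)
    return lower, upper

failure_times = [2.0, 5.0, 7.0, 11.0]  # made-up failure times
lower, upper = npi_survival_bounds(failure_times, t=6.0)
print(lower, upper)
```

    The gap between the two bounds is always 1/(n+1) in this setting, so the imprecision shrinks as more data are observed.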

  15. Nonparametric combinatorial sequence models.

    Science.gov (United States)

    Wauthier, Fabian L; Jordan, Michael I; Jojic, Nebojsa

    2011-11-01

    This work considers biological sequences that exhibit combinatorial structures in their composition: groups of positions of the aligned sequences are "linked" and covary as one unit across sequences. If multiple such groups exist, complex interactions can emerge between them. Sequences of this kind arise frequently in biology but methodologies for analyzing them are still being developed. This article presents a nonparametric prior on sequences which allows combinatorial structures to emerge and which induces a posterior distribution over factorized sequence representations. We carry out experiments on three biological sequence families which indicate that combinatorial structures are indeed present and that combinatorial sequence models can more succinctly describe them than simpler mixture models. We conclude with an application to MHC binding prediction which highlights the utility of the posterior distribution over sequence representations induced by the prior. By integrating out the posterior, our method compares favorably to leading binding predictors.

  16. Spatial analysis statistics, visualization, and computational methods

    CERN Document Server

    Oyana, Tonny J

    2015-01-01

    An introductory text for the next generation of geospatial analysts and data scientists, Spatial Analysis: Statistics, Visualization, and Computational Methods focuses on the fundamentals of spatial analysis using traditional, contemporary, and computational methods. Outlining both non-spatial and spatial statistical concepts, the authors present practical applications of geospatial data tools, techniques, and strategies in geographic studies. They offer a problem-based learning (PBL) approach to spatial analysis-containing hands-on problem-sets that can be worked out in MS Excel or ArcGIS-as well as detailed illustrations and numerous case studies. The book enables readers to: Identify types and characterize non-spatial and spatial data Demonstrate their competence to explore, visualize, summarize, analyze, optimize, and clearly present statistical data and results Construct testable hypotheses that require inferential statistical analysis Process spatial data, extract explanatory variables, conduct statisti...

  17. Workshop on Analytical Methods in Statistics

    CERN Document Server

    Jurečková, Jana; Maciak, Matúš; Pešta, Michal

    2017-01-01

    This volume collects authoritative contributions on analytical methods and mathematical statistics. The methods presented include resampling techniques; the minimization of divergence; estimation theory and regression, eventually under shape or other constraints or long memory; and iterative approximations when the optimal solution is difficult to achieve. It also investigates probability distributions with respect to their stability, heavy-tailness, Fisher information and other aspects, both asymptotically and non-asymptotically. The book not only presents the latest mathematical and statistical methods and their extensions, but also offers solutions to real-world problems including option pricing. The selected, peer-reviewed contributions were originally presented at the workshop on Analytical Methods in Statistics, AMISTAT 2015, held in Prague, Czech Republic, November 10-13, 2015.

  18. Nonparametric analysis of blocked ordered categories data: some examples revisited

    Directory of Open Access Journals (Sweden)

    O. Thas

    2006-08-01

    Nonparametric analysis for general block designs can be given by using the Cochran-Mantel-Haenszel (CMH) statistics. We demonstrate this with four examples and note that several well-known nonparametric statistics are special cases of CMH statistics.

  19. Statistical Methods for Stochastic Differential Equations

    CERN Document Server

    Kessler, Mathieu; Sorensen, Michael

    2012-01-01

    The seventh volume in the SemStat series, Statistical Methods for Stochastic Differential Equations presents current research trends and recent developments in statistical methods for stochastic differential equations. Written to be accessible to both new students and seasoned researchers, each self-contained chapter starts with introductions to the topic at hand and builds gradually towards discussing recent research. The book covers Wiener-driven equations as well as stochastic differential equations with jumps, including continuous-time ARMA processes and COGARCH processes. It presents a sp

  20. Statistical methods for spatio-temporal systems

    CERN Document Server

    Finkenstadt, Barbel

    2006-01-01

    Statistical Methods for Spatio-Temporal Systems presents current statistical research issues on spatio-temporal data modeling and will promote advances in research and a greater understanding between the mechanistic and the statistical modeling communities.Contributed by leading researchers in the field, each self-contained chapter starts with an introduction of the topic and progresses to recent research results. Presenting specific examples of epidemic data of bovine tuberculosis, gastroenteric disease, and the U.K. foot-and-mouth outbreak, the first chapter uses stochastic models, such as point process models, to provide the probabilistic backbone that facilitates statistical inference from data. The next chapter discusses the critical issue of modeling random growth objects in diverse biological systems, such as bacteria colonies, tumors, and plant populations. The subsequent chapter examines data transformation tools using examples from ecology and air quality data, followed by a chapter on space-time co...

  1. Statistical methods and challenges in connectome genetics

    KAUST Repository

    Pluta, Dustin

    2018-03-12

    The study of genetic influences on brain connectivity, known as connectome genetics, is an exciting new direction of research in imaging genetics. We here review recent results and current statistical methods in this area, and discuss some of the persistent challenges and possible directions for future work.

  2. Statistic methods for searching inundated radioactive entities

    International Nuclear Information System (INIS)

    Dubasov, Yu.V.; Krivokhatskij, A.S.; Khramov, N.N.

    1993-01-01

    The problem of searching for a flooded radioactive object in a given area is considered. Various models for plotting the search route are discussed. It is shown that a spiral route through random points from the centre of the examined area is the most efficient one. It is concluded that, when searching for flooded radioactive objects, it is advisable to use multidimensional statistical methods of classification.

  3. Application of Turchin's method of statistical regularization

    Science.gov (United States)

    Zelenyi, Mikhail; Poliakova, Mariia; Nozik, Alexander; Khudyakov, Alexey

    2018-04-01

    During analysis of experimental data, one usually needs to restore a signal after it has been convolved with some kind of apparatus function. According to Hadamard's definition this problem is ill-posed and requires regularization to provide sensible results. In this article we describe an implementation of Turchin's method of statistical regularization, based on the Bayesian approach to the regularization strategy.

  4. Statistical Methods for Unusual Count Data

    DEFF Research Database (Denmark)

    Guthrie, Katherine A.; Gammill, Hilary S.; Kamper-Jørgensen, Mads

    2016-01-01

    microchimerism data present challenges for statistical analysis, including a skewed distribution, excess zero values, and occasional large values. Methods for comparing microchimerism levels across groups while controlling for covariates are not well established. We compared statistical models for quantitative...... microchimerism values, applied to simulated data sets and 2 observed data sets, to make recommendations for analytic practice. Modeling the level of quantitative microchimerism as a rate via Poisson or negative binomial model with the rate of detection defined as a count of microchimerism genome equivalents per...

  5. On Rigorous Drought Assessment Using Daily Time Scale: Non-Stationary Frequency Analyses, Revisited Concepts, and a New Method to Yield Non-Parametric Indices

    Directory of Open Access Journals (Sweden)

    Charles Onyutha

    2017-10-01

    Some of the problems in drought assessments are that: analyses tend to focus on coarse temporal scales, many of the methods yield skewed indices, a few terminologies are ambiguously used, and analyses comprise an implicit assumption that the observations come from a stationary process. To solve these problems, this paper introduces non-stationary frequency analyses of quantiles. How to use non-parametric rescaling to obtain robust indices that are not (or minimally) skewed is also introduced. To avoid ambiguity, some concepts on, e.g., incidence, extremity, etc., were revisited through a shift from monthly to daily time scale. Demonstrations of the introduced methods were made using daily flow and precipitation insufficiency (precipitation minus potential evapotranspiration) from the Blue Nile basin in Africa. Results show that, when a significant trend exists in extreme events, stationarity-based quantiles can be far different from those when non-stationarity is considered. The introduced non-parametric indices were found to closely agree with the well-known standardized precipitation evapotranspiration indices in many aspects but skewness. Apart from revisiting some concepts, the advantages of the use of fine instead of coarse time scales in drought assessment were given. The links for obtaining freely downloadable tools on how to implement the introduced methods were provided.

  6. Multivariate nonparametric regression and visualization with R and applications to finance

    CERN Document Server

    Klemelä, Jussi

    2014-01-01

    A modern approach to statistical learning and its applications through visualization methods. With a unique and innovative presentation, Multivariate Nonparametric Regression and Visualization provides readers with the core statistical concepts to obtain complete and accurate predictions when given a set of data. Focusing on nonparametric methods to adapt to the multiple types of data generating mechanisms, the book begins with an overview of classification and regression. The book then introduces and examines various tested and proven visualization techniques for learning samples and functio

  7. Nonparametric functional mapping of quantitative trait loci.

    Science.gov (United States)

    Yang, Jie; Wu, Rongling; Casella, George

    2009-03-01

    Functional mapping is a useful tool for mapping quantitative trait loci (QTL) that control dynamic traits. It incorporates mathematical aspects of biological processes into the mixture model-based likelihood setting for QTL mapping, thus increasing the power of QTL detection and the precision of parameter estimation. However, in many situations there is no obvious functional form and, in such cases, this strategy will not be optimal. Here we propose to use nonparametric function estimation, typically implemented with B-splines, to estimate the underlying functional form of phenotypic trajectories, and then construct a nonparametric test to find evidence of existing QTL. Using the representation of a nonparametric regression as a mixed model, the final test statistic is a likelihood ratio test. We consider two types of genetic maps: dense maps and general maps, and the power of nonparametric functional mapping is investigated through simulation studies and demonstrated by examples.

  8. Analysis of Statistical Methods and Errors in the Articles Published in the Korean Journal of Pain

    Science.gov (United States)

    Yim, Kyoung Hoon; Han, Kyoung Ah; Park, Soo Young

    2010-01-01

    Background: Statistical analysis is essential in regard to obtaining objective reliability for medical research. However, medical researchers do not have enough statistical knowledge to properly analyze their study data. To help understand and potentially alleviate this problem, we have analyzed the statistical methods and errors of articles published in the Korean Journal of Pain (KJP), with the intention to improve the statistical quality of the journal. Methods: All the articles, except case reports and editorials, published from 2004 to 2008 in the KJP were reviewed. The types of applied statistical methods and errors in the articles were evaluated. Results: One hundred and thirty-nine original articles were reviewed. Inferential statistics and descriptive statistics were used in 119 papers and 20 papers, respectively. Only 20.9% of the papers were free from statistical errors. The most commonly adopted statistical method was the t-test (21.0%) followed by the chi-square test (15.9%). Errors of omission were encountered 101 times in 70 papers. Among the errors of omission, "no statistics used even though statistical methods were required" was the most common (40.6%). The errors of commission were encountered 165 times in 86 papers, among which "parametric inference for nonparametric data" was the most common (33.9%). Conclusions: We found various types of statistical errors in the articles published in the KJP. This suggests that meticulous attention should be given not only to applying statistical procedures but also to the review process to improve the value of the article. PMID:20552071

  9. The statistical process control methods - SPC

    Directory of Open Access Journals (Sweden)

    Floreková Ľubica

    1998-03-01

    Methods of statistical evaluation of quality, SPC (item 20 of the documentation system of quality control of the ISO 9000 series of standards), for various processes, products and services are among the basic qualitative methods that enable us to analyse and compare data pertaining to various quantitative parameters. Based on these data, they also enable us to propose suitable interventions aimed at improving these processes, products and services. The theoretical basis and applicability of the following principles are presented in the contribution: diagnostics of cause and effect; Pareto analysis and the Lorenz curve; frequency distributions and frequency curves of random variables; and Shewhart control charts.
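
    As a small illustration of the Shewhart charts listed in this abstract, the sketch below computes 3-sigma limits for an individuals (X) chart, estimating sigma from the average moving range divided by the standard constant d2 = 1.128 for subgroups of size 2. The measurements are invented, the function name is hypothetical, and the article itself may use a different chart type.

```python
def individuals_chart_limits(values):
    """Return (lower limit, center line, upper limit) for an individuals chart."""
    center = sum(values) / len(values)
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mr_bar = sum(moving_ranges) / len(moving_ranges)
    sigma_hat = mr_bar / 1.128  # d2 constant for moving ranges of span 2
    return center - 3 * sigma_hat, center, center + 3 * sigma_hat

baseline = [10, 12, 10, 12, 10, 12]  # made-up in-control measurements
lcl, center, ucl = individuals_chart_limits(baseline)

new_points = [11, 13, 20]
out_of_control = [x for x in new_points if x < lcl or x > ucl]
print(lcl, center, ucl, out_of_control)
```

    Points falling outside the limits are signals worth investigating; they are not proof that the process has changed.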

  10. Statistical methods towards more efficient infiltration measurements.

    Science.gov (United States)

    Franz, T; Krebs, P

    2006-01-01

    Comprehensive knowledge about the infiltration situation in a catchment is required for operation and maintenance. Due to the high expenditures, an optimisation of necessary measurement campaigns is essential. Methods based on multivariate statistics were developed to improve the information yield of measurements by identifying appropriate gauge locations. The methods are highly flexible with respect to data requirements. They were successfully tested on real and artificial data. For suitable catchments, it is estimated that the optimisation potential amounts to up to 30% accuracy improvement compared to nonoptimised gauge distributions. Besides this, a correlation between independent reach parameters and dependent infiltration rates could be identified, which is not dominated by the groundwater head.

  11. Mathematical methods in quantum and statistical mechanics

    International Nuclear Information System (INIS)

    Fishman, L.

    1977-01-01

    The mathematical structure and closed-form solutions pertaining to several physical problems in quantum and statistical mechanics are examined in some detail. The J-matrix method, introduced previously for s-wave scattering and based upon well-established Hilbert Space theory and related generalized integral transformation techniques, is extended to treat the l-th partial wave kinetic energy and Coulomb Hamiltonians within the context of square integrable (L^2), Laguerre (Slater), and oscillator (Gaussian) basis sets. The theory of relaxation in statistical mechanics within the context of the theory of linear integro-differential equations of the Master Equation type and their corresponding Markov processes is examined. Several topics of a mathematical nature concerning various computational aspects of the L^2 approach to quantum scattering theory are discussed.

  12. Nonparametric Transfer Function Models

    Science.gov (United States)

    Liu, Jun M.; Chen, Rong; Yao, Qiwei

    2009-01-01

    In this paper a class of nonparametric transfer function models is proposed to model nonlinear relationships between ‘input’ and ‘output’ time series. The transfer function is smooth with unknown functional forms, and the noise is assumed to be a stationary autoregressive-moving average (ARMA) process. The nonparametric transfer function is estimated jointly with the ARMA parameters. By modeling the correlation in the noise, the transfer function can be estimated more efficiently. The parsimonious ARMA structure improves the estimation efficiency in finite samples. The asymptotic properties of the estimators are investigated. The finite-sample properties are illustrated through simulations and one empirical example. PMID:20628584

  13. Comparing nonparametric Bayesian tree priors for clonal reconstruction of tumors.

    Science.gov (United States)

    Deshwar, Amit G; Vembu, Shankar; Morris, Quaid

    2015-01-01

    Statistical machine learning methods, especially nonparametric Bayesian methods, have become increasingly popular to infer clonal population structure of tumors. Here we describe the treeCRP, an extension of the Chinese restaurant process (CRP), a popular construction used in nonparametric mixture models, to infer the phylogeny and genotype of major subclonal lineages represented in the population of cancer cells. We also propose new split-merge updates tailored to the subclonal reconstruction problem that improve the mixing time of Markov chains. In comparisons with the tree-structured stick breaking prior used in PhyloSub, we demonstrate superior mixing and running time using the treeCRP with our new split-merge procedures. We also show that given the same number of samples, TSSB and treeCRP have similar ability to recover the subclonal structure of a tumor…
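
    For readers unfamiliar with the Chinese restaurant process that the treeCRP extends, here is a minimal sketch of sampling from the standard CRP. It is not the treeCRP itself, and the function name and the concentration parameter alpha are illustrative assumptions.

```python
import random

def sample_crp(n_customers, alpha, seed=0):
    """Draw table assignments from a standard Chinese restaurant process."""
    rng = random.Random(seed)
    table_sizes = []   # number of customers seated at each table
    assignments = []   # table index chosen by each customer
    for _ in range(n_customers):
        # Existing table k is chosen with weight table_sizes[k];
        # a new table is opened with weight alpha.
        weights = table_sizes + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(table_sizes):
            table_sizes.append(1)
        else:
            table_sizes[k] += 1
        assignments.append(k)
    return assignments, table_sizes

assignments, table_sizes = sample_crp(n_customers=20, alpha=1.0)
print(assignments)
print(table_sizes)
```

    In a mixture-model setting, each table would correspond to a cluster (here, loosely, a subclonal population), and larger alpha tends to produce more clusters.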

  14. Genomic similarity and kernel methods I: advancements by building on mathematical and statistical foundations.

    Science.gov (United States)

    Schaid, Daniel J

    2010-01-01

    Measures of genomic similarity are the basis of many statistical analytic methods. We review the mathematical and statistical basis of similarity methods, particularly based on kernel methods. A kernel function converts information for a pair of subjects to a quantitative value representing either similarity (larger values meaning more similar) or distance (smaller values meaning more similar), with the requirement that it must create a positive semidefinite matrix when applied to all pairs of subjects. This review emphasizes the wide range of statistical methods and software that can be used when similarity is based on kernel methods, such as nonparametric regression, linear mixed models and generalized linear mixed models, hierarchical models, score statistics, and support vector machines. The mathematical rigor for these methods is summarized, as is the mathematical framework for making kernels. This review provides a framework to move from intuitive and heuristic approaches to define genomic similarities to more rigorous methods that can take advantage of powerful statistical modeling and existing software. A companion paper reviews novel approaches to creating kernels that might be useful for genomic analyses, providing insights with examples [1]. Copyright © 2010 S. Karger AG, Basel.
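
    The positive semidefinite requirement described in this abstract can be illustrated with a linear kernel on genotype vectors coded 0/1/2. This is a sketch under assumptions: the genotypes are invented, the function names are hypothetical, and the random quadratic-form check is only a necessary spot check, not a proof of positive semidefiniteness.

```python
import random

def linear_kernel(genotypes):
    """Similarity matrix K[i][j] = dot product of genotype vectors i and j."""
    return [[sum(a * b for a, b in zip(gi, gj)) for gj in genotypes]
            for gi in genotypes]

def quadratic_form(K, z):
    """Compute z^T K z, which must be non-negative for a PSD matrix."""
    n = len(z)
    return sum(z[i] * K[i][j] * z[j] for i in range(n) for j in range(n))

# Made-up genotypes: 3 subjects, 4 markers, coded as minor-allele counts.
genotypes = [[0, 1, 2, 0],
             [1, 1, 0, 2],
             [2, 0, 1, 1]]
K = linear_kernel(genotypes)

rng = random.Random(1)
spot_checks = [quadratic_form(K, [rng.uniform(-1, 1) for _ in K])
               for _ in range(100)]
print(K)
print(min(spot_checks) >= -1e-9)
```

    For the linear kernel the check always passes, because z^T K z equals the squared length of the genotype matrix transposed times z; for an arbitrary similarity measure, an eigenvalue computation would be the more reliable test.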

  15. Statistical methods for assessment of blend homogeneity

    DEFF Research Database (Denmark)

    Madsen, Camilla

    2002-01-01

    In this thesis the use of various statistical methods to address some of the problems related to assessment of the homogeneity of powder blends in tablet production is discussed. It is not straightforward to assess the homogeneity of a powder blend. The reason is partly that in bulk materials......, it is shown how to set up parametric acceptance criteria for the batch that gives a high confidence that future samples with a probability larger than a specified value will pass the USP three-class criteria. Properties and robustness of proposed changes to the USP test for content uniformity are investigated...

  16. Nonparametric inference of network structure and dynamics

    Science.gov (United States)

    Peixoto, Tiago P.

    The network structure of complex systems determines their function and serves as evidence for the evolutionary mechanisms that lie behind them. Despite considerable effort in recent years, it remains an open challenge to formulate general descriptions of the large-scale structure of network systems, and to reliably extract such information from data. Although many approaches have been proposed, few methods attempt to gauge the statistical significance of the uncovered structures, and hence the majority cannot reliably separate actual structure from stochastic fluctuations. Due to the sheer size and high-dimensionality of many networks, this represents a major limitation that prevents meaningful interpretations of the results obtained with such nonstatistical methods. In this talk, I will show how these issues can be tackled in a principled and efficient fashion by formulating appropriate generative models of network structure that can have their parameters inferred from data. By employing a Bayesian description of such models, the inference can be performed in a nonparametric fashion that does not require any a priori knowledge or ad hoc assumptions about the data. I will show how this approach can be used to perform model comparison, and how hierarchical models yield the most appropriate trade-off between model complexity and quality of fit based on the statistical evidence present in the data. I will also show how this general approach can be elegantly extended to networks with edge attributes, that are embedded in latent spaces, and that change in time. The latter is obtained via a fully dynamic generative network model, based on arbitrary-order Markov chains, that can also be inferred in a nonparametric fashion. Throughout the talk I will illustrate the application of the methods with many empirical networks such as the internet at the autonomous systems level, the global airport network, the network of actors and films, social networks, citations among

  17. Identifying User Profiles from Statistical Grouping Methods

    Directory of Open Access Journals (Sweden)

    Francisco Kelsen de Oliveira

    2018-02-01

    This research aimed to group users into subgroups according to their levels of knowledge about technology. Statistical hierarchical and non-hierarchical clustering methods were studied, compared and used to create the subgroups from the similarities in these users' skill levels with technology. The research sample consisted of teachers who answered online questionnaires about their skills in using educationally oriented software and hardware. The statistical grouping methods were applied and showed the possible groupings of the users. Analysis of these groups made it possible to identify the characteristics common to the individuals in each subgroup. It was therefore possible to define two subgroups of users, one with skill in technology and another with little skill. The partial results of the research showed that the two main grouping algorithms agreed with 92% similarity in forming these two groups, confirming the accuracy of the techniques for discriminating among individuals.

  18. Statistical sampling method for releasing decontaminated vehicles

    International Nuclear Information System (INIS)

    Lively, J.W.; Ware, J.A.

    1996-01-01

    Earth moving vehicles (e.g., dump trucks, belly dumps) commonly haul radiologically contaminated materials from a site being remediated to a disposal site. Traditionally, each vehicle must be surveyed before being released. The logistical difficulties of implementing the traditional approach on a large scale demand that an alternative be devised. A statistical method (MIL-STD-105E, "Sampling Procedures and Tables for Inspection by Attributes") for assessing product quality from a continuous process was adapted to the vehicle decontamination process. This method produced a sampling scheme that automatically compensates and accommodates fluctuating batch sizes and changing conditions without the need to modify or rectify the sampling scheme in the field. Vehicles are randomly selected (sampled) upon completion of the decontamination process to be surveyed for residual radioactive surface contamination. The frequency of sampling is based on the expected number of vehicles passing through the decontamination process in a given period and the confidence level desired. This process has been successfully used for 1 year at the former uranium mill site in Monticello, Utah (a CERCLA regulated clean-up site). The method forces improvement in the quality of the decontamination process and results in a lower likelihood that vehicles exceeding the surface contamination standards are offered for survey. Implementation of this statistical sampling method on Monticello Projects has resulted in more efficient processing of vehicles through decontamination and radiological release, saved hundreds of hours of processing time, provided a high level of confidence that release limits are met, and improved the radiological cleanliness of vehicles leaving the controlled site.

  19. Statistical methods for estimating normal blood chemistry ranges and variance in rainbow trout (Salmo gairdneri), Shasta Strain

    Science.gov (United States)

    Wedemeyer, Gary A.; Nelson, Nancy C.

    1975-01-01

    Gaussian and nonparametric (percentile estimate and tolerance interval) statistical methods were used to estimate normal ranges for blood chemistry (bicarbonate, bilirubin, calcium, hematocrit, hemoglobin, magnesium, mean cell hemoglobin concentration, osmolality, inorganic phosphorus, and pH) for juvenile rainbow trout (Salmo gairdneri, Shasta strain) held under defined environmental conditions. The percentile estimate and Gaussian methods gave similar normal ranges, whereas the tolerance interval method gave consistently wider ranges for all blood variables except hemoglobin. If the underlying frequency distribution is unknown, the percentile estimate procedure would be the method of choice.
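
    The contrast between a Gaussian range and a percentile-estimate range can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the 95% coverage, the interpolation rule, and the sample values are all hypothetical and are not taken from the study. The tolerance interval method mentioned in the abstract is not implemented here.

```python
import statistics


def gaussian_range(values, z=1.96):
    """Central range assuming normality: mean +/- z * sample standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return mean - z * sd, mean + z * sd


def percentile_range(values, lower=2.5, upper=97.5):
    """Nonparametric range from empirical percentiles (linear interpolation)."""
    ordered = sorted(values)

    def pct(p):
        pos = (len(ordered) - 1) * p / 100.0
        lo = int(pos)
        hi = min(lo + 1, len(ordered) - 1)
        return ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])

    return pct(lower), pct(upper)


# Made-up readings with one high value; NOT data from the study.
sample = [38, 40, 41, 42, 42, 43, 44, 45, 47, 60]
print("Gaussian range:  ", gaussian_range(sample))
print("Percentile range:", percentile_range(sample))
```

    With a skewed sample such as the one above, the Gaussian range can extend below the smallest observed value, whereas the percentile range stays within the data.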

  20. Statistical Software for State Space Methods

    Directory of Open Access Journals (Sweden)

    Jacques J. F. Commandeur

    2011-05-01

    In this paper we review the state space approach to time series analysis and establish the notation that is adopted in this special volume of the Journal of Statistical Software. We first provide some background on the history of state space methods for the analysis of time series. This is followed by a concise overview of linear Gaussian state space analysis including the modelling framework and appropriate estimation methods. We discuss the important class of unobserved component models which incorporate a trend, a seasonal, a cycle, and fixed explanatory and intervention variables for the univariate and multivariate analysis of time series. We continue the discussion by presenting methods for the computation of different estimates for the unobserved state vector: filtering, prediction, and smoothing. Estimation approaches for the other parameters in the model are also considered. Next, we discuss how the estimation procedures can be used for constructing confidence intervals, detecting outlier observations and structural breaks, and testing model assumptions of residual independence, homoscedasticity, and normality. We then show how ARIMA and ARIMA components models fit in the state space framework to time series analysis. We also provide a basic introduction for non-Gaussian state space models. Finally, we present an overview of the software tools currently available for the analysis of time series with state space methods as they are discussed in the other contributions to this special volume.

  1. Statistics Anxiety and Business Statistics: The International Student

    Science.gov (United States)

    Bell, James A.

    2008-01-01

    Does the international student suffer from statistics anxiety? To investigate this, the Statistics Anxiety Rating Scale (STARS) was administered to sixty-six beginning statistics students, including twelve international students and fifty-four domestic students. Due to the small number of international students, nonparametric methods were used to…

  2. Application of pedagogy reflective in statistical methods course and practicum statistical methods

    Science.gov (United States)

    Julie, Hongki

    2017-08-01

    The courses Elementary Statistics, Statistical Methods, and Statistical Methods Practicum aimed to equip Mathematics Education students with descriptive and inferential statistics. Understanding descriptive and inferential statistics was important for students in the Mathematics Education Department, especially those whose final project involved quantitative research. In quantitative research, students were required to present and describe quantitative data appropriately, to draw conclusions from the data, and to establish relationships between the independent and dependent variables defined in their research. In practice, students working on final projects involving quantitative research often made mistakes when drawing conclusions and when choosing the hypothesis testing procedure. As a result, they reached incorrect conclusions, a serious error in quantitative research. Implementing reflective pedagogy in the teaching and learning process of the Statistical Methods and Statistical Methods Practicum courses yielded the following: 1. Twenty-two students passed the course and one student did not. 2. The highest grade, A, was achieved by 18 students. 3. According to all students, the course developed their critical stance and helped them build care for one another through the learning process. 4. All students agreed that, through the learning process in the course, they could build care for one another.

  3. Single versus mixture Weibull distributions for nonparametric satellite reliability

    International Nuclear Information System (INIS)

    Castet, Jean-Francois; Saleh, Joseph H.

    2010-01-01

    Long recognized as a critical design attribute for space systems, satellite reliability has not yet received the proper attention as limited on-orbit failure data and statistical analyses can be found in the technical literature. To fill this gap, we recently conducted a nonparametric analysis of satellite reliability for 1584 Earth-orbiting satellites launched between January 1990 and October 2008. In this paper, we provide an advanced parametric fit, based on mixture of Weibull distributions, and compare it with the single Weibull distribution model obtained with the Maximum Likelihood Estimation (MLE) method. We demonstrate that both parametric fits are good approximations of the nonparametric satellite reliability, but that the mixture Weibull distribution provides significant accuracy in capturing all the failure trends in the failure data, as evidenced by the analysis of the residuals and their quasi-normal dispersion.
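
    The two parametric forms compared in this abstract can be written down compactly. The sketch below shows the standard single-Weibull reliability (survival) function and a weighted mixture of such functions; the function names and all parameter values are hypothetical placeholders, not the fitted values from the paper, and no fitting (MLE or otherwise) is attempted.

```python
import math


def weibull_reliability(t, shape, scale):
    """Single Weibull survival function R(t) = exp(-(t / scale) ** shape)."""
    return math.exp(-((t / scale) ** shape))


def mixture_reliability(t, components):
    """Weighted sum of Weibull survival functions.

    components: iterable of (weight, shape, scale); weights must sum to 1.
    """
    components = list(components)
    total_weight = sum(w for w, _, _ in components)
    if abs(total_weight - 1.0) > 1e-9:
        raise ValueError("mixture weights must sum to 1")
    return sum(w * weibull_reliability(t, k, lam) for w, k, lam in components)


# Hypothetical parameters chosen only to show the call pattern.
example = [(0.9, 0.4, 5000.0), (0.1, 3.0, 10.0)]
for years in (0.0, 1.0, 5.0, 15.0):
    print(years, round(mixture_reliability(years, example), 4))
```

    A mixture lets one component describe early failures and another describe later wear-out behaviour, which a single shape parameter cannot do at the same time.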

  4. Technical Topic 3.2.2.d Bayesian and Non-Parametric Statistics: Integration of Neural Networks with Bayesian Networks for Data Fusion and Predictive Modeling

    Science.gov (United States)

    2016-05-31

    Distribution Unlimited UU UU UU UU 31-05-2016 15-Apr-2014 14-Jan-2015 Final Report: Technical Topic 3.2.2.d Bayesian and Non-parametric Statistics...of Papers published in non peer-reviewed journals: Final Report: Technical Topic 3.2.2.d Bayesian and Non-parametric Statistics: Integration of Neural...Transfer N/A Number of graduating undergraduates who achieved a 3.5 GPA to 4.0 (4.0 max scale): Number of graduating undergraduates funded by a DoD funded

  5. Statistical Methods and Software for the Analysis of Occupational Exposure Data with Non-detectable Values

    Energy Technology Data Exchange (ETDEWEB)

    Frome, EL

    2005-09-20

    Environmental exposure measurements are, in general, positive and may be subject to left censoring; i.e., the measured value is less than a "detection limit". In occupational monitoring, strategies for assessing workplace exposures typically focus on the mean exposure level or the probability that any measurement exceeds a limit. Parametric methods used to determine acceptable levels of exposure are often based on a two-parameter lognormal distribution. The mean exposure level, an upper percentile, and the exceedance fraction are used to characterize exposure levels, and confidence limits are used to describe the uncertainty in these estimates. Statistical methods for random samples (without non-detects) from the lognormal distribution are well known for each of these situations. In this report, methods for estimating these quantities based on the maximum likelihood method for randomly left censored lognormal data are described and graphical methods are used to evaluate the lognormal assumption. If the lognormal model is in doubt and an alternative distribution for the exposure profile of a similar exposure group is not available, then nonparametric methods for left censored data are used. The mean exposure level, along with the upper confidence limit, is obtained using the product limit estimate, and the upper confidence limit on an upper percentile (i.e., the upper tolerance limit) is obtained using a nonparametric approach. All of these methods are well known, but computational complexity has limited their use in routine data analysis with left censored data. The recent development of the R environment for statistical data analysis and graphics has greatly enhanced the availability of high-quality nonproprietary (open source) software that serves as the basis for implementing the methods in this paper.

  6. Non-parametric correlative uncertainty quantification and sensitivity analysis: Application to a Langmuir bimolecular adsorption model

    Science.gov (United States)

    Feng, Jinchao; Lansford, Joshua; Mironenko, Alexander; Pourkargar, Davood Babaei; Vlachos, Dionisios G.; Katsoulakis, Markos A.

    2018-03-01

    We propose non-parametric methods for both local and global sensitivity analysis of chemical reaction models with correlated parameter dependencies. The developed mathematical and statistical tools are applied to a benchmark Langmuir competitive adsorption model on a close packed platinum surface, whose parameters, estimated from quantum-scale computations, are correlated and are limited in size (small data). The proposed mathematical methodology employs gradient-based methods to compute sensitivity indices. We observe that ranking influential parameters depends critically on whether or not correlations between parameters are taken into account. The impact of uncertainty in the correlation and the necessity of the proposed non-parametric perspective are demonstrated.

  7. Non-parametric correlative uncertainty quantification and sensitivity analysis: Application to a Langmuir bimolecular adsorption model

    Directory of Open Access Journals (Sweden)

    Jinchao Feng

    2018-03-01

    We propose non-parametric methods for both local and global sensitivity analysis of chemical reaction models with correlated parameter dependencies. The developed mathematical and statistical tools are applied to a benchmark Langmuir competitive adsorption model on a close packed platinum surface, whose parameters, estimated from quantum-scale computations, are correlated and are limited in size (small data). The proposed mathematical methodology employs gradient-based methods to compute sensitivity indices. We observe that ranking influential parameters depends critically on whether or not correlations between parameters are taken into account. The impact of uncertainty in the correlation and the necessity of the proposed non-parametric perspective are demonstrated.

  8. A Structural Labor Supply Model with Nonparametric Preferences

    NARCIS (Netherlands)

    van Soest, A.H.O.; Das, J.W.M.; Gong, X.

    2000-01-01

    Nonparametric techniques are usually seen as a statistical device for data description and exploration, and not as a tool for estimating models with a richer economic structure, which are often required for policy analysis. This paper presents an example where nonparametric flexibility can be attained

  9. The Monte Carlo method the method of statistical trials

    CERN Document Server

    Shreider, YuA

    1966-01-01

    The Monte Carlo Method: The Method of Statistical Trials is a systematic account of the fundamental concepts and techniques of the Monte Carlo method, together with its range of applications. These applications include the computation of definite integrals, neutron physics, and the investigation of servicing processes. This volume is comprised of seven chapters and begins with an overview of the basic features of the Monte Carlo method and typical examples of its application to simple problems in computational mathematics. The next chapter examines the computation of multi-dimensio

  10. CADDIS Volume 4. Data Analysis: PECBO Appendix - R Scripts for Non-Parametric Regressions

    Science.gov (United States)

    Script for computing nonparametric regression analysis. Overview of using scripts to infer environmental conditions from biological observations, statistically estimating species-environment relationships, statistical scripts.

  11. On two methods of statistical image analysis

    NARCIS (Netherlands)

    Missimer, J; Knorr, U; Maguire, RP; Herzog, H; Seitz, RJ; Tellman, L; Leenders, K.L.

    1999-01-01

    The computerized brain atlas (CBA) and statistical parametric mapping (SPM) are two procedures for voxel-based statistical evaluation of PET activation studies. Each includes spatial standardization of image volumes, computation of a statistic, and evaluation of its significance. In addition,

  12. Nonparametric Bayesian density estimation on manifolds with applications to planar shapes.

    Science.gov (United States)

    Bhattacharya, Abhishek; Dunson, David B

    2010-12-01

    Statistical analysis on landmark-based shape spaces has diverse applications in morphometrics, medical diagnostics, machine vision and other areas. These shape spaces are non-Euclidean quotient manifolds. To conduct nonparametric inferences, one may define notions of centre and spread on this manifold and work with their estimates. However, it is useful to consider full likelihood-based methods, which allow nonparametric estimation of the probability density. This article proposes a broad class of mixture models constructed using suitable kernels on a general compact metric space and then on the planar shape space in particular. Following a Bayesian approach with a nonparametric prior on the mixing distribution, conditions are obtained under which the Kullback-Leibler property holds, implying large support and weak posterior consistency. Gibbs sampling methods are developed for posterior computation, and the methods are applied to problems in density estimation and classification with shape-based predictors. Simulation studies show improved estimation performance relative to existing approaches.

  13. Statistical methods for astronomical data analysis

    CERN Document Server

    Chattopadhyay, Asis Kumar

    2014-01-01

    This book introduces “Astrostatistics” as a subject in its own right with rewarding examples, including work by the authors with galaxy and Gamma Ray Burst data to engage the reader. This includes a comprehensive blending of Astrophysics and Statistics. The first chapter’s coverage of preliminary concepts and terminologies for astronomical phenomenon will appeal to both Statistics and Astrophysics readers as helpful context. Statistics concepts covered in the book provide a methodological framework. A unique feature is the inclusion of different possible sources of astronomical data, as well as software packages for converting the raw data into appropriate forms for data analysis. Readers can then use the appropriate statistical packages for their particular data analysis needs. The ideas of statistical inference discussed in the book help readers determine how to apply statistical tests. The authors cover different applications of statistical techniques already developed or specifically introduced for ...

  14. Seasonal UK Drought Forecasting using Statistical Methods

    Science.gov (United States)

    Richardson, Doug; Fowler, Hayley; Kilsby, Chris; Serinaldi, Francesco

    2016-04-01

    In the UK drought is a recurrent feature of climate with potentially large impacts on public water supply. Water companies' ability to mitigate the impacts of drought by managing diminishing availability depends on forward planning and it would be extremely valuable to improve forecasts of drought on monthly to seasonal time scales. By focusing on statistical forecasting methods, this research aims to provide techniques that are simpler, faster and computationally cheaper than physically based models. In general, statistical forecasting is done by relating the variable of interest (some hydro-meteorological variable such as rainfall or streamflow, or a drought index) to one or more predictors via some formal dependence. These predictors are generally antecedent values of the response variable or external factors such as teleconnections. A candidate model is Generalised Additive Models for Location, Scale and Shape parameters (GAMLSS). GAMLSS is a very flexible class allowing for more general distribution functions (e.g. highly skewed and/or kurtotic distributions) and the modelling of not just the location parameter but also the scale and shape parameters. Additionally GAMLSS permits the forecasting of an entire distribution, allowing the output to be assessed in probabilistic terms rather than simply the mean and confidence intervals. Exploratory analysis of the relationship between long-memory processes (e.g. large-scale atmospheric circulation patterns, sea surface temperatures and soil moisture content) and drought should result in the identification of suitable predictors to be included in the forecasting model, and further our understanding of the drivers of UK drought.

  15. Quantal Response: Nonparametric Modeling

    Science.gov (United States)

    2017-01-01

    capture the behavior of observed phenomena. Higher-order polynomial and finite-dimensional spline basis models allow for more complicated responses as the...flexibility as these are nonparametric (not constrained to any particular functional form). These should be useful in identifying nonstandard behavior via... deviance ∆ = −2 log(Lreduced/Lfull) is defined in terms of the likelihood function L. For normal error, Lfull = 1, and based on Eq. A-2, we have log

  16. Benchmark of the non-parametric Bayesian deconvolution method implemented in the SINBAD code for X/γ rays spectra processing

    Energy Technology Data Exchange (ETDEWEB)

    Rohée, E. [CEA, LIST, Laboratoire Capteurs et Architectures Electroniques, F-91191 Gif-sur-Yvette (France); Coulon, R., E-mail: romain.coulon@cea.fr [CEA, LIST, Laboratoire Capteurs et Architectures Electroniques, F-91191 Gif-sur-Yvette (France); Carrel, F. [CEA, LIST, Laboratoire Capteurs et Architectures Electroniques, F-91191 Gif-sur-Yvette (France); Dautremer, T.; Barat, E.; Montagu, T. [CEA, LIST, Laboratoire de Modélisation et Simulation des Systèmes, F-91191 Gif-sur-Yvette (France); Normand, S. [CEA, DAM, Le Ponant, DPN/STXN, F-75015 Paris (France); Jammes, C. [CEA, DEN, Cadarache, DER/SPEx/LDCI, F-13108 Saint-Paul-lez-Durance (France)

    2016-11-11

    Radionuclide identification and quantification are a serious concern for many applications, such as in situ monitoring at nuclear facilities, laboratory analysis, special nuclear materials detection, environmental monitoring, and waste measurements. High resolution gamma-ray spectrometry based on high purity germanium diode detectors is the best solution available for isotopic identification. Over the last decades, methods have been developed to improve gamma spectra analysis. However, some difficulties remain in the analysis when full energy peaks are folded together with a high ratio between their amplitudes, and when the Compton background is much larger than the signal of a single peak. In this context, this study deals with the comparison between a conventional analysis based on the “iterative peak fitting deconvolution” method and a “nonparametric Bayesian deconvolution” approach developed by the CEA LIST and implemented into the SINBAD code. The iterative peak fit deconvolution is used in this study as a reference method largely validated by industrial standards to unfold complex spectra from HPGe detectors. Complex cases of spectra are studied from IAEA benchmark protocol tests and with measured spectra. The SINBAD code shows promising deconvolution capabilities compared to the conventional method without any expert parameter fine tuning.

  17. Comparação de duas metodologias de amostragem atmosférica com ferramenta estatística não paramétrica Comparison of two atmospheric sampling methodologies with non-parametric statistical tools

    Directory of Open Access Journals (Sweden)

    Maria João Nunes

    2005-03-01

    In atmospheric aerosol sampling, it is inevitable that the air that carries particles is in motion, as a result of both externally driven wind and the sucking action of the sampler itself. High or low air flow sampling speeds may lead to significant particle size bias. The objective of this work is the validation of measurements enabling the comparison of species concentration from both air flow sampling techniques. The presence of several outliers and increase of residuals with concentration becomes obvious, requiring non-parametric methods, recommended for the handling of data which may not be normally distributed. This way, conversion factors are obtained for each of the various species under study using Kendall regression.
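
    The abstract does not say which variant of "Kendall regression" was used. One common rank-based choice is the Theil-Sen (Kendall-Theil) robust line, whose slope is the median of all pairwise slopes; the sketch below assumes that variant. The function name and the paired concentrations are hypothetical and are not data from the study.

```python
import statistics
from itertools import combinations


def theil_sen(x, y):
    """Robust line fit: slope is the median of all pairwise slopes,
    intercept is the median of the residuals y - slope * x."""
    slopes = [
        (y2 - y1) / (x2 - x1)
        for (x1, y1), (x2, y2) in combinations(zip(x, y), 2)
        if x2 != x1  # skip pairs with identical x to avoid division by zero
    ]
    if not slopes:
        raise ValueError("need at least two points with distinct x values")
    slope = statistics.median(slopes)
    intercept = statistics.median(yi - slope * xi for xi, yi in zip(x, y))
    return slope, intercept


# Made-up paired concentrations from two samplers, with one outlier.
low_flow = [1.0, 2.0, 3.0, 4.0, 5.0]
high_flow = [2.1, 3.9, 6.2, 8.0, 30.0]
print(theil_sen(low_flow, high_flow))
```

    Because the median ignores extreme pairwise slopes, a single outlying measurement shifts the fitted conversion factor far less than it would in ordinary least squares.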

  18. Tips and Tricks for Successful Application of Statistical Methods to Biological Data.

    Science.gov (United States)

    Schlenker, Evelyn

    2016-01-01

    This chapter discusses experimental design and use of statistics to describe characteristics of data (descriptive statistics) and inferential statistics that test the hypothesis posed by the investigator. Inferential statistics, based on probability distributions, depend upon the type and distribution of the data. For data that are continuous, randomly and independently selected, as well as normally distributed more powerful parametric tests such as Student's t test and analysis of variance (ANOVA) can be used. For non-normally distributed or skewed data, transformation of the data (using logarithms) may normalize the data allowing use of parametric tests. Alternatively, with skewed data nonparametric tests can be utilized, some of which rely on data that are ranked prior to statistical analysis. Experimental designs and analyses need to balance between committing type 1 errors (false positives) and type 2 errors (false negatives). For a variety of clinical studies that determine risk or benefit, relative risk ratios (random clinical trials and cohort studies) or odds ratios (case-control studies) are utilized. Although both use 2 × 2 tables, their premise and calculations differ. Finally, special statistical methods are applied to microarray and proteomics data, since the large number of genes or proteins evaluated increase the likelihood of false discoveries. Additional studies in separate samples are used to verify microarray and proteomic data. Examples in this chapter and references are available to help continued investigation of experimental designs and appropriate data analysis.

  19. A comparison of selected parametric and non-parametric imputation methods for estimating forest biomass and basal area

    Science.gov (United States)

    Donald Gagliasso; Susan Hummel; Hailemariam. Temesgen

    2014-01-01

    Various methods have been used to estimate the amount of above ground forest biomass across landscapes and to create biomass maps for specific stands or pixels across ownership or project areas. Without an accurate estimation method, land managers might end up with incorrect biomass estimate maps, which could lead them to make poorer decisions in their future...

  20. The choice of statistical methods for comparisons of dosimetric data in radiotherapy.

    Science.gov (United States)

    Chaikh, Abdulhamid; Giraud, Jean-Yves; Perrin, Emmanuel; Bresciani, Jean-Pierre; Balosso, Jacques

    2014-09-18

    Novel irradiation techniques are continuously introduced in radiotherapy to optimize the accuracy, the security and the clinical outcome of treatments. These changes could raise the question of discontinuity in dosimetric presentation and the subsequent need for practice adjustments in case of significant modifications. This study proposes a comprehensive approach to compare different techniques and tests whether their respective dose calculation algorithms give rise to statistically significant differences in the treatment doses for the patient. Statistical investigation principles are presented in the framework of a clinical example based on 62 fields of radiotherapy for lung cancer. The delivered doses in monitor units were calculated using three different dose calculation methods: the reference method computes the dose without tissue density corrections using the Pencil Beam Convolution (PBC) algorithm, whereas the new methods calculate the dose with tissue density corrections in 1D and 3D using the Modified Batho (MB) method and the Equivalent Tissue Air Ratio (ETAR) method, respectively. The normality of the data and the homogeneity of variance between groups were tested using the Shapiro-Wilk and Levene tests, respectively; non-parametric statistical tests were then performed. Specifically, the dose means estimated by the different calculation methods were compared using Friedman's test and the Wilcoxon signed-rank test. In addition, the correlation between the doses calculated by the three methods was assessed using Spearman's rank and Kendall's rank tests. Friedman's test showed a significant effect of the calculation method on the delivered dose for lung cancer patients (p ...). The Wilcoxon signed-rank test of paired comparisons indicated that the delivered dose was significantly reduced using density-corrected methods as compared to the reference method. Spearman's and Kendall's rank tests indicated a positive correlation between the doses calculated with the different methods
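
    Of the tests named in this abstract, Spearman's rank correlation is the simplest to write out: it is the Pearson correlation of the ranks, with tied values sharing their average rank. The sketch below is a plain-Python illustration; the function names and the monitor-unit values are hypothetical, and no p-value is computed.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values receive the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Made-up monitor units for the same fields under two calculation methods.
reference = [120.0, 135.0, 150.0, 110.0, 160.0]
corrected = [115.0, 128.0, 149.0, 104.0, 151.0]
print(spearman_rho(reference, corrected))
```

    Working on ranks makes the coefficient insensitive to the non-normality that motivated the non-parametric tests in the study.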

  1. Empirical Methods for Detecting Regional Trends and Other Spatial Expressions in Antrim Shale Gas Productivity, with Implications for Improving Resource Projections Using Local Nonparametric Estimation Techniques

    Science.gov (United States)

    Coburn, T.C.; Freeman, P.A.; Attanasi, E.D.

    2012-01-01

    The primary objectives of this research were to (1) investigate empirical methods for establishing regional trends in unconventional gas resources as exhibited by historical production data and (2) determine whether or not incorporating additional knowledge of a regional trend in a suite of previously established local nonparametric resource prediction algorithms influences assessment results. Three different trend detection methods were applied to publicly available production data (well EUR aggregated to 80-acre cells) from the Devonian Antrim Shale gas play in the Michigan Basin. This effort led to the identification of a southeast-northwest trend in cell EUR values across the play that, in a very general sense, conforms to the primary fracture and structural orientations of the province. However, including this trend in the resource prediction algorithms did not lead to improved results. Further analysis indicated the existence of clustering among cell EUR values that likely dampens the contribution of the regional trend. The reason for the clustering, a somewhat unexpected result, is not completely understood, although the geological literature provides some possible explanations. With appropriate data, a better understanding of this clustering phenomenon may lead to important information about the factors and their interactions that control Antrim Shale gas production, which may, in turn, help establish a more general protocol for better estimating resources in this and other shale gas plays. © 2011 International Association for Mathematical Geology (outside the USA).

  2. Non-Parametric Estimation of Correlation Functions

    DEFF Research Database (Denmark)

    Brincker, Rune; Rytter, Anders; Krenk, Steen

    In this paper three methods of non-parametric correlation function estimation are reviewed and evaluated: the direct method, estimation by the Fast Fourier Transform and finally estimation by the Random Decrement technique. The basic ideas of the techniques are reviewed, sources of bias are point...

  3. A simple method for optimising transformation of non-parametric data: an illustration by reference to cortisol assays.

    Science.gov (United States)

    Clark, James E; Osborne, Jason W; Gallagher, Peter; Watson, Stuart

    2016-07-01

    Neuroendocrine data are typically positively skewed and rarely conform to the expectations of a Gaussian distribution. This can be a problem when attempting to analyse results within the framework of the general linear model, which relies on assumptions that residuals in the data are normally distributed. One frequently used method for handling violations of this assumption is to transform variables to bring residuals into closer alignment with assumptions (as residuals are not directly manipulated). This is often attempted through ad hoc traditional transformations such as square root, log and inverse. However, Box and Cox (Box & Cox, ) observed that these are all special cases of power transformations and proposed a more flexible method of transformation for researchers to optimise alignment with assumptions. The goal of this paper is to demonstrate the benefits of the infinitely flexible Box-Cox transformation on neuroendocrine data using syntax in SPSS. When applied to positively skewed data typical of neuroendocrine data, the majority (~2/3) of cases were brought into strict alignment with Gaussian distribution (i.e. a non-significant Shapiro-Wilk test). Those unable to meet this challenge showed substantial improvement in distributional properties. The biggest challenge was distributions with a high ratio of kurtosis to skewness. We discuss how these cases might be handled, and we highlight some of the broader issues associated with transformation. Copyright © 2016 John Wiley & Sons, Ltd.
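
    The paper works in SPSS syntax, which is not reproduced here. As a language-neutral illustration, the Box-Cox family and a simple grid search over its parameter can be sketched in Python. The function names, the grid of candidate values, and the sample data are assumptions made for this sketch; the selection criterion is the usual normal-theory profile log-likelihood, which may differ from the criterion the authors used.

```python
import math


def box_cox(values, lam):
    """Box-Cox power transform; lam = 0 is the natural log. Requires values > 0."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if abs(lam) < 1e-12:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


def profile_log_likelihood(values, lam):
    """Normal-theory profile log-likelihood of lam, up to an additive constant."""
    n = len(values)
    z = box_cox(values, lam)
    mean = sum(z) / n
    var = sum((zi - mean) ** 2 for zi in z) / n
    return -0.5 * n * math.log(var) + (lam - 1.0) * sum(math.log(v) for v in values)


def best_lambda(values, grid=None):
    """Pick the grid value of lam with the highest profile log-likelihood."""
    if grid is None:
        grid = [i / 10 for i in range(-20, 21)]  # -2.0 to 2.0 in steps of 0.1
    return max(grid, key=lambda lam: profile_log_likelihood(values, lam))


# Made-up positively skewed values; NOT cortisol data from the paper.
skewed = [1.2, 1.5, 1.9, 2.4, 3.1, 4.8, 7.5, 12.0, 25.0]
lam = best_lambda(skewed)
print("chosen lambda:", lam)
print("transformed:", [round(v, 3) for v in box_cox(skewed, lam)])
```

    Setting the parameter to 1, 0.5, 0, or -1 gives (up to shift and scale) the untransformed, square-root, log, and inverse cases listed in the abstract, which is why a search over the parameter generalises those ad hoc choices.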

  4. An Investigation of the Variety and Complexity of Statistical Methods Used in Current Internal Medicine Literature.

    Science.gov (United States)

    Narayanan, Roshni; Nugent, Rebecca; Nugent, Kenneth

    2015-10-01

    Accreditation Council for Graduate Medical Education guidelines require internal medicine residents to develop skills in the interpretation of medical literature and to understand the principles of research. A necessary component is the ability to understand the statistical methods used and their results, material that is not an in-depth focus of most medical school curricula and residency programs. Given the breadth and depth of the current medical literature and an increasing emphasis on complex, sophisticated statistical analyses, the statistical foundation and education necessary for residents are uncertain. We reviewed the statistical methods and terms used in 49 articles discussed at the journal club in the Department of Internal Medicine residency program at Texas Tech University between January 1, 2013 and June 30, 2013. We collected information on the study type and on the statistical methods used for summarizing and comparing samples, determining the relations between independent variables and dependent variables, and estimating models. We then identified the typical statistics education level at which each term or method is learned. A total of 14 articles came from the Journal of the American Medical Association Internal Medicine, 11 from the New England Journal of Medicine, 6 from the Annals of Internal Medicine, 5 from the Journal of the American Medical Association, and 13 from other journals. Twenty reported randomized controlled trials. Summary statistics included mean values (39 articles), category counts (38), and medians (28). Group comparisons were based on t tests (14 articles), χ² tests (21), and nonparametric ranking tests (10). The relations between dependent and independent variables were analyzed with simple regression (6 articles), multivariate regression (11), and logistic regression (8). Nine studies reported odds ratios with 95% confidence intervals, and seven analyzed test performance using sensitivity and specificity calculations

  5. Statistical learning methods in high-energy and astrophysics analysis

    Energy Technology Data Exchange (ETDEWEB)

    Zimmermann, J. [Forschungszentrum Juelich GmbH, Zentrallabor fuer Elektronik, 52425 Juelich (Germany) and Max-Planck-Institut fuer Physik, Foehringer Ring 6, 80805 Munich (Germany)]. E-mail: zimmerm@mppmu.mpg.de; Kiesling, C. [Max-Planck-Institut fuer Physik, Foehringer Ring 6, 80805 Munich (Germany)

    2004-11-21

    We discuss several popular statistical learning methods used in high-energy- and astro-physics analysis. After a short motivation for statistical learning we present the most popular algorithms and discuss several examples from current research in particle- and astro-physics. The statistical learning methods are compared with each other and with standard methods for the respective application.

  6. Development of a Research Methods and Statistics Concept Inventory

    Science.gov (United States)

    Veilleux, Jennifer C.; Chapman, Kate M.

    2017-01-01

    Research methods and statistics are core courses in the undergraduate psychology major. To assess learning outcomes, it would be useful to have a measure that assesses research methods and statistical literacy beyond course grades. In two studies, we developed and provided initial validation results for a research methods and statistical knowledge…

  7. Statistical learning methods in high-energy and astrophysics analysis

    International Nuclear Information System (INIS)

    Zimmermann, J.; Kiesling, C.

    2004-01-01

    We discuss several popular statistical learning methods used in high-energy- and astro-physics analysis. After a short motivation for statistical learning we present the most popular algorithms and discuss several examples from current research in particle- and astro-physics. The statistical learning methods are compared with each other and with standard methods for the respective application

  8. Statistical methods for determination of background levels for naturally occuring radionuclides in soil at a RCRA facility

    International Nuclear Information System (INIS)

    Guha, S.; Taylor, J.H.

    1996-01-01

    It is critical that summary statistics on background data, or background levels, be computed based on standardized and defensible statistical methods because background levels are frequently used in subsequent analyses and comparisons performed by separate analysts over time. The final background for naturally occurring radionuclide concentrations in soil at a RCRA facility, and the associated statistical methods used to estimate these concentrations, are presented. The primary objective is to describe, via a case study, the statistical methods used to estimate 95% upper tolerance limits (UTL) on radionuclide background soil data sets. A 95% UTL on background samples can be used as a screening level concentration in the absence of definitive soil cleanup criteria for naturally occurring radionuclides. The statistical methods are based exclusively on EPA guidance. This paper includes an introduction, a discussion of the analytical results for the radionuclides and a detailed description of the statistical analyses leading to the determination of 95% UTLs. Soil concentrations reported are based on validated data. Data sets are categorized as surficial soil (samples collected at depths from zero to one-half foot) and deep soil (samples collected from 3 to 5 feet). These data sets were tested for statistical outliers and underlying distributions were determined by using the chi-squared test for goodness-of-fit. UTLs for the data sets were then computed based on the percentage of non-detects and the appropriate best-fit distribution (lognormal, normal, or non-parametric). For data sets containing greater than approximately 50% nondetects, nonparametric UTLs were computed.
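
    The nonparametric UTL mentioned in this abstract can be illustrated with the standard order-statistic construction. The sketch below is not taken from the paper or from EPA guidance; it assumes continuous data without ties, ignores the handling of nondetects, and uses hypothetical function names. It picks the smallest order statistic whose binomial confidence of covering the requested proportion of the population reaches the target.

```python
from math import comb


def utl_confidence(n, k, coverage):
    """Confidence that the k-th smallest of n observations is at or above
    the `coverage` quantile: P(Binomial(n, coverage) <= k - 1)."""
    return sum(comb(n, i) * coverage**i * (1 - coverage) ** (n - i)
               for i in range(k))


def nonparametric_utl(data, coverage=0.95, confidence=0.95):
    """Return (limit, rank, achieved_confidence) for the smallest order
    statistic that qualifies, or None if the sample is too small."""
    xs = sorted(data)
    n = len(xs)
    for k in range(1, n + 1):
        achieved = utl_confidence(n, k, coverage)
        if achieved >= confidence:
            return xs[k - 1], k, achieved
    return None


# Hypothetical background sample: 59 distinct values.
sample = list(range(1, 60))
print(nonparametric_utl(sample))       # the maximum is selected for n = 59
print(nonparametric_utl(sample[:58]))  # None: 58 points cannot reach 95%/95%
```

    With 95% coverage and 95% confidence, the sample maximum qualifies only once the sample reaches 59 observations, which is why small background data sets often cannot support a nonparametric UTL at that level.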

  9. Statistical models and methods for reliability and survival analysis

    CERN Document Server

    Couallier, Vincent; Huber-Carol, Catherine; Mesbah, Mounir; Limnios, Nikolaos; Gerville-Reache, Leo

    2013-01-01

    Statistical Models and Methods for Reliability and Survival Analysis brings together contributions by specialists in statistical theory as they discuss their applications providing up-to-date developments in methods used in survival analysis, statistical goodness of fit, stochastic processes for system reliability, amongst others. Many of these are related to the work of Professor M. Nikulin in statistics over the past 30 years. The authors gather together various contributions with a broad array of techniques and results, divided into three parts - Statistical Models and Methods, Statistical

  10. BIOMETRIC AUTHENTICATION USING NONPARAMETRIC METHODS

    OpenAIRE

    S V Sheela; K R Radhika

    2010-01-01

    Physiological and behavioral traits are employed to develop biometric authentication systems. The proposed work deals with the authentication of iris and signature based on minimum variance criteria. The iris patterns are preprocessed based on the area of the connected components. The segmented image used for authentication consists of the region with large variations in the gray level values. The image region is split into quadtree components. The components with minimum variance are determine...

  11. Statistical methods of estimating mining costs

    Science.gov (United States)

    Long, K.R.

    2011-01-01

    Until it was defunded in 1995, the U.S. Bureau of Mines maintained a Cost Estimating System (CES) for prefeasibility-type economic evaluations of mineral deposits and estimating costs at producing and non-producing mines. This system had a significant role in mineral resource assessments to estimate costs of developing and operating known mineral deposits and predicted undiscovered deposits. For legal reasons, the U.S. Geological Survey cannot update and maintain CES. Instead, statistical tools are under development to estimate mining costs from basic properties of mineral deposits such as tonnage, grade, mineralogy, depth, strip ratio, distance from infrastructure, rock strength, and work index. The first step was to reestimate "Taylor's Rule" which relates operating rate to available ore tonnage. The second step was to estimate statistical models of capital and operating costs for open pit porphyry copper mines with flotation concentrators. For a sample of 27 proposed porphyry copper projects, capital costs can be estimated from three variables: mineral processing rate, strip ratio, and distance from nearest railroad before mine construction began. Of all the variables tested, operating costs were found to be significantly correlated only with strip ratio.

  12. Statistical time series methods for damage diagnosis in a scale aircraft skeleton structure: loosened bolts damage scenarios

    International Nuclear Information System (INIS)

    Kopsaftopoulos, Fotis P; Fassois, Spilios D

    2011-01-01

    A comparative assessment of several vibration based statistical time series methods for Structural Health Monitoring (SHM) is presented via their application to a scale aircraft skeleton laboratory structure. A brief overview of the methods, which are either scalar or vector type, non-parametric or parametric, and pertain to either the response-only or excitation-response cases, is provided. Damage diagnosis, including both the detection and identification subproblems, is tackled via scalar or vector vibration signals. The methods' effectiveness is assessed via repeated experiments under various damage scenarios, with each scenario corresponding to the loosening of one or more selected bolts. The results of the study confirm the 'global' damage detection capability and effectiveness of statistical time series methods for SHM.

  13. Analysis of relationship between registration performance of point cloud statistical model and generation method of corresponding points

    International Nuclear Information System (INIS)

    Yamaoka, Naoto; Watanabe, Wataru; Hontani, Hidekata

    2010-01-01

    Constructing a statistical point cloud model usually requires calculating corresponding points, and the resulting model differs depending on the method used to calculate them. This article examines how the choice of correspondence method affects a statistical model of a human organ. We validated the performance of each statistical model by registering the surface of an organ in a 3D medical image. We compare two methods for calculating corresponding points. The first, 'Generalized Multi-Dimensional Scaling (GMDS)', determines the corresponding points from the shapes of two curved surfaces. The second, the 'Entropy-based Particle system', chooses corresponding points by statistically analyzing a number of curved surfaces. We construct statistical models with both methods and use them to register the medical image. For the estimation, we use non-parametric belief propagation, which estimates not only the position of the organ but also the probability density of the organ position. We evaluate how the two correspondence methods affect the statistical model through the change in the probability density of each point. (author)

  14. A Bayesian Nonparametric Approach to Factor Analysis

    DEFF Research Database (Denmark)

    Piatek, Rémi; Papaspiliopoulos, Omiros

    2018-01-01

    This paper introduces a new approach for the inference of non-Gaussian factor models based on Bayesian nonparametric methods. It relaxes the usual normality assumption on the latent factors, widely used in practice, which is too restrictive in many settings. Our approach, on the contrary, does no...

  15. A NONPARAMETRIC HYPOTHESIS TEST VIA THE BOOTSTRAP RESAMPLING

    OpenAIRE

    Temel, Tugrul T.

    2001-01-01

    This paper adapts an existing nonparametric hypothesis test to the bootstrap framework. The test uses the nonparametric kernel regression method to estimate a measure of distance between the models stated under the null hypothesis. The bootstrapped version of the test makes it possible to approximate the errors involved in the asymptotic hypothesis test. The paper also develops Mathematica code for the test algorithm.
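
    As a rough illustration of the bootstrap idea only, the sketch below resamples under the null hypothesis to approximate a p-value. It is not the paper's test: the kernel-regression distance is swapped for a plain difference in means between two samples, and the function name, resample count, and seed are assumptions made for the example.

```python
import random
from statistics import mean


def bootstrap_p_value(x, y, n_boot=2000, seed=0):
    """Approximate a two-sided p-value for H0: x and y share one distribution.

    Resamples with replacement from the pooled data, which imposes the null,
    and counts how often the resampled mean difference is at least as large
    as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(mean(x) - mean(y))
    pooled = list(x) + list(y)
    hits = 0
    for _ in range(n_boot):
        bx = rng.choices(pooled, k=len(x))
        by = rng.choices(pooled, k=len(y))
        if abs(mean(bx) - mean(by)) >= observed:
            hits += 1
    # The +1 terms keep the estimate away from an impossible p-value of zero.
    return (hits + 1) / (n_boot + 1)


# Hypothetical data: two clearly separated samples.
print(bootstrap_p_value(list(range(10)), list(range(100, 110))))
```

    Drawing from the pooled sample is one common way to enforce the null; other schemes (for example, resampling residuals) would be closer in spirit to a regression-based test.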

  16. Innovative statistical methods for public health data

    CERN Document Server

    Wilson, Jeffrey

    2015-01-01

    The book brings together experts working in public health and multi-disciplinary areas to present recent issues in statistical methodological development and their applications. This timely book will impact model development and data analyses of public health research across a wide spectrum of analysis. Data and software used in the studies are available for the reader to replicate the models and outcomes. The fifteen chapters range in focus from techniques for dealing with missing data with Bayesian estimation, health surveillance and population definition and implications in applied latent class analysis, to multiple comparison and meta-analysis in public health data. Researchers in biomedical and public health research will find this book to be a useful reference, and it can be used in graduate level classes.

  17. Methods of contemporary mathematical statistical physics

    CERN Document Server

    2009-01-01

    This volume presents a collection of courses introducing the reader to the recent progress with attention being paid to laying solid grounds and developing various basic tools. An introductory chapter on lattice spin models is useful as a background for other lectures of the collection. The topics include new results on phase transitions for gradient lattice models (with introduction to the techniques of the reflection positivity), stochastic geometry reformulation of classical and quantum Ising models, the localization/delocalization transition for directed polymers. A general rigorous framework for theory of metastability is presented and particular applications in the context of Glauber and Kawasaki dynamics of lattice models are discussed. A pedagogical account of several recently discussed topics in nonequilibrium statistical mechanics with an emphasis on general principles is followed by a discussion of kinetically constrained spin models that are reflecting important peculiar features of glassy dynamic...

  18. MSD Recombination Method in Statistical Machine Translation

    Science.gov (United States)

    Gros, Jerneja Žganec

    2008-11-01

    Freely available tools and language resources were used to build the VoiceTRAN statistical machine translation (SMT) system. Various configuration variations of the system are presented and evaluated. The VoiceTRAN SMT system outperformed the baseline conventional rule-based MT system in all English-Slovenian in-domain test setups. To further increase the generalization capability of the translation model for lower-coverage out-of-domain test sentences, an "MSD-recombination" approach was proposed. This approach not only allows a better exploitation of conventional translation models, but also performs well in the more demanding translation direction; that is, into a highly inflectional language. Using this approach in the out-of-domain setup of the English-Slovenian JRC-ACQUIS task, we have achieved significant improvements in translation quality.

  19. Testing for constant nonparametric effects in general semiparametric regression models with interactions

    KAUST Repository

    Wei, Jiawei

    2011-07-01

    We consider the problem of testing for a constant nonparametric effect in a general semi-parametric regression model when there is the potential for interaction between the parametrically and nonparametrically modeled variables. The work was originally motivated by a unique testing problem in genetic epidemiology (Chatterjee, et al., 2006) that involved a typical generalized linear model but with an additional term reminiscent of the Tukey one-degree-of-freedom formulation, and their interest was in testing for main effects of the genetic variables, while gaining statistical power by allowing for a possible interaction between genes and the environment. Later work (Maity, et al., 2009) involved the possibility of modeling the environmental variable nonparametrically, but they focused on whether there was a parametric main effect for the genetic variables. In this paper, we consider the complementary problem, where the interest is in testing for the main effect of the nonparametrically modeled environmental variable. We derive a generalized likelihood ratio test for this hypothesis, show how to implement it, and provide evidence that our method can improve statistical power when compared to standard partially linear models with main effects only. We use the method for the primary purpose of analyzing data from a case-control study of colorectal adenoma.

  20. Statistical methods for categorical data analysis

    CERN Document Server

    Powers, Daniel

    2008-01-01

    This book provides a comprehensive introduction to methods and models for categorical data analysis and their applications in social science research. Companion website also available, at https://webspace.utexas.edu/dpowers/www/

  1. Statistical methods and computing for big data

    Science.gov (United States)

    Wang, Chun; Chen, Ming-Hui; Schifano, Elizabeth; Wu, Jing

    2016-01-01

    Big data are data on a massive scale in terms of volume, intensity, and complexity that exceed the capacity of standard analytic tools. They present opportunities as well as challenges to statisticians. The role of computational statisticians in scientific discovery from big data analyses has been under-recognized even by peer statisticians. This article summarizes recent methodological and software developments in statistics that address the big data challenges. Methodologies are grouped into three classes: subsampling-based, divide and conquer, and online updating for stream data. As a new contribution, the online updating approach is extended to variable selection with commonly used criteria, and their performances are assessed in a simulation study with stream data. Software packages are summarized with focuses on the open source R and R packages, covering recent tools that help break the barriers of computer memory and computing power. Some of the tools are illustrated in a case study with a logistic regression for the chance of airline delay. PMID:27695593
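
    The online-updating class of methods can be illustrated with a deliberately simple case. The sketch below is an assumption-laden example rather than the article's method: it uses simple linear regression with one predictor instead of the logistic regression of the case study, because least squares has exact cumulative sufficient statistics. Each arriving block updates a few running sums and can then be discarded.

```python
class OnlineSimpleRegression:
    """Least-squares fit of y = intercept + slope * x from streamed blocks.

    Only five running totals are stored, so memory use does not grow
    with the length of the stream.
    """

    def __init__(self):
        self.n = 0
        self.sx = self.sy = self.sxx = self.sxy = 0.0

    def update(self, xs, ys):
        """Fold one block of observations into the running totals."""
        for x, y in zip(xs, ys):
            self.n += 1
            self.sx += x
            self.sy += y
            self.sxx += x * x
            self.sxy += x * y

    def coefficients(self):
        """Return (intercept, slope); needs at least two distinct x values."""
        denom = self.n * self.sxx - self.sx ** 2
        slope = (self.n * self.sxy - self.sx * self.sy) / denom
        intercept = (self.sy - slope * self.sx) / self.n
        return intercept, slope


# Hypothetical stream: y = 2 + 3x delivered in two blocks.
model = OnlineSimpleRegression()
model.update([0, 1, 2, 3, 4], [2, 5, 8, 11, 14])
model.update([5, 6, 7, 8, 9], [17, 20, 23, 26, 29])
print(model.coefficients())  # (2.0, 3.0)
```

    The estimate after the last block equals the fit on the full data set, which is the property that makes this style of updating attractive when the data cannot be held in memory at once.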

  2. Statistical methods and computing for big data.

    Science.gov (United States)

    Wang, Chun; Chen, Ming-Hui; Schifano, Elizabeth; Wu, Jing; Yan, Jun

    2016-01-01

    Big data are data on a massive scale in terms of volume, intensity, and complexity that exceed the capacity of standard analytic tools. They present opportunities as well as challenges to statisticians. The role of computational statisticians in scientific discovery from big data analyses has been under-recognized even by peer statisticians. This article summarizes recent methodological and software developments in statistics that address the big data challenges. Methodologies are grouped into three classes: subsampling-based, divide and conquer, and online updating for stream data. As a new contribution, the online updating approach is extended to variable selection with commonly used criteria, and their performances are assessed in a simulation study with stream data. Software packages are summarized with focuses on the open source R and R packages, covering recent tools that help break the barriers of computer memory and computing power. Some of the tools are illustrated in a case study with a logistic regression for the chance of airline delay.

  3. Simple statistical methods for software engineering data and patterns

    CERN Document Server

    Pandian, C Ravindranath

    2015-01-01

    Although there are countless books on statistics, few are dedicated to the application of statistical methods to software engineering. Simple Statistical Methods for Software Engineering: Data and Patterns fills that void. Instead of delving into overly complex statistics, the book details simpler solutions that are just as effective and connect with the intuition of problem solvers.Sharing valuable insights into software engineering problems and solutions, the book not only explains the required statistical methods, but also provides many examples, review questions, and case studies that prov

  4. CATDAT - A program for parametric and nonparametric categorical data analysis user's manual, Version 1.0

    International Nuclear Information System (INIS)

    Peterson, James R.; Haas, Timothy C.; Lee, Danny C.

    2000-01-01

    Natural resource professionals are increasingly required to develop rigorous statistical models that relate environmental data to categorical response data. Recent advances in the statistical and computing sciences have led to the development of sophisticated methods for parametric and nonparametric analysis of data with categorical responses. The statistical software package CATDAT was designed to make some of these relatively new and powerful techniques available to scientists. The CATDAT statistical package includes 4 analytical techniques: generalized logit modeling; binary classification tree; extended K-nearest neighbor classification; and modular neural network.

  5. Cratering statistics on asteroids: Methods and perspectives

    Science.gov (United States)

    Chapman, C.

    2014-07-01

    Crater size-frequency distributions (SFDs) on the surfaces of solid-surfaced bodies in the solar system have provided valuable insights about planetary surface processes and about impactor populations since the first spacecraft images were obtained in the 1960s. They can be used to determine relative age differences between surficial units, to obtain absolute model ages if the impactor flux and scaling laws are understood, to assess various endogenic planetary or asteroidal processes that degrade craters or resurface units, as well as assess changes in impactor populations across the solar system and/or with time. The first asteroid SFDs were measured from Galileo images of Gaspra and Ida (cf., Chapman 2002). Despite the superficial simplicity of these studies, they are fraught with many difficulties, including confusion by secondary and/or endogenic cratering and poorly understood aspects of varying target properties (including regoliths, ejecta blankets, and nearly-zero-g rubble piles), widely varying attributes of impactors, and a host of methodological problems including recognizability of degraded craters, which is affected by illumination angle and by the "personal equations" of analysts. Indeed, controlled studies (Robbins et al. 2014) demonstrate crater-density differences of a factor of two or more between experienced crater counters. These inherent difficulties have been especially apparent in divergent results for Vesta from different members of the Dawn Science Team (cf. Russell et al. 2013). Indeed, they have been exacerbated by misuse of a widely available tool (Craterstats: hrscview.fu-berlin.de/craterstats.html), which incorrectly computes error bars for proper interpretation of cumulative SFDs, resulting in derived model ages specified to three significant figures and interpretations of statistically insignificant kinks. They are further exacerbated, and for other small-body crater SFDs analyzed by the Berlin group, by stubbornly adopting

  6. On Wasserstein Two-Sample Testing and Related Families of Nonparametric Tests

    Directory of Open Access Journals (Sweden)

    Aaditya Ramdas

    2017-01-01

    Nonparametric two-sample or homogeneity testing is a decision theoretic problem that involves identifying differences between two random variables without making parametric assumptions about their underlying distributions. The literature is old and rich, with a wide variety of statistics having been designed and analyzed, both for the unidimensional and the multivariate setting. In this short survey, we focus on test statistics that involve the Wasserstein distance. Using an entropic smoothing of the Wasserstein distance, we connect these to very different tests including multivariate methods involving energy statistics and kernel based maximum mean discrepancy and univariate methods like the Kolmogorov–Smirnov test, probability or quantile (PP/QQ) plots and receiver operating characteristic or ordinal dominance (ROC/ODC) curves. Some observations are implicit in the literature, while others seem to have not been noticed thus far. Given nonparametric two-sample testing’s classical and continued importance, we aim to provide useful connections for theorists and practitioners familiar with one subset of methods but not others.
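
    A minimal univariate example of a Wasserstein-based two-sample test is sketched below. It is not drawn from the survey: it assumes equal sample sizes (so that the 1-Wasserstein distance reduces to the mean gap between sorted values), calibrates the statistic with a permutation test, and uses hypothetical function names and settings.

```python
import random


def wasserstein_1d(x, y):
    """1-Wasserstein distance between two equal-size empirical samples:
    the mean absolute gap between matched order statistics."""
    if len(x) != len(y):
        raise ValueError("this sketch assumes equal sample sizes")
    return sum(abs(a - b) for a, b in zip(sorted(x), sorted(y))) / len(x)


def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Permutation p-value for H0: both samples share one distribution."""
    rng = random.Random(seed)
    observed = wasserstein_1d(x, y)
    pooled = list(x) + list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random relabelling of the pooled observations
        if wasserstein_1d(pooled[:len(x)], pooled[len(x):]) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


print(wasserstein_1d([0, 1, 2], [1, 2, 3]))  # 1.0: a pure shift by one unit
print(permutation_p_value(list(range(10)), list(range(100, 110))))
```

    The permutation calibration makes no distributional assumptions, which keeps the example in the nonparametric setting the survey describes; the entropic smoothing it discusses is not implemented here.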

  7. Mokken scale analysis of mental health and well-being questionnaire item responses: a non-parametric IRT method in empirical research for applied health researchers

    Directory of Open Access Journals (Sweden)

    Stochl Jan

    2012-06-01

    Background: Mokken scaling techniques are a useful tool for researchers who wish to construct unidimensional tests or use questionnaires that comprise multiple binary or polytomous items. The stochastic cumulative scaling model offered by this approach is ideally suited when the intention is to score an underlying latent trait by simple addition of the item response values. In our experience, the Mokken model appears to be less well-known than for example the (related) Rasch model, but is seeing increasing use in contemporary clinical research and public health. Mokken's method is a generalisation of Guttman scaling that can assist in the determination of the dimensionality of tests or scales, and enables consideration of reliability, without reliance on Cronbach's alpha. This paper provides a practical guide to the application and interpretation of this non-parametric item response theory method in empirical research with health and well-being questionnaires. Methods: Scalability of data from 1) a cross-sectional health survey (the Scottish Health Education Population Survey) and 2) a general population birth cohort study (the National Child Development Study) illustrate the method and modeling steps for dichotomous and polytomous items respectively. The questionnaire data analyzed comprise responses to the 12 item General Health Questionnaire, under the binary recoding recommended for screening applications, and the ordinal/polytomous responses to the Warwick-Edinburgh Mental Well-being Scale. Results and conclusions: After an initial analysis example in which we select items by phrasing (six positive versus six negatively worded items) we show that all items from the 12-item General Health Questionnaire (GHQ-12) – when binary scored – were scalable according to the double monotonicity model, in two short scales comprising six items each (Bech’s “well-being” and “distress” clinical scales). An illustration of ordinal item analysis

  8. Statistical methods for handling incomplete data

    CERN Document Server

    Kim, Jae Kwang

    2013-01-01

    "… this book nicely blends the theoretical material and its application through examples, and will be of interest to students and researchers as a textbook or a reference book. Extensive coverage of recent advances in handling missing data provides resources and guidelines for researchers and practitioners in implementing the methods in new settings. … I plan to use this as a textbook for my teaching and highly recommend it."-Biometrics, September 2014

  9. Mokken scale analysis of mental health and well-being questionnaire item responses: a non-parametric IRT method in empirical research for applied health researchers.

    Science.gov (United States)

    Stochl, Jan; Jones, Peter B; Croudace, Tim J

    2012-06-11

    Mokken scaling techniques are a useful tool for researchers who wish to construct unidimensional tests or use questionnaires that comprise multiple binary or polytomous items. The stochastic cumulative scaling model offered by this approach is ideally suited when the intention is to score an underlying latent trait by simple addition of the item response values. In our experience, the Mokken model appears to be less well-known than for example the (related) Rasch model, but is seeing increasing use in contemporary clinical research and public health. Mokken's method is a generalisation of Guttman scaling that can assist in the determination of the dimensionality of tests or scales, and enables consideration of reliability, without reliance on Cronbach's alpha. This paper provides a practical guide to the application and interpretation of this non-parametric item response theory method in empirical research with health and well-being questionnaires. Scalability of data from 1) a cross-sectional health survey (the Scottish Health Education Population Survey) and 2) a general population birth cohort study (the National Child Development Study) illustrate the method and modeling steps for dichotomous and polytomous items respectively. The questionnaire data analyzed comprise responses to the 12 item General Health Questionnaire, under the binary recoding recommended for screening applications, and the ordinal/polytomous responses to the Warwick-Edinburgh Mental Well-being Scale. After an initial analysis example in which we select items by phrasing (six positive versus six negatively worded items) we show that all items from the 12-item General Health Questionnaire (GHQ-12)--when binary scored--were scalable according to the double monotonicity model, in two short scales comprising six items each (Bech's "well-being" and "distress" clinical scales). An illustration of ordinal item analysis confirmed that all 14 positively worded items of the Warwick-Edinburgh Mental
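
    Scalability in Mokken analysis is usually summarised with Loevinger's H coefficient, which the abstract does not spell out. The sketch below is an illustrative assumption, not the authors' code: it handles binary items only, counts Guttman errors (failing the easier item of a pair while passing the harder one), and compares them with the number expected if the items were independent.

```python
from itertools import combinations


def loevinger_h(responses):
    """Scale-level H for binary item responses.

    `responses` is a list of rows, one per respondent, each a list of 0/1
    item scores. H = 1 - observed Guttman errors / expected Guttman errors.
    Assumes no item is answered identically by every respondent.
    """
    n = len(responses)
    n_items = len(responses[0])
    # Item popularity: proportion of respondents scoring 1 on each item.
    p = [sum(row[i] for row in responses) / n for i in range(n_items)]
    observed = expected = 0.0
    for i, j in combinations(range(n_items), 2):
        easy, hard = (i, j) if p[i] >= p[j] else (j, i)
        observed += sum(1 for row in responses
                        if row[easy] == 0 and row[hard] == 1)
        expected += n * (1 - p[easy]) * p[hard]
    return 1.0 - observed / expected


# Hypothetical data: a perfect Guttman pattern has no errors, so H is 1.
perfect = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(loevinger_h(perfect))  # 1.0
```

    Values near 1 indicate responses that follow a cumulative ordering of the items, while values near 0 indicate items that behave as if unrelated.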

  10. Likelihood devices in spatial statistics

    NARCIS (Netherlands)

    Zwet, E.W. van

    1999-01-01

    One of the main themes of this thesis is the application to spatial data of modern semi- and nonparametric methods. Another, closely related theme is maximum likelihood estimation from spatial data. Maximum likelihood estimation is not common practice in spatial statistics. The method of moments

  11. PAFit: A Statistical Method for Measuring Preferential Attachment in Temporal Complex Networks.

    Directory of Open Access Journals (Sweden)

    Thong Pham

    Preferential attachment is a stochastic process that has been proposed to explain certain topological features characteristic of complex networks from diverse domains. The systematic investigation of preferential attachment is an important area of research in network science, not only for the theoretical matter of verifying whether this hypothesized process is operative in real-world networks, but also for the practical insights that follow from knowledge of its functional form. Here we describe a maximum likelihood based estimation method for the measurement of preferential attachment in temporal complex networks. We call the method PAFit, and implement it in an R package of the same name. PAFit constitutes an advance over previous methods primarily because we based it on a nonparametric statistical framework that enables attachment kernel estimation free of any assumptions about its functional form. We show this results in PAFit outperforming the popular methods of Jeong and Newman in Monte Carlo simulations. What is more, we found that the application of PAFit to a publicly available Flickr social network dataset yielded clear evidence for a deviation of the attachment kernel from the popularly assumed log-linear form. Independent of our main work, we provide a correction to a consequential error in Newman's original method which had evidently gone unnoticed since its publication over a decade ago.

  12. Statistical methods and their applications in constructional engineering

    International Nuclear Information System (INIS)

    1977-01-01

    An introduction into the basic terms of statistics is followed by a discussion of elements of the probability theory, customary discrete and continuous distributions, simulation methods, statistical supporting framework dynamics, and a cost-benefit analysis of the methods introduced. (RW) [de

  13. The choice of statistical methods for comparisons of dosimetric data in radiotherapy

    International Nuclear Information System (INIS)

    Chaikh, Abdulhamid; Giraud, Jean-Yves; Perrin, Emmanuel; Bresciani, Jean-Pierre; Balosso, Jacques

    2014-01-01

    Novel irradiation techniques are continuously introduced in radiotherapy to optimize the accuracy, the security and the clinical outcome of treatments. These changes could raise the question of discontinuity in dosimetric presentation and the subsequent need for practice adjustments in case of significant modifications. This study proposes a comprehensive approach to compare different techniques and tests whether their respective dose calculation algorithms give rise to statistically significant differences in the treatment doses for the patient. Statistical investigation principles are presented in the framework of a clinical example based on 62 fields of radiotherapy for lung cancer. The delivered doses in monitor units were calculated using three different dose calculation methods: the reference method calculates the dose without tissue density corrections using the Pencil Beam Convolution (PBC) algorithm, whereas the new methods calculate the dose with tissue density correction in 1D and 3D using the Modified Batho (MB) method and the Equivalent Tissue Air Ratio (ETAR) method, respectively. The normality of the data and the homogeneity of variance between groups were tested using the Shapiro-Wilk and Levene tests, respectively, then non-parametric statistical tests were performed. Specifically, the dose means estimated by the different calculation methods were compared using Friedman’s test and the Wilcoxon signed-rank test. In addition, the correlation between the doses calculated by the three methods was assessed using Spearman’s rank and Kendall’s rank tests. Friedman’s test showed a significant effect of the calculation method on the delivered dose for lung cancer patients (p <0.001). The density correction methods yielded lower doses than PBC, by on average (−5 ± 4.4 SD) for MB and (−4.7 ± 5 SD) for ETAR. Post-hoc Wilcoxon signed-rank test of paired comparisons indicated that the delivered dose was significantly reduced using density
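
    The Friedman step of such an analysis can be sketched as follows. This is an illustration under stated assumptions, not the study's code or data: the dose values are invented, no tie correction is applied to the statistic, and the p-value helper covers only the three-method case, where the chi-squared approximation with 2 degrees of freedom has the closed form exp(-Q/2).

```python
from math import exp


def midranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for m in range(i, j + 1):
            ranks[order[m]] = avg
        i = j + 1
    return ranks


def friedman_statistic(blocks):
    """Friedman Q for rows = blocks (e.g. fields), columns = methods."""
    n, k = len(blocks), len(blocks[0])
    rank_sums = [0.0] * k
    for row in blocks:
        for j, r in enumerate(midranks(row)):
            rank_sums[j] += r
    return (12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums)
            - 3.0 * n * (k + 1))


def friedman_p_value_three_methods(blocks):
    """Q and its approximate p-value, valid only when there are 3 columns."""
    if len(blocks[0]) != 3:
        raise ValueError("closed-form p-value assumes exactly three methods")
    q = friedman_statistic(blocks)
    return q, exp(-q / 2.0)


# Hypothetical monitor units per field for three calculation methods.
doses = [[120, 114, 115], [98, 93, 94], [150, 142, 143],
         [110, 104, 106], [131, 125, 126]]
print(friedman_p_value_three_methods(doses))
```

    A significant Friedman result only says that at least one method differs; pairwise follow-up comparisons, such as the Wilcoxon signed-rank tests named in the abstract, are needed to locate the difference.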

  14. Nonparametric identification of copula structures

    KAUST Repository

    Li, Bo; Genton, Marc G.

    2013-01-01

    We propose a unified framework for testing a variety of assumptions commonly made about the structure of copulas, including symmetry, radial symmetry, joint symmetry, associativity and Archimedeanity, and max-stability. Our test is nonparametric

  15. Online Statistics Labs in MSW Research Methods Courses: Reducing Reluctance toward Statistics

    Science.gov (United States)

    Elliott, William; Choi, Eunhee; Friedline, Terri

    2013-01-01

    This article presents results from an evaluation of an online statistics lab as part of a foundations research methods course for master's-level social work students. The article discusses factors that contribute to an environment in social work that fosters attitudes of reluctance toward learning and teaching statistics in research methods…

  16. Estimation of the limit of detection with a bootstrap-derived standard error by a partly non-parametric approach. Application to HPLC drug assays

    DEFF Research Database (Denmark)

    Linnet, Kristian

    2005-01-01

    Bootstrap, HPLC, limit of blank, limit of detection, non-parametric statistics, type I and II errors...

  17. Smooth semi-nonparametric (SNP) estimation of the cumulative incidence function.

    Science.gov (United States)

    Duc, Anh Nguyen; Wolbers, Marcel

    2017-08-15

    This paper presents a novel approach to estimation of the cumulative incidence function in the presence of competing risks. The underlying statistical model is specified via a mixture factorization of the joint distribution of the event type and the time to the event. The time to event distributions conditional on the event type are modeled using smooth semi-nonparametric densities. One strength of this approach is that it can handle arbitrary censoring and truncation while relying on mild parametric assumptions. A stepwise forward algorithm for model estimation and adaptive selection of smooth semi-nonparametric polynomial degrees is presented, implemented in the statistical software R, evaluated in a sequence of simulation studies, and applied to data from a clinical trial in cryptococcal meningitis. The simulations demonstrate that the proposed method frequently outperforms both parametric and nonparametric alternatives. They also support the use of 'ad hoc' asymptotic inference to derive confidence intervals. An extension to regression modeling is also presented, and its potential and challenges are discussed. © 2017 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.

  18. Parametric and Non-Parametric System Modelling

    DEFF Research Database (Denmark)

    Nielsen, Henrik Aalborg

    1999-01-01

    the focus is on combinations of parametric and non-parametric methods of regression. This combination can be in terms of additive models where e.g. one or more non-parametric term is added to a linear regression model. It can also be in terms of conditional parametric models where the coefficients...... considered. It is shown that adaptive estimation in conditional parametric models can be performed by combining the well known methods of local polynomial regression and recursive least squares with exponential forgetting. The approach used for estimation in conditional parametric models also highlights how...... networks is included. In this paper, neural networks are used for predicting the electricity production of a wind farm. The results are compared with results obtained using an adaptively estimated ARX-model. Finally, two papers on stochastic differential equations are included. In the first paper, among...

  19. Nonparametric Bayes Modeling of Multivariate Categorical Data.

    Science.gov (United States)

    Dunson, David B; Xing, Chuanhua

    2012-01-01

    Modeling of multivariate unordered categorical (nominal) data is a challenging problem, particularly in high dimensions and cases in which one wishes to avoid strong assumptions about the dependence structure. Commonly used approaches rely on the incorporation of latent Gaussian random variables or parametric latent class models. The goal of this article is to develop a nonparametric Bayes approach, which defines a prior with full support on the space of distributions for multiple unordered categorical variables. This support condition ensures that we are not restricting the dependence structure a priori. We show this can be accomplished through a Dirichlet process mixture of product multinomial distributions, which is also a convenient form for posterior computation. Methods for nonparametric testing of violations of independence are proposed, and the methods are applied to model positional dependence within transcription factor binding motifs.

  20. Parametric vs. Nonparametric Regression Modelling within Clinical Decision Support

    Czech Academy of Sciences Publication Activity Database

    Kalina, Jan; Zvárová, Jana

    2017-01-01

    Vol. 5, No. 1 (2017), pp. 21-27 ISSN 1805-8698 R&D Projects: GA ČR GA17-01251S Institutional support: RVO:67985807 Keywords: decision support systems * decision rules * statistical analysis * nonparametric regression Subject RIV: IN - Informatics, Computer Science OECD field: Statistics and probability

  1. The estimation of the measurement results with using statistical methods

    International Nuclear Information System (INIS)

    Velychko, O; Gordiyenko, T

    2015-01-01

    A number of international standards and guides describe statistical methods that are applied to the management, control and improvement of processes for the purpose of analysing technical measurement results. This paper analyses the international standards and guides on statistical methods for estimating measurement results and describes recommendations for applying them in laboratories. To support this analysis, cause-and-effect (Ishikawa) diagrams concerning the application of statistical methods for the estimation of measurement results are constructed.

  2. The estimation of the measurement results with using statistical methods

    Science.gov (United States)

    Velychko, O.; Gordiyenko, T.

    2015-02-01

    A number of international standards and guides describe statistical methods that are applied to the management, control and improvement of processes for the purpose of analysing technical measurement results. This paper analyses the international standards and guides on statistical methods for estimating measurement results and describes recommendations for applying them in laboratories. To support this analysis, cause-and-effect (Ishikawa) diagrams concerning the application of statistical methods for the estimation of measurement results are constructed.

  3. Statistical inference a short course

    CERN Document Server

    Panik, Michael J

    2012-01-01

    A concise, easily accessible introduction to descriptive and inferential techniques. Statistical Inference: A Short Course offers a concise presentation of the essentials of basic statistics for readers seeking to acquire a working knowledge of statistical concepts, measures, and procedures. The author conducts tests on the assumptions of randomness and normality and provides nonparametric methods for cases where parametric approaches might not work. The book also explores how to determine a confidence interval for a population median while also providing coverage of ratio estimation, randomness, and causal

  4. Statistical methods for accurately determining criticality code bias

    International Nuclear Information System (INIS)

    Trumble, E.F.; Kimball, K.D.

    1997-01-01

    A system of statistically treating validation calculations for the purpose of determining computer code bias is provided in this paper. The following statistical treatments are described: weighted regression analysis, lower tolerance limit, lower tolerance band, and lower confidence band. These methods meet the criticality code validation requirements of ANS 8.1. 8 refs., 5 figs., 4 tabs

  5. A nonparametric mixture model for cure rate estimation.

    Science.gov (United States)

    Peng, Y; Dear, K B

    2000-03-01

    Nonparametric methods have attracted less attention than their parametric counterparts for cure rate analysis. In this paper, we study a general nonparametric mixture model. The proportional hazards assumption is employed in modeling the effect of covariates on the failure time of patients who are not cured. The EM algorithm, the marginal likelihood approach, and multiple imputations are employed to estimate parameters of interest in the model. This model extends models and improves estimation methods proposed by other researchers. It also extends Cox's proportional hazards regression model by allowing a proportion of event-free patients and investigating covariate effects on that proportion. The model and its estimation method are investigated by simulations. An application to breast cancer data, including comparisons with previous analyses using a parametric model and an existing nonparametric model by other researchers, confirms the conclusions from the parametric model but not those from the existing nonparametric model.

  6. A functional U-statistic method for association analysis of sequencing data.

    Science.gov (United States)

    Jadhav, Sneha; Tong, Xiaoran; Lu, Qing

    2017-11-01

    Although sequencing studies hold great promise for uncovering novel variants predisposing to human diseases, the high dimensionality of the sequencing data brings tremendous challenges to data analysis. Moreover, for many complex diseases (e.g., psychiatric disorders) multiple related phenotypes are collected. These phenotypes can be different measurements of an underlying disease, or measurements characterizing multiple related diseases for studying common genetic mechanism. Although jointly analyzing these phenotypes could potentially increase the power of identifying disease-associated genes, the different types of phenotypes pose challenges for association analysis. To address these challenges, we propose a nonparametric method, functional U-statistic method (FU), for multivariate analysis of sequencing data. It first constructs smooth functions from individuals' sequencing data, and then tests the association of these functions with multiple phenotypes by using a U-statistic. The method provides a general framework for analyzing various types of phenotypes (e.g., binary and continuous phenotypes) with unknown distributions. Fitting the genetic variants within a gene using a smoothing function also allows us to capture complexities of gene structure (e.g., linkage disequilibrium, LD), which could potentially increase the power of association analysis. Through simulations, we compared our method to the multivariate outcome score test (MOST), and found that our test attained better performance than MOST. In a real data application, we apply our method to the sequencing data from Minnesota Twin Study (MTS) and found potential associations of several nicotine receptor subunit (CHRN) genes, including CHRNB3, associated with nicotine dependence and/or alcohol dependence. © 2017 WILEY PERIODICALS, INC.

  7. Statistics of Monte Carlo methods used in radiation transport calculation

    International Nuclear Information System (INIS)

    Datta, D.

    2009-01-01

    Radiation transport calculation can be carried out by using either deterministic or statistical methods. Radiation transport calculation based on statistical methods is the basic theme of Monte Carlo methods. The aim of this lecture is to describe the fundamental statistics required to build the foundations of the Monte Carlo technique for radiation transport calculation. The lecture note is organized as follows. Section (1) introduces basic Monte Carlo and its classification by field of application; Section (2) describes random sampling methods, a key component of Monte Carlo radiation transport calculation; Section (3) covers the statistical uncertainty of Monte Carlo estimates; and Section (4) briefly describes the importance of variance reduction techniques when sampling particles such as photons or neutrons in the process of radiation transport.
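
    As an illustration of the random sampling and statistical uncertainty topics listed in this record, the following minimal sketch estimates the uncollided transmission probability through a homogeneous slab by sampling exponential free paths, and reports the binomial standard error of the estimate. The cross section, slab thickness, history count and function name are all hypothetical choices made for the example; they do not come from the lecture note.

```python
import math
import random


def transmission_estimate(sigma_t, thickness, n_histories, seed=12345):
    """Monte Carlo estimate of the uncollided transmission through a slab.

    sigma_t     : total macroscopic cross section (1/cm), assumed constant
    thickness   : slab thickness (cm)
    n_histories : number of simulated particle histories
    Returns (estimate, standard_error).
    """
    rng = random.Random(seed)
    transmitted = 0
    for _ in range(n_histories):
        # Distance to first collision follows an exponential distribution.
        free_path = rng.expovariate(sigma_t)
        if free_path > thickness:
            transmitted += 1
    p_hat = transmitted / n_histories
    # Standard error of a sample proportion; shrinks like 1/sqrt(n_histories).
    std_err = math.sqrt(p_hat * (1.0 - p_hat) / n_histories)
    return p_hat, std_err


estimate, std_err = transmission_estimate(sigma_t=1.0, thickness=2.0, n_histories=20000)
exact = math.exp(-1.0 * 2.0)  # analytic uncollided transmission exp(-sigma_t * thickness)
print(f"estimate = {estimate:.4f} +/- {std_err:.4f}, exact = {exact:.4f}")
```

    Because the standard error falls only as the inverse square root of the number of histories, halving it requires four times as many histories, which is the usual motivation for variance reduction.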

  8. Statistical methods for evaluating the attainment of cleanup standards

    Energy Technology Data Exchange (ETDEWEB)

    Gilbert, R.O.; Simpson, J.C.

    1992-12-01

    This document is the third volume in a series of volumes sponsored by the US Environmental Protection Agency (EPA), Statistical Policy Branch, that provide statistical methods for evaluating the attainment of cleanup standards at Superfund sites. Volume 1 (USEPA 1989a) provides sampling designs and tests for evaluating attainment of risk-based standards for soils and solid media. Volume 2 (USEPA 1992) provides designs and tests for evaluating attainment of risk-based standards for groundwater. The purpose of this third volume is to provide statistical procedures for designing sampling programs and conducting statistical tests to determine whether pollution parameters in remediated soils and solid media at Superfund sites attain site-specific reference-based standards. This document is written for individuals who may not have extensive training or experience with statistical methods. The intended audience includes EPA regional remedial project managers, Superfund-site potentially responsible parties, state environmental protection agencies, and contractors for these groups.

  9. Complex Data Modeling and Computationally Intensive Statistical Methods

    CERN Document Server

    Mantovan, Pietro

    2010-01-01

    The last years have seen the advent and development of many devices able to record and store an always increasing amount of complex and high dimensional data; 3D images generated by medical scanners or satellite remote sensing, DNA microarrays, real time financial data, system control datasets. The analysis of this data poses new challenging problems and requires the development of novel statistical models and computational methods, fueling many fascinating and fast growing research areas of modern statistics. The book offers a wide variety of statistical methods and is addressed to statistici

  10. Method for statistical data analysis of multivariate observations

    CERN Document Server

    Gnanadesikan, R

    1997-01-01

    A practical guide for multivariate statistical techniques, now updated and revised. In recent years, innovations in computer technology and statistical methodologies have dramatically altered the landscape of multivariate data analysis. This new edition of Methods for Statistical Data Analysis of Multivariate Observations explores current multivariate concepts and techniques while retaining the same practical focus of its predecessor. It integrates methods and data-based interpretations relevant to multivariate analysis in a way that addresses real-world problems arising in many areas of inte

  11. Statistical limitations in functional neuroimaging. I. Non-inferential methods and statistical models.

    Science.gov (United States)

    Petersson, K M; Nichols, T E; Poline, J B; Holmes, A P

    1999-01-01

    Functional neuroimaging (FNI) provides experimental access to the intact living brain making it possible to study higher cognitive functions in humans. In this review and in a companion paper in this issue, we discuss some common methods used to analyse FNI data. The emphasis in both papers is on assumptions and limitations of the methods reviewed. There are several methods available to analyse FNI data indicating that none is optimal for all purposes. In order to make optimal use of the methods available it is important to know the limits of applicability. For the interpretation of FNI results it is also important to take into account the assumptions, approximations and inherent limitations of the methods used. This paper gives a brief overview over some non-inferential descriptive methods and common statistical models used in FNI. Issues relating to the complex problem of model selection are discussed. In general, proper model selection is a necessary prerequisite for the validity of the subsequent statistical inference. The non-inferential section describes methods that, combined with inspection of parameter estimates and other simple measures, can aid in the process of model selection and verification of assumptions. The section on statistical models covers approaches to global normalization and some aspects of univariate, multivariate, and Bayesian models. Finally, approaches to functional connectivity and effective connectivity are discussed. In the companion paper we review issues related to signal detection and statistical inference. PMID:10466149

  12. Analysis of Statistical Methods Currently used in Toxicology Journals.

    Science.gov (United States)

    Na, Jihye; Yang, Hyeri; Bae, SeungJin; Lim, Kyung-Min

    2014-09-01

    Statistical methods are frequently used in toxicology, yet it is not clear whether the methods employed by the studies are used consistently and conducted based on sound statistical grounds. The purpose of this paper is to describe statistical methods used in top toxicology journals. More specifically, we sampled 30 papers published in 2014 from Toxicology and Applied Pharmacology, Archives of Toxicology, and Toxicological Science and described methodologies used to provide descriptive and inferential statistics. One hundred thirteen endpoints were observed in those 30 papers, and most studies had a sample size of less than 10, with the median and the mode being 6 and 3 & 6, respectively. The mean (105/113, 93%) was predominantly used to measure central tendency, and the standard error of the mean (64/113, 57%) and standard deviation (39/113, 34%) were used to measure dispersion, while few studies justified why the methods were selected. Inferential statistics were frequently conducted (93/113, 82%), with one-way ANOVA being most popular (52/93, 56%), yet few studies conducted either a normality or an equal-variance test. These results suggest that more consistent and appropriate use of statistical methods is necessary, which may enhance the role of toxicology in public health.
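
    The distinction between the two dispersion measures counted in this record can be made concrete with a short sketch. The six measurements are hypothetical (six is the median sample size the survey reports); the standard deviation describes the spread of the observations, while the standard error of the mean describes the uncertainty of the mean and is smaller by a factor of the square root of the sample size.

```python
import math
import statistics

# Hypothetical endpoint measured in n = 6 animals; illustration only.
values = [4.0, 6.0, 5.0, 7.0, 5.0, 3.0]

mean = statistics.mean(values)
sd = statistics.stdev(values)        # sample standard deviation (n - 1 denominator)
sem = sd / math.sqrt(len(values))    # standard error of the mean

print(f"mean = {mean:.2f}, SD = {sd:.3f}, SEM = {sem:.3f}")
```

    Reporting the SEM alone makes small-sample data look tighter than it is, which is one reason a report should state which measure is shown and why.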

  13. Towards an Industrial Application of Statistical Uncertainty Analysis Methods to Multi-physical Modelling and Safety Analyses

    International Nuclear Information System (INIS)

    Zhang, Jinzhao; Segurado, Jacobo; Schneidesch, Christophe

    2013-01-01

    Since the 1980s, Tractebel Engineering (TE) has been developing and applying a multi-physical modelling and safety analyses capability, based on a code package consisting of the best estimate 3D neutronic (PANTHER), system thermal hydraulic (RELAP5), core sub-channel thermal hydraulic (COBRA-3C), and fuel thermal mechanic (FRAPCON/FRAPTRAN) codes. A series of methodologies have been developed to perform and to license the reactor safety analysis and core reload design, based on the deterministic bounding approach. Following the recent trends in research and development as well as in industrial applications, TE has been working since 2010 towards the application of the statistical sensitivity and uncertainty analysis methods to the multi-physical modelling and licensing safety analyses. In this paper, the TE multi-physical modelling and safety analyses capability is first described, followed by the proposed TE best estimate plus statistical uncertainty analysis method (BESUAM). The chosen statistical sensitivity and uncertainty analysis methods (non-parametric order statistic method or bootstrap) and tool (DAKOTA) are then presented, followed by some preliminary results of their applications to FRAPCON/FRAPTRAN simulation of OECD RIA fuel rod codes benchmark and RELAP5/MOD3.3 simulation of THTF tests. (authors)
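
    The non-parametric order statistic method mentioned in this record is commonly associated with Wilks' formula. As a minimal sketch, and assuming the first-order, one-sided form (the largest of n code runs is taken as the upper tolerance limit), the number of runs needed follows from requiring 1 - coverage^n >= confidence. The function name is illustrative and the record does not state which order or sidedness the authors used.

```python
def wilks_sample_size(coverage, confidence):
    """Smallest n such that the sample maximum bounds the `coverage` quantile
    with probability at least `confidence` (first-order, one-sided case)."""
    if not (0.0 < coverage < 1.0 and 0.0 < confidence < 1.0):
        raise ValueError("coverage and confidence must lie strictly between 0 and 1")
    n = 1
    # P(max of n samples >= coverage quantile) = 1 - coverage**n
    while 1.0 - coverage ** n < confidence:
        n += 1
    return n


print(wilks_sample_size(0.95, 0.95))  # number of runs for a 95%/95% one-sided limit
```

    The result does not depend on the distribution of the output variable, which is what makes the approach attractive when each code run is expensive.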

  14. Statistical Methods for Detecting and Modeling General Patterns and Relationships in Lifetime Data

    Energy Technology Data Exchange (ETDEWEB)

    Kvaloey, Jan Terje

    1999-04-01

    In this thesis, the author tries to develop methods of detecting and modeling general patterns and relationships in lifetime data. Tests with power against nonmonotonic trends and nonmonotonic covariate effects are considered, and nonparametric regression methods which allow estimation of fairly general nonlinear relationships are studied. Practical uses of some of the methods are illustrated although in a medical rather than engineering or technological context.

  15. Network structure exploration via Bayesian nonparametric models

    International Nuclear Information System (INIS)

    Chen, Y; Wang, X L; Xiang, X; Tang, B Z; Bu, J Z

    2015-01-01

    Complex networks provide a powerful mathematical representation of complex systems in nature and society. To understand complex networks, it is crucial to explore their internal structures, also called structural regularities. The task of network structure exploration is to determine how many groups there are in a complex network and how to group the nodes of the network. Most existing structure exploration methods need to specify either a group number or a certain type of structure when they are applied to a network. In the real world, however, the group number and the type of structure that a network has are usually unknown in advance. To explore structural regularities in complex networks automatically, without any prior knowledge of the group number or the type of structure, we use Bayesian nonparametric theory to extend a probabilistic mixture model that can handle networks with any type of structure but requires the group number to be specified. We also propose a novel Bayesian nonparametric model, called the Bayesian nonparametric mixture (BNPM) model. Experiments conducted on a large number of networks with different structures show that the BNPM model is able to explore structural regularities in networks automatically with a stable, state-of-the-art performance. (paper)

  16. Nonparametric Mixture Models for Supervised Image Parcellation.

    Science.gov (United States)

    Sabuncu, Mert R; Yeo, B T Thomas; Van Leemput, Koen; Fischl, Bruce; Golland, Polina

    2009-09-01

    We present a nonparametric, probabilistic mixture model for the supervised parcellation of images. The proposed model yields segmentation algorithms conceptually similar to the recently developed label fusion methods, which register a new image with each training image separately. Segmentation is achieved via the fusion of transferred manual labels. We show that in our framework various settings of a model parameter yield algorithms that use image intensity information differently in determining the weight of a training subject during fusion. One particular setting computes a single, global weight per training subject, whereas another setting uses locally varying weights when fusing the training data. The proposed nonparametric parcellation approach capitalizes on recently developed fast and robust pairwise image alignment tools. The use of multiple registrations allows the algorithm to be robust to occasional registration failures. We report experiments on 39 volumetric brain MRI scans with expert manual labels for the white matter, cerebral cortex, ventricles and subcortical structures. The results demonstrate that the proposed nonparametric segmentation framework yields significantly better segmentation than state-of-the-art algorithms.

  17. Assessment of climate change using methods of mathematic statistics and theory of probability

    International Nuclear Information System (INIS)

    Trajanoska, Lidija; Kaevski, Ivancho

    2004-01-01

    In simple terms: 'Climate' is the average of 'weather'. The Earth's weather system is a complex machine composed of coupled sub-systems (ocean, air, land, ice and the biosphere) between which energy is exchanged. The understanding and study of climate change does not rely only on understanding the physics of climate change but is also linked to the following question: 'How can we detect change in a system that is changing all the time of its own volition?' What is even the meaning of 'change' in such a situation? If the concept of 'change' is transformed into the concept of 'significant and long-term change', this re-phrasing allows for a definition in mathematical terms. Significant change in a system becomes a measure of how large an observed change is in terms of the variability one would see under 'normal' conditions. An example is the analysis of yearly air temperature and precipitation, as in this paper. A large amount of data is selected as representing the 'before' case (change) and another set of data is selected as the 'after' case, and then the averages in these two cases are compared. These comparisons take the form of 'hypothesis tests', in which one tests whether the hypothesis that there has been no change can be rejected. Both parametric and nonparametric statistical methods from the theory of mathematical statistics are used. The most indicative variables that show global change are the average, the standard deviation and the probability distribution function of the examined time series. The examined meteorological series are treated as random processes so that mathematical statistics can be applied. (Author)
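
    The 'before' versus 'after' hypothesis test described in this record can be sketched nonparametrically with an exact permutation test on the difference of means. This is one possible nonparametric choice, not necessarily the test the authors used, and the temperature values below are hypothetical. Under the null hypothesis of no change, every way of splitting the pooled years into two groups is equally likely, so the p-value is the fraction of splits whose mean difference is at least as extreme as the observed one.

```python
from itertools import combinations


def exact_permutation_pvalue(before, after):
    """Two-sided exact permutation p-value for a difference in means."""
    pooled = list(before) + list(after)
    n_before = len(before)
    n_after = len(pooled) - n_before
    total = sum(pooled)
    observed = abs(sum(after) / n_after - sum(before) / n_before)
    tol = 1e-9 * max(1.0, observed)  # guard against floating-point round-off
    extreme = 0
    n_splits = 0
    # Enumerate every assignment of n_before observations to the 'before' group.
    for idx in combinations(range(len(pooled)), n_before):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs((total - sum_a) / n_after - sum_a / n_before)
        if diff >= observed - tol:
            extreme += 1
        n_splits += 1
    return extreme / n_splits


# Hypothetical mean annual air temperatures (degrees C); illustration only.
before = [9.1, 9.4, 8.9, 9.0, 9.3]
after = [10.2, 10.5, 10.1, 10.6, 10.4]
p_value = exact_permutation_pvalue(before, after)
print(f"exact two-sided p = {p_value:.4f}")
```

    Full enumeration is only practical for short series; for longer records a random subset of permutations is usually drawn instead. The sketch also ignores serial correlation, which real climate series often show.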

  18. Brief guidelines for methods and statistics in medical research

    CERN Document Server

    Ab Rahman, Jamalludin

    2015-01-01

    This book serves as a practical guide to methods and statistics in medical research. It includes step-by-step instructions on using SPSS software for statistical analysis, as well as relevant examples to help those readers who are new to research in health and medical fields. Simple texts and diagrams are provided to help explain the concepts covered, and print screens for the statistical steps and the SPSS outputs are provided, together with interpretations and examples of how to report on findings. Brief Guidelines for Methods and Statistics in Medical Research offers a valuable quick reference guide for healthcare students and practitioners conducting research in health related fields, written in an accessible style.

  19. Fundamentals of modern statistical methods substantially improving power and accuracy

    CERN Document Server

    Wilcox, Rand R

    2001-01-01

    Conventional statistical methods have a very serious flaw. They routinely miss differences among groups or associations among variables that are detected by more modern techniques, even under very small departures from normality. Hundreds of journal articles have described the reasons standard techniques can be unsatisfactory, but simple, intuitive explanations are generally unavailable. Improved methods have been derived, but they are far from obvious or intuitive based on the training most researchers receive. Situations arise where even highly nonsignificant results become significant when analyzed with more modern methods. Without assuming any prior training in statistics, Part I of this book describes basic statistical principles from a point of view that makes their shortcomings intuitive and easy to understand. The emphasis is on verbal and graphical descriptions of concepts. Part II describes modern methods that address the problems covered in Part I. Using data from actual studies, many examples are include...

  20. Quantitative EEG Applying the Statistical Recognition Pattern Method

    DEFF Research Database (Denmark)

    Engedal, Knut; Snaedal, Jon; Hoegh, Peter

    2015-01-01

    BACKGROUND/AIM: The aim of this study was to examine the discriminatory power of quantitative EEG (qEEG) applying the statistical pattern recognition (SPR) method to separate Alzheimer's disease (AD) patients from elderly individuals without dementia and from other dementia patients. METHODS...

  1. An Overview of Short-term Statistical Forecasting Methods

    DEFF Research Database (Denmark)

    Elias, Russell J.; Montgomery, Douglas C.; Kulahci, Murat

    2006-01-01

    An overview of statistical forecasting methodology is given, focusing on techniques appropriate to short- and medium-term forecasts. Topics include basic definitions and terminology, smoothing methods, ARIMA models, regression methods, dynamic regression models, and transfer functions. Techniques...... for evaluating and monitoring forecast performance are also summarized....
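
    Among the smoothing methods this record lists, simple exponential smoothing is perhaps the most basic. The sketch below assumes the common recursion s_t = alpha * y_t + (1 - alpha) * s_{t-1}, initialised with the first observation, and uses the last smoothed value as the one-step-ahead forecast. The data, the smoothing constant and the function name are hypothetical; the record itself does not specify a particular formulation.

```python
def simple_exponential_smoothing(series, alpha):
    """Return the smoothed series; the last element is the one-step-ahead forecast."""
    if not series:
        raise ValueError("series must not be empty")
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = [series[0]]  # initialise with the first observation
    for y in series[1:]:
        smoothed.append(alpha * y + (1.0 - alpha) * smoothed[-1])
    return smoothed


demand = [10.0, 12.0, 11.0, 13.0]  # hypothetical observations
smoothed = simple_exponential_smoothing(demand, alpha=0.5)
forecast = smoothed[-1]
print(smoothed, forecast)
```

    A larger alpha tracks recent observations more closely, while a smaller alpha gives a smoother, slower-reacting forecast.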

  2. Hierarchical modelling for the environmental sciences statistical methods and applications

    CERN Document Server

    Clark, James S

    2006-01-01

    New statistical tools are changing the way in which scientists analyze and interpret data and models. Hierarchical Bayes and Markov Chain Monte Carlo methods for analysis provide a consistent framework for inference and prediction where information is heterogeneous and uncertain, processes are complicated, and responses depend on scale. Nowhere are these methods more promising than in the environmental sciences.

  3. Zubarev's Nonequilibrium Statistical Operator Method in the Generalized Statistics of Multiparticle Systems

    Science.gov (United States)

    Glushak, P. A.; Markiv, B. B.; Tokarchuk, M. V.

    2018-01-01

    We present a generalization of Zubarev's nonequilibrium statistical operator method based on the principle of maximum Renyi entropy. In the framework of this approach, we obtain transport equations for the basic set of parameters of the reduced description of nonequilibrium processes in a classical system of interacting particles using Liouville equations with fractional derivatives. For a classical system of particles in a medium with a fractal structure, we obtain a non-Markovian diffusion equation with fractional spatial derivatives. For a concrete model of the frequency dependence of a memory function, we obtain a generalized Cattaneo-type diffusion equation with the spatial and temporal fractality taken into account. We present a generalization of nonequilibrium thermofield dynamics in Zubarev's nonequilibrium statistical operator method in the framework of Renyi statistics.

  4. Nonparametric estimation of benchmark doses in environmental risk assessment

    Science.gov (United States)

    Piegorsch, Walter W.; Xiong, Hui; Bhattacharya, Rabi N.; Lin, Lizhen

    2013-01-01

    Summary: An important statistical objective in environmental risk analysis is estimation of minimum exposure levels, called benchmark doses (BMDs), that induce a pre-specified benchmark response in a dose-response experiment. In such settings, representations of the risk are traditionally based on a parametric dose-response model. It is a well-known concern, however, that if the chosen parametric form is misspecified, inaccurate and possibly unsafe low-dose inferences can result. We apply a nonparametric approach for calculating benchmark doses, based on an isotonic regression method for dose-response estimation with quantal-response data (Bhattacharya and Kong, 2007). We determine the large-sample properties of the estimator, develop bootstrap-based confidence limits on the BMDs, and explore the confidence limits’ small-sample properties via a short simulation study. An example from cancer risk assessment illustrates the calculations. PMID:23914133
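
    The isotonic-regression idea in this record can be sketched with the pool-adjacent-violators algorithm applied to observed response proportions. The sketch then locates a benchmark dose by linear interpolation of the fitted curve at a chosen extra risk. The quantal data are hypothetical, and both the extra-risk definition and the linear interpolation are assumptions made for the illustration; the authors' exact estimator and their bootstrap confidence limits are not reproduced here.

```python
def pava(values, weights):
    """Weighted pool-adjacent-violators fit: nondecreasing least-squares fit."""
    blocks = []  # each block holds [weighted mean, total weight, number of points]
    for v, w in zip(values, weights):
        blocks.append([v, w, 1])
        # Merge backwards while the monotonicity constraint is violated.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            wt = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / wt, wt, c1 + c2])
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted


def benchmark_dose(doses, fitted, bmr):
    """Dose at which extra risk (R(d) - R(0)) / (1 - R(0)) equals bmr,
    found by linear interpolation; returns None if the level is never reached."""
    r0 = fitted[0]
    target = r0 + bmr * (1.0 - r0)
    for i in range(1, len(doses)):
        lo, hi = fitted[i - 1], fitted[i]
        if lo < target <= hi:
            frac = (target - lo) / (hi - lo)
            return doses[i - 1] + frac * (doses[i] - doses[i - 1])
    return None


# Hypothetical quantal-response data; illustration only.
doses = [0.0, 1.0, 2.0, 4.0, 8.0]
responders = [1, 3, 2, 8, 14]
group_sizes = [20, 20, 20, 20, 20]

proportions = [r / n for r, n in zip(responders, group_sizes)]
fitted = pava(proportions, group_sizes)
bmd = benchmark_dose(doses, fitted, bmr=0.10)
print(fitted, bmd)
```

    In this example the observed proportions at doses 1 and 2 violate monotonicity, so the fit pools them into a single level before the benchmark dose is read off.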

  5. Descriptive and inferential statistical methods used in burns research.

    Science.gov (United States)

    Al-Benna, Sammy; Al-Ajam, Yazan; Way, Benjamin; Steinstraesser, Lars

    2010-05-01

    Burns research articles utilise a variety of descriptive and inferential methods to present and analyse data. The aim of this study was to determine the descriptive methods (e.g. mean, median, SD, range, etc.) and survey the use of inferential methods (statistical tests) used in articles in the journal Burns. This study defined its population as all original articles published in the journal Burns in 2007. Letters to the editor, brief reports, reviews, and case reports were excluded. Study characteristics, use of descriptive statistics and the number and types of statistical methods employed were evaluated. Of the 51 articles analysed, 11(22%) were randomised controlled trials, 18(35%) were cohort studies, 11(22%) were case control studies and 11(22%) were case series. The study design and objectives were defined in all articles. All articles made use of continuous and descriptive data. Inferential statistics were used in 49(96%) articles. Data dispersion was calculated by standard deviation in 30(59%). Standard error of the mean was quoted in 19(37%). The statistical software product was named in 33(65%). Of the 49 articles that used inferential statistics, the tests were named in 47(96%). The 6 most common tests used (Student's t-test (53%), analysis of variance/co-variance (33%), chi(2) test (27%), Wilcoxon & Mann-Whitney tests (22%), Fisher's exact test (12%)) accounted for the majority (72%) of statistical methods employed. A specified significance level was named in 43(88%) and the exact significance levels were reported in 28(57%). Descriptive analysis and basic statistical techniques account for most of the statistical tests reported. This information should prove useful in deciding which tests should be emphasised in educating burn care professionals. These results highlight the need for burn care professionals to have a sound understanding of basic statistics, which is crucial in interpreting and reporting data. 
    Advice should be sought from professionals

  6. A Monte Carlo Study of the Effect of Item Characteristic Curve Estimation on the Accuracy of Three Person-Fit Statistics

    Science.gov (United States)

    St-Onge, Christina; Valois, Pierre; Abdous, Belkacem; Germain, Stephane

    2009-01-01

    To date, there have been no studies comparing parametric and nonparametric Item Characteristic Curve (ICC) estimation methods on the effectiveness of Person-Fit Statistics (PFS). The primary aim of this study was to determine if the use of ICCs estimated by nonparametric methods would increase the accuracy of item response theory-based PFS for…

  7. Statistical-mechanical entropy by the thin-layer method

    International Nuclear Information System (INIS)

    Feng, He; Kim, Sung Won

    2003-01-01

    G. 't Hooft first studied the statistical-mechanical entropy of a scalar field in a Schwarzschild black hole background by the brick-wall method and hinted that the statistical-mechanical entropy is the statistical origin of the Bekenstein-Hawking entropy of the black hole. However, according to our viewpoint, the statistical-mechanical entropy is only a quantum correction to the Bekenstein-Hawking entropy of the black hole. The brick-wall method based on thermal equilibrium at a large scale cannot be applied to the cases out of equilibrium such as a nonstationary black hole. The statistical-mechanical entropy of a scalar field in a nonstationary black hole background is calculated by the thin-layer method. The condition of local equilibrium near the horizon of the black hole is used as a working postulate and is maintained for a black hole which evaporates slowly enough and whose mass is far greater than the Planck mass. The statistical-mechanical entropy is also proportional to the area of the black hole horizon. The difference from the stationary black hole is that the result relies on a time-dependent cutoff

  8. Academic Training Lecture: Statistical Methods for Particle Physics

    CERN Multimedia

    PH Department

    2012-01-01

    Academic Training Lecture, Regular Programme: 2, 3, 4 and 5 April 2012, from 11:00 to 12:00, Bldg. 222-R-001 (Filtration Plant). Statistical Methods for Particle Physics, by Glen Cowan (Royal Holloway). The series of four lectures will introduce some of the important statistical methods used in Particle Physics, and should be particularly relevant to those involved in the analysis of LHC data. The lectures will include an introduction to statistical tests, parameter estimation, and the application of these tools to searches for new phenomena. Both frequentist and Bayesian methods will be described, with particular emphasis on treatment of systematic uncertainties. The lectures will also cover unfolding, that is, estimation of a distribution in binned form where the variable in question is subject to measurement errors.

  9. Methods library of embedded R functions at Statistics Norway

    Directory of Open Access Journals (Sweden)

    Øyvind Langsrud

    2017-11-01

    Statistics Norway is modernising its production processes. An important element in this work is a library of functions for statistical computations. In principle, the functions in such a methods library can be programmed in several languages. A modernised production environment demands that these functions can be reused for different statistics products, and that they are embedded within a common IT system. The embedding should be done in such a way that the users of the methods do not need to know the underlying programming language. As a proof of concept, Statistics Norway has established a methods library offering a limited number of methods for macro-editing, imputation and confidentiality. This is done within an area of municipal statistics with R as the only programming language. This paper presents the details and experiences from this work. The problem of fitting real-world applications to simple and strict standards is discussed and exemplified by the development of solutions to regression imputation and table suppression.

  10. Application of blended learning in teaching statistical methods

    Directory of Open Access Journals (Sweden)

    Barbara Dębska

    2012-12-01

    The paper presents the application of a hybrid method (blended learning, linking traditional education with on-line education) to teach selected problems of mathematical statistics. This includes the teaching of the application of mathematical statistics to evaluate laboratory experimental results. An on-line statistics course was developed to form an integral part of the module ‘methods of statistical evaluation of experimental results’. The course complies with the principles outlined in the Polish National Framework of Qualifications with respect to the scope of knowledge, skills and competencies that students should have acquired at course completion. The paper presents the structure of the course and the educational content provided through multimedia lessons made accessible on the Moodle platform. Students' test results from courses taught by the traditional method and from courses taught by the hybrid method were compared and discussed to evaluate the effectiveness of the hybrid method relative to that of the traditional method.

  11. Statistical Methods for Particle Physics (4/4)

    CERN Multimedia

    CERN. Geneva

    2012-01-01

    The series of four lectures will introduce some of the important statistical methods used in Particle Physics, and should be particularly relevant to those involved in the analysis of LHC data. The lectures will include an introduction to statistical tests, parameter estimation, and the application of these tools to searches for new phenomena. Both frequentist and Bayesian methods will be described, with particular emphasis on treatment of systematic uncertainties. The lectures will also cover unfolding, that is, estimation of a distribution in binned form where the variable in question is subject to measurement errors.

  12. Statistical Methods for Particle Physics (1/4)

    CERN Multimedia

    CERN. Geneva

    2012-01-01

    The series of four lectures will introduce some of the important statistical methods used in Particle Physics, and should be particularly relevant to those involved in the analysis of LHC data. The lectures will include an introduction to statistical tests, parameter estimation, and the application of these tools to searches for new phenomena. Both frequentist and Bayesian methods will be described, with particular emphasis on treatment of systematic uncertainties. The lectures will also cover unfolding, that is, estimation of a distribution in binned form where the variable in question is subject to measurement errors.

  13. Statistical Methods for Particle Physics (2/4)

    CERN Multimedia

    CERN. Geneva

    2012-01-01

    The series of four lectures will introduce some of the important statistical methods used in Particle Physics, and should be particularly relevant to those involved in the analysis of LHC data. The lectures will include an introduction to statistical tests, parameter estimation, and the application of these tools to searches for new phenomena. Both frequentist and Bayesian methods will be described, with particular emphasis on treatment of systematic uncertainties. The lectures will also cover unfolding, that is, estimation of a distribution in binned form where the variable in question is subject to measurement errors.

  14. Statistical Methods for Particle Physics (3/4)

    CERN Multimedia

    CERN. Geneva

    2012-01-01

    The series of four lectures will introduce some of the important statistical methods used in Particle Physics, and should be particularly relevant to those involved in the analysis of LHC data. The lectures will include an introduction to statistical tests, parameter estimation, and the application of these tools to searches for new phenomena. Both frequentist and Bayesian methods will be described, with particular emphasis on treatment of systematic uncertainties. The lectures will also cover unfolding, that is, estimation of a distribution in binned form where the variable in question is subject to measurement errors.

  15. Glaucoma Monitoring in a Clinical Setting Glaucoma Progression Analysis vs Nonparametric Progression Analysis in the Groningen Longitudinal Glaucoma Study

    NARCIS (Netherlands)

    Wesselink, Christiaan; Heeg, Govert P.; Jansonius, Nomdo M.

    Objective: To compare prospectively 2 perimetric progression detection algorithms for glaucoma, the Early Manifest Glaucoma Trial algorithm (glaucoma progression analysis [GPA]) and a nonparametric algorithm applied to the mean deviation (MD) (nonparametric progression analysis [NPA]). Methods:

  16. Understanding common statistical methods, Part I: descriptive methods, probability, and continuous data.

    Science.gov (United States)

    Skinner, Carl G; Patel, Manish M; Thomas, Jerry D; Miller, Michael A

    2011-01-01

    Statistical methods are pervasive in medical research and general medical literature. Understanding general statistical concepts will enhance our ability to critically appraise the current literature and ultimately improve the delivery of patient care. This article intends to provide an overview of the common statistical methods relevant to medicine.

  17. Methods and statistics for combining motif match scores.

    Science.gov (United States)

    Bailey, T L; Gribskov, M

    1998-01-01

    Position-specific scoring matrices are useful for representing and searching for protein sequence motifs. A sequence family can often be described by a group of one or more motifs, and an effective search must combine the scores for matching a sequence to each of the motifs in the group. We describe three methods for combining match scores and estimating the statistical significance of the combined scores and evaluate the search quality (classification accuracy) and the accuracy of the estimate of statistical significance of each. The three methods are: 1) sum of scores, 2) sum of reduced variates, 3) product of score p-values. We show that method 3) is superior to the other two methods in both regards, and that combining motif scores indeed gives better search accuracy. The MAST sequence homology search algorithm utilizing the product of p-values scoring method is available for interactive use and downloading at URL http://www.sdsc.edu/MEME.
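    As a rough illustration of method 3), the significance of a product of p-values can be computed in closed form when the individual p-values are independent and uniform under the null hypothesis: for n p-values with product p, the probability of a product at least this small is p times the sum of (-ln p)^i / i! for i from 0 to n-1. The sketch below assumes exactly that independence; the function name is hypothetical and this is not the MAST source code.

```python
import math
from typing import Iterable


def product_pvalue(pvalues: Iterable[float]) -> float:
    """Significance of the product of independent, uniform p-values."""
    ps = list(pvalues)
    if not ps or any(not (0.0 < p <= 1.0) for p in ps):
        raise ValueError("need at least one p-value, each in (0, 1]")
    log_prod = sum(math.log(p) for p in ps)  # ln of the product
    x = -log_prod
    term = 1.0   # current term x**i / i!
    total = 1.0  # running sum, starting with the i = 0 term
    for i in range(1, len(ps)):
        term *= x / i
        total += term
    return math.exp(log_prod) * total


if __name__ == "__main__":
    # Three hypothetical motif match p-values for one sequence.
    print(product_pvalue([0.01, 0.2, 0.05]))
```

    With a single p-value the function returns it unchanged, which is a quick sanity check on the formula.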

  18. Nonparametric predictive inference for combining diagnostic tests with parametric copula

    Science.gov (United States)

    Muhammad, Noryanti; Coolen, F. P. A.; Coolen-Maturi, T.

    2017-09-01

    Measuring the accuracy of diagnostic tests is crucial in many application areas including medicine and health care. The Receiver Operating Characteristic (ROC) curve is a popular statistical tool for describing the performance of diagnostic tests. The area under the ROC curve (AUC) is often used as a measure of the overall performance of the diagnostic test. In this paper, we are interested in developing strategies for combining test results in order to increase the diagnostic accuracy. We introduce nonparametric predictive inference (NPI) for combining two diagnostic test results, taking the dependence structure into account using a parametric copula. NPI is a frequentist statistical framework for inference on a future observation based on past data observations. NPI uses lower and upper probabilities to quantify uncertainty and is based on only a few modelling assumptions. The copula is a well-known statistical concept for modelling dependence of random variables. A copula is a joint distribution function whose marginals are all uniformly distributed and it can be used to model the dependence separately from the marginal distributions. In this research, we estimate the copula density using a parametric method, the maximum likelihood estimator (MLE). We investigate the performance of this proposed method via data sets from the literature and discuss results to show how our method performs for different families of copulas. Finally, we briefly outline related challenges and opportunities for future research.
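    For readers unfamiliar with the AUC mentioned above, a minimal sketch of its usual empirical estimate follows: the proportion of (diseased, healthy) pairs in which the diseased subject has the larger test result, with ties counted as one half. This is the standard Mann-Whitney form and not the NPI or copula-based procedure of the paper; the function name and the assumption that larger values indicate disease are illustrative.

```python
from typing import Sequence


def empirical_auc(diseased: Sequence[float], healthy: Sequence[float]) -> float:
    """Empirical AUC: P(diseased result > healthy result), ties count 1/2."""
    if not diseased or not healthy:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for d in diseased:
        for h in healthy:
            if d > h:
                wins += 1.0
            elif d == h:
                wins += 0.5
    return wins / (len(diseased) * len(healthy))


if __name__ == "__main__":
    # Hypothetical test results for two small groups.
    print(empirical_auc([2.1, 3.4, 2.9], [1.0, 2.5, 1.7]))
```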

  19. Advances in Statistical Methods for Substance Abuse Prevention Research

    Science.gov (United States)

    MacKinnon, David P.; Lockwood, Chondra M.

    2010-01-01

    The paper describes advances in statistical methods for prevention research with a particular focus on substance abuse prevention. Standard analysis methods are extended to the typical research designs and characteristics of the data collected in prevention research. Prevention research often includes longitudinal measurement, clustering of data in units such as schools or clinics, missing data, and categorical as well as continuous outcome variables. Statistical methods to handle these features of prevention data are outlined. Developments in mediation, moderation, and implementation analysis allow for the extraction of more detailed information from a prevention study. Advancements in the interpretation of prevention research results include more widespread calculation of effect size and statistical power, the use of confidence intervals as well as hypothesis testing, detailed causal analysis of research findings, and meta-analysis. The increased availability of statistical software has contributed greatly to the use of new methods in prevention research. It is likely that the Internet will continue to stimulate the development and application of new methods. PMID:12940467

  20. Nonparametric factor analysis of time series

    OpenAIRE

    Rodríguez-Poo, Juan M.; Linton, Oliver Bruce

    1998-01-01

    We introduce a nonparametric smoothing procedure for nonparametric factor analysis of multivariate time series. The asymptotic properties of the proposed procedures are derived. We present an application based on the residuals from the Fair macromodel.

  1. Statistical methods of parameter estimation for deterministically chaotic time series

    Science.gov (United States)

    Pisarenko, V. F.; Sornette, D.

    2004-03-01

    We discuss the possibility of applying some standard statistical methods (the least-square method, the maximum likelihood method, and the method of statistical moments for estimation of parameters) to a deterministically chaotic low-dimensional dynamic system (the logistic map) containing an observational noise. A “segmentation fitting” maximum likelihood (ML) method is suggested to estimate the structural parameter of the logistic map along with the initial value x1 considered as an additional unknown parameter. The segmentation fitting method, called “piece-wise” ML, is similar in spirit to, but simpler and less biased than, the previously proposed “multiple shooting” method. Comparisons with different previously proposed techniques on simulated numerical examples give favorable results (at least, for the investigated combinations of sample size N and noise level). Besides, unlike some suggested techniques, our method does not require a priori knowledge of the noise variance. We also clarify the nature of the inherent difficulties in the statistical analysis of deterministically chaotic time series and the status of previously proposed Bayesian approaches. We note the trade-off between the need to use a large number of data points in the ML analysis to decrease the bias (to guarantee consistency of the estimation) and the unstable nature of dynamical trajectories with exponentially fast loss of memory of the initial condition. The method of statistical moments for the estimation of the parameter of the logistic map is discussed. This method appears to be the only one whose consistency for deterministically chaotic time series has so far been proved theoretically (not only numerically).
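    The simplest of the estimators named above, one-step least squares, can be sketched as follows. The sketch assumes the logistic map in the form x(t+1) = r x(t) (1 - x(t)) with Gaussian observational noise added to each point; the symbol r, the function names and the noise model are assumptions made for illustration. It is not the piece-wise ML method of the paper, and with noisy observations this estimator is generally biased because the noise also enters the regressor.

```python
import random
from typing import List, Sequence


def logistic_series(r: float, x1: float, n: int) -> List[float]:
    """Noise-free trajectory x(t+1) = r * x(t) * (1 - x(t)) of length n."""
    xs = [x1]
    for _ in range(n - 1):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


def add_noise(xs: Sequence[float], sigma: float, seed: int = 0) -> List[float]:
    """Observed series: each point plus independent N(0, sigma^2) noise."""
    rng = random.Random(seed)
    return [x + rng.gauss(0.0, sigma) for x in xs]


def least_squares_r(ys: Sequence[float]) -> float:
    """Minimise sum (y(t+1) - r * y(t) * (1 - y(t)))^2 over r.

    The minimiser has the closed form sum(y(t+1) * g(t)) / sum(g(t)^2),
    where g(t) = y(t) * (1 - y(t)).
    """
    num = 0.0
    den = 0.0
    for y_now, y_next in zip(ys, ys[1:]):
        g = y_now * (1.0 - y_now)
        num += y_next * g
        den += g * g
    return num / den


if __name__ == "__main__":
    clean = logistic_series(r=3.8, x1=0.3, n=500)
    print("noise-free estimate:", least_squares_r(clean))
    print("noisy estimate:     ", least_squares_r(add_noise(clean, sigma=0.02)))
```

    On noise-free data the estimate recovers r up to rounding, which makes the effect of adding noise easy to see by comparison.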

  2. Statistical methods with applications to demography and life insurance

    CERN Document Server

    Khmaladze, Estáte V

    2013-01-01

    Suitable for statisticians, mathematicians, actuaries, and students interested in the problems of insurance and analysis of lifetimes, Statistical Methods with Applications to Demography and Life Insurance presents contemporary statistical techniques for analyzing life distributions and life insurance problems. It not only contains traditional material but also incorporates new problems and techniques not discussed in existing actuarial literature. The book mainly focuses on the analysis of an individual life and describes statistical methods based on empirical and related processes. Coverage ranges from analyzing the tails of distributions of lifetimes to modeling population dynamics with migrations. To help readers understand the technical points, the text covers topics such as the Stieltjes, Wiener, and Itô integrals. It also introduces other themes of interest in demography, including mixtures of distributions, analysis of longevity and extreme value theory, and the age structure of a population. In addi...

  3. Adaptive nonparametric Bayesian inference using location-scale mixture priors

    NARCIS (Netherlands)

    Jonge, de R.; Zanten, van J.H.

    2010-01-01

    We study location-scale mixture priors for nonparametric statistical problems, including multivariate regression, density estimation and classification. We show that a rate-adaptive procedure can be obtained if the prior is properly constructed. In particular, we show that adaptation is achieved if

  4. Landslide Susceptibility Statistical Methods: A Critical and Systematic Literature Review

    Science.gov (United States)

    Mihir, Monika; Malamud, Bruce; Rossi, Mauro; Reichenbach, Paola; Ardizzone, Francesca

    2014-05-01

    Landslide susceptibility assessment, the subject of this systematic review, is aimed at understanding the spatial probability of slope failures under a set of geomorphological and environmental conditions. It is estimated that about 375 landslides that occur globally each year are fatal, with around 4600 people killed per year. Past studies have brought out the increasing cost of landslide damages, which can primarily be attributed to human occupation and increased human activities in vulnerable environments. To evaluate and reduce landslide risk, many scientists have made an effort to efficiently map landslide susceptibility using different statistical methods. In this paper, we present a critical and systematic review of the landslide susceptibility literature in terms of the different statistical methods used. For each of a broad set of studies reviewed we note: (i) study geographic region and areal extent, (ii) landslide types, (iii) inventory type and temporal period covered, (iv) mapping technique, (v) thematic variables used, (vi) statistical models, (vii) assessment of model skill, (viii) uncertainty assessment methods, (ix) validation methods. We then pull out broad trends within our review of landslide susceptibility, particularly regarding the statistical methods. We found that the most common statistical methods used in the study of landslide susceptibility include logistic regression, artificial neural network, discriminant analysis and weight of evidence. Although most of the studies we reviewed assessed the model skill, very few assessed model uncertainty. In terms of geographic extent, the largest number of landslide susceptibility zonations were in Turkey, Korea, Spain, Italy and Malaysia. However, there are also many landslides and fatalities in other localities, particularly India, China, Philippines, Nepal and Indonesia, Guatemala, and Pakistan, where there are far fewer landslide susceptibility studies available in the peer-review literature. This

  5. The application of statistical methods to assess economic assets

    Directory of Open Access Journals (Sweden)

    D. V. Dianov

    2017-01-01

    The article is devoted to consideration and evaluation of machinery, equipment and special equipment, methodological aspects of the use of standards for assessment of buildings and structures in current prices, the valuation of residential, specialized houses, office premises, assessment and reassessment of existing and inactive military assets, and the application of statistical methods to obtain the relevant cost estimates. The objective of the scientific article is to consider possible application of statistical tools in the valuation of the assets composing the core group of elements of national wealth: the fixed assets. Firstly, capital tangible assets constitute the material base for the creation of new value, products and non-financial services. The accumulated gain in tangible assets of a capital nature is a part of the gross domestic product, and from its volume and specific weight in the composition of GDP we can judge the scope of reproductive processes in the country. Based on the methodological materials of the state statistics bodies of the Russian Federation and the principles of the theory of statistics, which describe methods of statistical analysis such as indices, average values and regression, a methodical approach is structured for the application of statistical tools to obtain value estimates of property, plant and equipment with significant accumulated depreciation. Until now, the use of statistical methodology in the practice of economic assessment of assets has been only fragmentary. This applies both to Federal Legislation (Federal law № 135 «On valuation activities in the Russian Federation» dated 16.07.1998 in edition 05.07.2016) and to the methodological documents and regulations of valuation activities, in particular, the valuation activities’ standards. A particular problem is the use of a digital database of Rosstat (Federal State Statistics Service), as to the specific fixed assets the comparison should be carried

  6. Experiential Approach to Teaching Statistics and Research Methods ...

    African Journals Online (AJOL)

    Statistics and research methods are among the more demanding topics for students of education to master at both the undergraduate and postgraduate levels. It is our conviction that teaching these topics should be combined with real practical experiences. We discuss an experiential teaching/ learning approach that ...

  7. Application of statistical methods at copper wire manufacturing

    Directory of Open Access Journals (Sweden)

    Z. Hajduová

    2009-01-01

    Six Sigma is a method of management that strives for near perfection. The Six Sigma methodology uses data and rigorous statistical analysis to identify defects in a process or product, reduce variability and achieve as close to zero defects as possible. The paper presents the basic information on this methodology.

  8. Statistical and Machine Learning forecasting methods: Concerns and ways forward.

    Science.gov (United States)

    Makridakis, Spyros; Spiliotis, Evangelos; Assimakopoulos, Vassilios

    2018-01-01

    Machine Learning (ML) methods have been proposed in the academic literature as alternatives to statistical ones for time series forecasting. Yet, scant evidence is available about their relative performance in terms of accuracy and computational requirements. The purpose of this paper is to evaluate such performance across multiple forecasting horizons using a large subset of 1045 monthly time series used in the M3 Competition. After comparing the post-sample accuracy of popular ML methods with that of eight traditional statistical ones, we found that the former are dominated across both accuracy measures used and for all forecasting horizons examined. Moreover, we observed that their computational requirements are considerably greater than those of statistical methods. The paper discusses the results, explains why the accuracy of ML models is below that of statistical ones and proposes some possible ways forward. The empirical results found in our research stress the need for objective and unbiased ways to test the performance of forecasting methods that can be achieved through sizable and open competitions allowing meaningful comparisons and definite conclusions.

  9. Computerized statistical analysis with bootstrap method in nuclear medicine

    International Nuclear Information System (INIS)

    Zoccarato, O.; Sardina, M.; Zatta, G.; De Agostini, A.; Barbesti, S.; Mana, O.; Tarolo, G.L.

    1988-01-01

    Statistical analysis of data samples involves some hypotheses about the features of the data themselves. The accuracy of these hypotheses can influence the results of statistical inference. Among the new methods of computer-aided statistical analysis, the bootstrap method appears to be one of the most powerful, thanks to its ability to reproduce many artificial samples starting from a single original sample and because it works without hypotheses about the data distribution. The authors applied the bootstrap method to two typical situations in a Nuclear Medicine Department. The determination of the normal range of serum ferritin, as assessed by radioimmunoassay and defined by the mean value ±2 standard deviations, starting from a small experimental sample, shows an unacceptable lower limit (ferritin plasmatic levels below zero). By contrast, the results obtained by elaborating 5000 bootstrap samples give an interval of values (10.95 ng/ml - 72.87 ng/ml) corresponding to the normal ranges commonly reported. Moreover, the authors applied the bootstrap method in evaluating the possible error associated with the correlation coefficient determined between left ventricular ejection fraction (LVEF) values obtained by first pass radionuclide angiocardiography with 99mTc and 195mAu. The results obtained indicate a high degree of statistical correlation and give the range of r² values to be considered acceptable for this type of study
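    The resampling step described above can be sketched in a few lines. The example draws bootstrap samples with replacement from one original sample, evaluates a statistic on each, and reports a percentile interval of the results. The exact procedure the authors used to obtain their ferritin range is not given in the abstract, so the percentile rule, the function name and the data values below are illustrative assumptions only.

```python
import random
from typing import Callable, List, Sequence, Tuple


def bootstrap_interval(sample: Sequence[float],
                       statistic: Callable[[List[float]], float],
                       n_boot: int = 5000,
                       alpha: float = 0.05,
                       seed: int = 0) -> Tuple[float, float]:
    """Percentile bootstrap interval for `statistic` at level 1 - alpha."""
    rng = random.Random(seed)
    n = len(sample)
    replicates = sorted(
        # Resample n points with replacement and evaluate the statistic.
        statistic([rng.choice(sample) for _ in range(n)])
        for _ in range(n_boot)
    )
    lo_idx = max(0, int((alpha / 2.0) * n_boot))
    hi_idx = min(n_boot - 1, int((1.0 - alpha / 2.0) * n_boot) - 1)
    hi_idx = max(hi_idx, lo_idx)
    return replicates[lo_idx], replicates[hi_idx]


def mean(values: List[float]) -> float:
    return sum(values) / len(values)


if __name__ == "__main__":
    # Hypothetical small sample of measurements (not data from the paper).
    data = [12.0, 18.5, 22.0, 25.3, 31.0, 36.8, 41.2, 47.5, 55.0, 68.9]
    print(bootstrap_interval(data, mean))
```

    Any statistic that maps a list of values to a number can be passed in place of `mean`, for example a correlation coefficient computed on resampled pairs.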

  10. Statistical and Machine Learning forecasting methods: Concerns and ways forward

    Science.gov (United States)

    Makridakis, Spyros; Assimakopoulos, Vassilios

    2018-01-01

    Machine Learning (ML) methods have been proposed in the academic literature as alternatives to statistical ones for time series forecasting. Yet, scant evidence is available about their relative performance in terms of accuracy and computational requirements. The purpose of this paper is to evaluate such performance across multiple forecasting horizons using a large subset of 1045 monthly time series used in the M3 Competition. After comparing the post-sample accuracy of popular ML methods with that of eight traditional statistical ones, we found that the former are dominated across both accuracy measures used and for all forecasting horizons examined. Moreover, we observed that their computational requirements are considerably greater than those of statistical methods. The paper discusses the results, explains why the accuracy of ML models is below that of statistical ones and proposes some possible ways forward. The empirical results found in our research stress the need for objective and unbiased ways to test the performance of forecasting methods that can be achieved through sizable and open competitions allowing meaningful comparisons and definite conclusions. PMID:29584784

  11. Illinois' Forests, 2005: Statistics, Methods, and Quality Assurance

    Science.gov (United States)

    Susan J. Crocker; Charles J. Barnett; Mark A. Hatfield

    2013-01-01

    The first full annual inventory of Illinois' forests was completed in 2005. This report contains 1) descriptive information on methods, statistics, and quality assurance of data collection, 2) a glossary of terms, 3) tables that summarize quality assurance, and 4) a core set of tabular estimates for a variety of forest resources. A detailed analysis of inventory...

  12. Kansas's forests, 2005: statistics, methods, and quality assurance

    Science.gov (United States)

    Patrick D. Miles; W. Keith Moser; Charles J. Barnett

    2011-01-01

    The first full annual inventory of Kansas's forests was completed in 2005 after 8,868 plots were selected and 468 forested plots were visited and measured. This report includes detailed information on forest inventory methods and data quality estimates. Important resource statistics are included in the tables. A detailed analysis of Kansas inventory is presented...

  13. South Dakota's forests, 2005: statistics, methods, and quality assurance

    Science.gov (United States)

    Patrick D. Miles; Ronald J. Piva; Charles J. Barnett

    2011-01-01

    The first full annual inventory of South Dakota's forests was completed in 2005 after 8,302 plots were selected and 325 forested plots were visited and measured. This report includes detailed information on forest inventory methods and data quality estimates. Important resource statistics are included in the tables. A detailed analysis of the South Dakota...

  14. Nebraska's forests, 2005: statistics, methods, and quality assurance

    Science.gov (United States)

    Patrick D. Miles; Dacia M. Meneguzzo; Charles J. Barnett

    2011-01-01

    The first full annual inventory of Nebraska's forests was completed in 2005 after 8,335 plots were selected and 274 forested plots were visited and measured. This report includes detailed information on forest inventory methods, and data quality estimates. Tables of various important resource statistics are presented. Detailed analysis of the inventory data are...

  15. North Dakota's forests, 2005: statistics, methods, and quality assurance

    Science.gov (United States)

    Patrick D. Miles; David E. Haugen; Charles J. Barnett

    2011-01-01

    The first full annual inventory of North Dakota's forests was completed in 2005 after 7,622 plots were selected and 164 forested plots were visited and measured. This report includes detailed information on forest inventory methods and data quality estimates. Important resource statistics are included in the tables. A detailed analysis of the North Dakota...

  16. Peer-Assisted Learning in Research Methods and Statistics

    Science.gov (United States)

    Stone, Anna; Meade, Claire; Watling, Rosamond

    2012-01-01

    Feedback from students on a Level 1 Research Methods and Statistics module, studied as a core part of a BSc Psychology programme, highlighted demand for additional tutorials to help them to understand basic concepts. Students in their final year of study commonly request work experience to enhance their employability. All students on the Level 1…

  17. A statistical method for 2D facial landmarking

    NARCIS (Netherlands)

    Dibeklioğlu, H.; Salah, A.A.; Gevers, T.

    2012-01-01

    Many facial-analysis approaches rely on robust and accurate automatic facial landmarking to correctly function. In this paper, we describe a statistical method for automatic facial-landmark localization. Our landmarking relies on a parsimonious mixture model of Gabor wavelet features, computed in

  18. Investigating salt frost scaling by using statistical methods

    DEFF Research Database (Denmark)

    Hasholt, Marianne Tange; Clemmensen, Line Katrine Harder

    2010-01-01

    A large data set comprising data for 118 concrete mixes on mix design, air void structure, and the outcome of freeze/thaw testing according to SS 13 72 44 has been analysed by use of statistical methods. The results show that with regard to mix composition, the most important parameter...

  19. A Bayesian statistical method for particle identification in shower counters

    International Nuclear Information System (INIS)

    Takashimizu, N.; Kimura, A.; Shibata, A.; Sasaki, T.

    2004-01-01

    We report an attempt to identify particles using a Bayesian statistical method. We have developed the mathematical model and software for this purpose. We tried to identify electrons and charged pions in shower counters using this method. We designed an ideal shower counter and studied the efficiency of identification using Monte Carlo simulation based on Geant4. Without any other information, e.g. the charges of particles given by tracking detectors, we achieved 95% identification of both particles.

  20. Quantum statistical Monte Carlo methods and applications to spin systems

    International Nuclear Information System (INIS)

    Suzuki, M.

    1986-01-01

    A short review is given concerning the quantum statistical Monte Carlo method based on the equivalence theorem that d-dimensional quantum systems are mapped onto (d+1)-dimensional classical systems. The convergence property of this approximate transformation is discussed in detail. Some applications of this general approach to quantum spin systems are reviewed. A new Monte Carlo method, the "thermo field Monte Carlo method," is presented, which is an extension of the projection Monte Carlo method at zero temperature to finite temperatures.

  1. Thermodynamics, Gibbs Method and Statistical Physics of Electron Gases

    CERN Document Server

    Askerov, Bahram M

    2010-01-01

    This book deals with theoretical thermodynamics and the statistical physics of electron and particle gases. While treating the laws of thermodynamics from both classical and quantum theoretical viewpoints, it posits that the basis of the statistical theory of macroscopic properties of a system is the microcanonical distribution of isolated systems, from which all canonical distributions stem. To calculate the free energy, the Gibbs method is applied to ideal and non-ideal gases, and also to a crystalline solid. Considerable attention is paid to the Fermi-Dirac and Bose-Einstein quantum statistics and its application to different quantum gases, and electron gas in both metals and semiconductors is considered in a nonequilibrium state. A separate chapter treats the statistical theory of thermodynamic properties of an electron gas in a quantizing magnetic field.

  2. Application of statistical method for FBR plant transient computation

    International Nuclear Information System (INIS)

    Kikuchi, Norihiro; Mochizuki, Hiroyasu

    2014-01-01

    Highlights: • A statistical method with a large trial number up to 10,000 is applied to the plant system analysis. • A turbine trip test conducted at the “Monju” reactor is selected as a plant transient. • A reduction method of trial numbers is discussed. • The result with reduced trial number can express the base regions of the computed distribution. -- Abstract: It is obvious that design tolerances, errors included in operation, and statistical errors in empirical correlations affect the transient behavior. The purpose of the present study is to apply the above-mentioned statistical errors to a plant system computation in order to evaluate the statistical distribution contained in the transient evolution. The selected computation case is the turbine trip test conducted at 40% electric power of the prototype fast reactor “Monju”. All of the heat transport systems of “Monju” are modeled with the NETFLOW++ system code, which has been validated using the plant transient tests of the experimental fast reactor Joyo and of “Monju”. The effects of parameters on upper plenum temperature are confirmed by sensitivity analyses, and dominant parameters are chosen. The statistical errors are applied to each computation deck by using a pseudorandom number and the Monte Carlo method. The dSFMT (Double precision SIMD-oriented Fast Mersenne Twister), a developed version of the Mersenne Twister (MT), is adopted as the pseudorandom number generator. In the present study, uniform random numbers are generated by dSFMT, and these random numbers are transformed to the normal distribution by the Box–Muller method. Ten thousand different computations are performed at once. In every computation case, the steady calculation is performed for 12,000 s, and the transient calculation is performed for 4000 s. For the purpose of the present statistical computation, it is important that the base regions of distribution functions be calculated precisely. A large number of
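
    The uniform-to-normal step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: Python's standard `random` module uses the original Mersenne Twister (MT19937) rather than dSFMT, and the helper `perturb`, the nominal value 100.0 and the 2% relative error are hypothetical.

```python
import math
import random
import statistics


def box_muller(u1, u2):
    """Turn two independent uniforms (u1 in (0, 1], u2 in [0, 1)) into two independent N(0, 1) draws."""
    radius = math.sqrt(-2.0 * math.log(u1))
    angle = 2.0 * math.pi * u2
    return radius * math.cos(angle), radius * math.sin(angle)


def perturb(nominal, rel_sigma, rng):
    """Hypothetical helper: scale a nominal input by a normally distributed relative error."""
    # 1.0 - rng.random() lies in (0, 1], so log() is never called on zero.
    z, _ = box_muller(1.0 - rng.random(), rng.random())
    return nominal * (1.0 + rel_sigma * z)


rng = random.Random(12345)  # CPython's Mersenne Twister (MT19937), not dSFMT
# One perturbed input value per trial, 10,000 trials as in the abstract.
trials = [perturb(100.0, 0.02, rng) for _ in range(10_000)]
print(statistics.mean(trials), statistics.stdev(trials))
```

    Only one of the two normal draws is used per call here for simplicity; a production sampler would normally keep the second one.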

  3. Exact nonparametric confidence bands for the survivor function.

    Science.gov (United States)

    Matthews, David

    2013-10-12

    A method to produce exact simultaneous confidence bands for the empirical cumulative distribution function that was first described by Owen, and subsequently corrected by Jager and Wellner, is the starting point for deriving exact nonparametric confidence bands for the survivor function of any positive random variable. We invert a nonparametric likelihood test of uniformity, constructed from the Kaplan-Meier estimator of the survivor function, to obtain simultaneous lower and upper bands for the function of interest with specified global confidence level. The method involves calculating a null distribution and associated critical value for each observed sample configuration. However, Noe recursions and the Van Wijngaarden-Decker-Brent root-finding algorithm provide the necessary tools for efficient computation of these exact bounds. Various aspects of the effect of right censoring on these exact bands are investigated, using as illustrations two observational studies of survival experience among non-Hodgkin's lymphoma patients and a much larger group of subjects with advanced lung cancer enrolled in trials within the North Central Cancer Treatment Group. Monte Carlo simulations confirm the merits of the proposed method of deriving simultaneous interval estimates of the survivor function across the entire range of the observed sample. This research was supported by the Natural Sciences and Engineering Research Council (NSERC) of Canada. It was begun while the author was visiting the Department of Statistics, University of Auckland, and completed during a subsequent sojourn at the Medical Research Council Biostatistics Unit in Cambridge. The support of both institutions, in addition to that of NSERC and the University of Waterloo, is greatly appreciated.

  4. A method of statistical analysis in the field of sports science when assumptions of parametric tests are not violated

    Directory of Open Access Journals (Sweden)

    Elżbieta Sandurska

    2016-12-01

    Introduction: Application of statistical software typically does not require extensive statistical knowledge, making it easy to perform even complex analyses. Consequently, test selection criteria and important assumptions may be easily overlooked or given insufficient consideration. In such cases, the results are likely to lead to wrong conclusions. Aim: To discuss issues related to assumption violations in the case of Student's t-test and one-way ANOVA, two parametric tests frequently used in the field of sports science, and to recommend solutions. Description of the state of knowledge: Student's t-test and ANOVA are parametric tests, and therefore some of the assumptions that need to be satisfied include normal distribution of the data and homogeneity of variances in groups. If the assumptions are violated, the original design of the test is impaired, and the test may then be compromised, giving spurious results. A simple method to normalize the data and to stabilize the variance is to use transformations. If such an approach fails, a good alternative to consider is a nonparametric test, such as the Mann-Whitney, Kruskal-Wallis or Wilcoxon signed-rank test. Summary: Thorough verification of the assumptions of parametric tests allows for correct selection of statistical tools, which is the basis of well-grounded statistical analysis. With a few simple rules, testing patterns in the data characteristic for the study of sports science comes down to a straightforward procedure.
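
    As an illustration of the nonparametric alternative named in this abstract, the Mann-Whitney U statistic for two independent groups can be computed directly from its definition. The sketch below is not taken from the paper: the sprint times are hypothetical, and the z score uses the plain large-sample approximation without tie or continuity correction.

```python
import math


def mann_whitney_u(x, y):
    """U for sample x: number of (xi, yj) pairs with xi > yj, ties counting 1/2."""
    u = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u += 1.0
            elif xi == yj:
                u += 0.5
    return u


def normal_approx_z(u, n, m):
    """Large-sample z score for U (no tie or continuity correction)."""
    mean = n * m / 2.0
    sd = math.sqrt(n * m * (n + m + 1) / 12.0)
    return (u - mean) / sd


# Hypothetical sprint times (s) for two training groups.
group_a = [11.2, 11.5, 11.9, 12.1, 12.4]
group_b = [11.8, 12.3, 12.6, 12.9, 13.0]
u_a = mann_whitney_u(group_a, group_b)
z_a = normal_approx_z(u_a, len(group_a), len(group_b))
print(u_a, z_a)
```

    For samples this small, exact tables or a statistics library would normally be preferred over the normal approximation.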

  5. Statistical concepts a second course

    CERN Document Server

    Lomax, Richard G

    2012-01-01

    Statistical Concepts consists of the last 9 chapters of An Introduction to Statistical Concepts, 3rd ed. Designed for the second course in statistics, it is one of the few texts that focuses just on intermediate statistics. The book highlights how statistics work and what they mean to better prepare students to analyze their own data and interpret SPSS and research results. As such it offers more coverage of non-parametric procedures used when standard assumptions are violated since these methods are more frequently encountered when working with real data. Determining appropriate sample sizes

  6. Statistical methods for assessing agreement between continuous measurements

    DEFF Research Database (Denmark)

    Sokolowski, Ineta; Hansen, Rikke Pilegaard; Vedsted, Peter

    Background: Clinical research often involves study of agreement amongst observers. Agreement can be measured in different ways, and one can obtain quite different values depending on which method one uses. Objective: We review the approaches that have been discussed to assess the agreement between continuous measures and discuss their strengths and weaknesses. Different methods are illustrated using actual data from the 'Delay in diagnosis of cancer in general practice' project in Aarhus, Denmark. Subjects and Methods: We use weighted kappa-statistic, intraclass correlation coefficient (ICC), concordance coefficient, Bland-Altman limits of agreement and percentage of agreement to assess the agreement between patient reported delay and doctor reported delay in diagnosis of cancer in general practice. Key messages: The correct statistical approach is not obvious. Many studies give the product...
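
    Of the measures listed in this abstract, the Bland-Altman limits of agreement are the simplest to compute: the mean of the paired differences plus or minus 1.96 standard deviations. A minimal sketch follows; the delay values (in days) are hypothetical and are not data from the Aarhus project.

```python
import statistics


def bland_altman_limits(a, b):
    """Return (bias, lower limit, upper limit) for paired measurements a and b."""
    if len(a) != len(b):
        raise ValueError("a and b must contain the same number of paired values")
    diffs = [ai - bi for ai, bi in zip(a, b)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical delays in days reported by patients and by doctors.
patient_delay = [10, 12, 14, 16]
doctor_delay = [9, 12, 13, 16]
bias, lower, upper = bland_altman_limits(patient_delay, doctor_delay)
print(bias, lower, upper)
```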

  7. Statistical disclosure control for microdata methods and applications in R

    CERN Document Server

    Templ, Matthias

    2017-01-01

    This book on statistical disclosure control presents the theory, applications and software implementation of the traditional approach to (micro)data anonymization, including data perturbation methods, disclosure risk, data utility, information loss and methods for simulating synthetic data. Introducing readers to the R packages sdcMicro and simPop, the book also features numerous examples and exercises with solutions, as well as case studies with real-world data, accompanied by the underlying R code to allow readers to reproduce all results. The demand for and volume of data from surveys, registers or other sources containing sensitive information on persons or enterprises have increased significantly over the last several years. At the same time, privacy protection principles and regulations have imposed restrictions on the access and use of individual data. Proper and secure microdata dissemination calls for the application of statistical disclosure control methods to the data before release. This book is in...

  8. Applied statistical methods in agriculture, health and life sciences

    CERN Document Server

    Lawal, Bayo

    2014-01-01

    This textbook teaches crucial statistical methods to answer research questions using a unique range of statistical software programs, including MINITAB and R. This textbook is developed for undergraduate students in agriculture, nursing, biology and biomedical research. Graduate students will also find it to be a useful way to refresh their statistics skills and to reference software options. The unique combination of examples is approached using MINITAB and R for their individual strengths. Subjects covered include among others data description, probability distributions, experimental design, regression analysis, randomized design and biological assay. Unlike other biostatistics textbooks, this text also includes outliers, influential observations in regression and an introduction to survival analysis. Material is taken from the author's extensive teaching and research in Africa, USA and the UK. Sample problems, references and electronic supplementary material accompany each chapter.

  9. Identification of mine waters by statistical multivariate methods

    Energy Technology Data Exchange (ETDEWEB)

    Mali, N [IGGG, Ljubljana (Slovenia)

    1992-01-01

    Three water-bearing aquifers are present in the Velenje lignite mine. The aquifer waters have differing chemical composition; a geochemical water analysis can therefore determine the source of mine water influx. Mine water samples from different locations in the mine were analyzed; the results of chemical content and of electric conductivity of mine water were statistically processed by means of MICROGAS, SPSS-X and IN STATPAC computer programs, which apply three multivariate statistical methods (discriminant, cluster and factor analysis). Reliability of calculated values was determined with the Kolmogorov and Smirnov tests. It is concluded that laboratory analysis of single water samples can produce measurement errors, but statistical processing of water sample data can identify origin and movement of mine water. 15 refs.

  10. Nonequilibrium Statistical Operator Method and Generalized Kinetic Equations

    Science.gov (United States)

    Kuzemsky, A. L.

    2018-01-01

    We consider some principal problems of nonequilibrium statistical thermodynamics in the framework of the Zubarev nonequilibrium statistical operator approach. We present a brief comparative analysis of some approaches to describing irreversible processes based on the concept of nonequilibrium Gibbs ensembles and their applicability to describing nonequilibrium processes. We discuss the derivation of generalized kinetic equations for a system in a heat bath. We obtain and analyze a damped Schrödinger-type equation for a dynamical system in a heat bath. We study the dynamical behavior of a particle in a medium taking the dissipation effects into account. We consider the scattering problem for neutrons in a nonequilibrium medium and derive a generalized Van Hove formula. We show that the nonequilibrium statistical operator method is an effective, convenient tool for describing irreversible processes in condensed matter.

  11. Non-parametric smoothing of experimental data

    International Nuclear Information System (INIS)

    Kuketayev, A.T.; Pen'kov, F.M.

    2007-01-01

    Full text: Rapid processing of experimental data samples in nuclear physics often requires differentiation in order to find extrema. Therefore, even at the preliminary stage of data analysis, a range of noise reduction methods are used to smooth experimental data. There are many non-parametric smoothing techniques: interval averages, moving averages, exponential smoothing, etc. Nevertheless, it is more common to use a priori information about the behavior of the experimental curve in order to construct smoothing schemes based on least squares techniques. The advantage of the latter methodology is that the area under the curve can be preserved, which is equivalent to conservation of the total count rate. The disadvantages of this approach include the lack of a priori information. For example, very often the sums of peaks not resolved by a detector are replaced with one peak during the processing of data, introducing uncontrolled errors in the determination of the physical quantities. The problem is solvable only by having experienced personnel, whose skills are much greater than the challenge. We propose a set of non-parametric techniques, which allows the use of any additional information on the nature of the experimental dependence. The method is based on the construction of a functional, which includes both experimental data and a priori information. The minimum of this functional is attained on a non-parametric smoothed curve. Euler-Lagrange differential equations are constructed for these curves; then their solutions are obtained analytically or numerically. The proposed approach allows for automated processing of nuclear physics data, eliminating the need for highly skilled laboratory personnel. The proposed approach also makes it possible to obtain smoothing curves within a given confidence interval, e.g. according to the χ² distribution. This approach is applicable when constructing smooth solutions of ill-posed problems, in particular when solving
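
    Two of the classical smoothers named at the start of this abstract, the moving average and exponential smoothing, can be sketched in a few lines. This illustrates only those baseline techniques, not the functional-minimization method the authors propose; the window handling at the edges and the sample counts are assumptions made for the example.

```python
def moving_average(y, window):
    """Centered moving average; the window shrinks near the edges (an assumed convention)."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(y)):
        lo = max(0, i - half)
        hi = min(len(y), i + half + 1)
        smoothed.append(sum(y[lo:hi]) / (hi - lo))
    return smoothed


def exponential_smoothing(y, alpha):
    """Simple exponential smoothing: s[0] = y[0], s[t] = alpha*y[t] + (1 - alpha)*s[t-1]."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    if not y:
        return []
    smoothed = [float(y[0])]
    for value in y[1:]:
        smoothed.append(alpha * value + (1.0 - alpha) * smoothed[-1])
    return smoothed


# Hypothetical detector counts per channel.
counts = [1, 2, 3, 4, 5]
print(moving_average(counts, 3))
print(exponential_smoothing(counts, 0.5))
```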

  12. Convergence in energy consumption per capita across the US states, 1970–2013: An exploration through selected parametric and non-parametric methods

    International Nuclear Information System (INIS)

    Mohammadi, Hassan; Ram, Rati

    2017-01-01

    Noting the paucity of studies of convergence in energy consumption across the US states, and the usefulness of a study that shares the spirit of the enormous research on convergence in energy-related variables in cross-country contexts, this paper explores convergence in per-capita energy consumption across the US states over the 44-year period 1970–2013. Several well-known parametric and non-parametric approaches are explored partly to shed light on the substantive question and partly to provide a comparative methodological perspective on these approaches. Several statements summarize the outcome of our explorations. First, the widely-used Barro-type regressions do not indicate beta-convergence during the entire period or any of several sub-periods. Second, lack of sigma-convergence is also noted in terms of standard deviation of logarithms and coefficient of variation which do not show a decline between 1970 and 2013, but show slight upward trends. Third, kernel density function plots indicate some flattening of the distribution which is consistent with the results from sigma-convergence scenario. Fourth, intra-distribution mobility (“gamma convergence”) in terms of an index of rank concordance suggests a slow decline in the index. Fifth, the general impression from several types of panel and time-series unit-root tests is that of non-stationarity of the series and thus the lack of stochastic convergence during the period. Sixth, therefore, the overall impression seems to be that of the lack of convergence across states in per-capita energy consumption. The present interstate inequality in per-capita energy consumption may, therefore, reflect variations in structural factors and might not be expected to diminish.
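
    The two sigma-convergence measures mentioned in this abstract, the standard deviation of logarithms and the coefficient of variation, can be computed per year and compared over time. The sketch below uses hypothetical per-capita consumption figures and the population form of the standard deviation, which is an assumption; the paper's exact definitions are not given here.

```python
import math
import statistics


def sigma_measures(values):
    """Return (standard deviation of logs, coefficient of variation) for one cross-section."""
    logs = [math.log(v) for v in values]
    sd_log = statistics.pstdev(logs)
    cv = statistics.pstdev(values) / statistics.mean(values)
    return sd_log, cv


# Hypothetical per-capita energy consumption for three states in two years.
early_year = [100.0, 200.0, 300.0]
late_year = [150.0, 200.0, 250.0]
sd_log_early, cv_early = sigma_measures(early_year)
sd_log_late, cv_late = sigma_measures(late_year)
# A decline in both measures over time would indicate sigma-convergence.
print(sd_log_early, sd_log_late, cv_early, cv_late)
```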

  13. Identifying Reflectors in Seismic Images via Statistic and Syntactic Methods

    Directory of Open Access Journals (Sweden)

    Carlos A. Perez

    2010-04-01

    In geologic interpretation of seismic reflection data, accurate identification of reflectors is the foremost step to ensure proper subsurface structural definition. Reflector information, along with other data sets, is a key factor to predict the presence of hydrocarbons. In this work, mathematical and pattern recognition theory was adapted to design two statistical and two syntactic algorithms which constitute a tool in semiautomatic reflector identification. The interpretive power of these four schemes was evaluated in terms of prediction accuracy and computational speed. Among these, the semblance method was confirmed to render the greatest accuracy and speed. Syntactic methods offer an interesting alternative due to their inherently structural search method.

  14. On the Choice of Difference Sequence in a Unified Framework for Variance Estimation in Nonparametric Regression

    KAUST Repository

    Dai, Wenlin; Tong, Tiejun; Zhu, Lixing

    2017-01-01

    Difference-based methods do not require estimating the mean function in nonparametric regression and are therefore popular in practice. In this paper, we propose a unified framework for variance estimation that combines the linear regression method with the higher-order difference estimators systematically. The unified framework has greatly enriched the existing literature on variance estimation that includes most existing estimators as special cases. More importantly, the unified framework has also provided a smart way to solve the challenging difference sequence selection problem that has remained a long-standing controversial issue in nonparametric regression for several decades. Using both theory and simulations, we recommend using the ordinary difference sequence in the unified framework, no matter if the sample size is small or if the signal-to-noise ratio is large. Finally, to cater for the demands of the application, we have developed a unified R package, named VarED, that integrates the existing difference-based estimators and the unified estimators in nonparametric regression and have made it freely available in the R statistical program http://cran.r-project.org/web/packages/.
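
    To make the idea of a difference-based estimator concrete, the sketch below implements the classical first-order estimator, which averages squared successive differences and divides by two. It is the simplest member of the family and is shown only as an illustration; it is not the unified framework proposed in the paper, nor the VarED package. The simulated mean function and noise level are assumptions.

```python
import math
import random


def first_order_difference_variance(y):
    """Estimate the noise variance as sum((y[i+1] - y[i])**2) / (2 * (n - 1))."""
    n = len(y)
    if n < 2:
        raise ValueError("need at least two observations")
    total = sum((y[i + 1] - y[i]) ** 2 for i in range(n - 1))
    return total / (2.0 * (n - 1))


# Simulated data: smooth mean function plus Gaussian noise with variance 0.25.
rng = random.Random(2017)
n = 2000
y = [math.sin(2.0 * math.pi * i / n) + rng.gauss(0.0, 0.5) for i in range(n)]
estimate = first_order_difference_variance(y)
print(estimate)
```

    Because neighbouring design points share almost the same mean, differencing removes the mean function without ever estimating it, which is the property the abstract highlights.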

  15. On the Choice of Difference Sequence in a Unified Framework for Variance Estimation in Nonparametric Regression

    KAUST Repository

    Dai, Wenlin

    2017-09-01

    Difference-based methods do not require estimating the mean function in nonparametric regression and are therefore popular in practice. In this paper, we propose a unified framework for variance estimation that combines the linear regression method with the higher-order difference estimators systematically. The unified framework has greatly enriched the existing literature on variance estimation that includes most existing estimators as special cases. More importantly, the unified framework has also provided a smart way to solve the challenging difference sequence selection problem that has remained a long-standing controversial issue in nonparametric regression for several decades. Using both theory and simulations, we recommend using the ordinary difference sequence in the unified framework, no matter if the sample size is small or if the signal-to-noise ratio is large. Finally, to cater for the demands of the application, we have developed a unified R package, named VarED, that integrates the existing difference-based estimators and the unified estimators in nonparametric regression and have made it freely available in the R statistical program http://cran.r-project.org/web/packages/.

  16. New Graphical Methods and Test Statistics for Testing Composite Normality

    Directory of Open Access Journals (Sweden)

    Marc S. Paolella

    2015-07-01

    Several graphical methods for testing univariate composite normality from an i.i.d. sample are presented. They are endowed with correct simultaneous error bounds and yield size-correct tests. As all are based on the empirical CDF, they are also consistent for all alternatives. For one test, called the modified stabilized probability test, or MSP, a highly simplified computational method is derived, which delivers the test statistic and also a highly accurate p-value approximation, essentially instantaneously. The MSP test is demonstrated to have higher power against asymmetric alternatives than the well-known and powerful Jarque-Bera test. A further size-correct test, based on combining two test statistics, is shown to have yet higher power. The methodology employed is fully general and can be applied to any i.i.d. univariate continuous distribution setting.

  17. Statistical learning modeling method for space debris photometric measurement

    Science.gov (United States)

    Sun, Wenjing; Sun, Jinqiu; Zhang, Yanning; Li, Haisen

    2016-03-01

    Photometric measurement is an important way to identify space debris, but present methods of photometric measurement place many constraints on the star image and need complex image processing. To address these problems, a statistical learning modeling method for space debris photometric measurement is proposed based on the global consistency of the star image, and the statistical information of star images is used to eliminate measurement noise. First, the known stars on the star image are divided into training stars and testing stars. Then, the training stars are used in a least-squares fit to construct the photometric measurement model, and the testing stars are used to calculate the measurement accuracy of the photometric measurement model. Experimental results show that the accuracy of the proposed photometric measurement model is about 0.1 magnitudes.

  18. Statistic method of research reactors maximum permissible power calculation

    International Nuclear Information System (INIS)

    Grosheva, N.A.; Kirsanov, G.A.; Konoplev, K.A.; Chmshkyan, D.V.

    1998-01-01

    The technique for calculating the maximum permissible power of a research reactor, at which the probability of a thermal-process accident does not exceed a specified value, is presented. A statistical method is used for the calculations. The determining function related to reactor safety is assumed to be a known function of the reactor power and of many statistically independent quantities, which include the reactor process parameters, the geometrical characteristics of the reactor core and fuel elements, and random factors connected with specific features of the reactor. Heat flux density or temperature is taken as the limiting factor. The software implementation of the method is briefly described. The results of calculating the PIK reactor margin coefficients for different probabilities of a thermal-process accident are considered as an example. It is shown that the probability of an accident with fuel element melting in the hot zone is lower than 10⁻⁸ per year at the reactor rated power [ru]

  19. Applied systems ecology: models, data, and statistical methods

    Energy Technology Data Exchange (ETDEWEB)

    Eberhardt, L L

    1976-01-01

    In this report, systems ecology is largely equated to mathematical or computer simulation modelling. The need for models in ecology stems from the necessity to have an integrative device for the diversity of ecological data, much of which is observational, rather than experimental, as well as from the present lack of a theoretical structure for ecology. Different objectives in applied studies require specialized methods. The best predictive devices may be regression equations, often non-linear in form, extracted from much more detailed models. A variety of statistical aspects of modelling, including sampling, are discussed. Several aspects of population dynamics and food-chain kinetics are described, and it is suggested that the two presently separated approaches should be combined into a single theoretical framework. It is concluded that future efforts in systems ecology should emphasize actual data and statistical methods, as well as modelling.

  20. Multivariate methods and forecasting with IBM SPSS statistics

    CERN Document Server

    Aljandali, Abdulkader

    2017-01-01

    This is the second of a two-part guide to quantitative analysis using the IBM SPSS Statistics software package; this volume focuses on multivariate statistical methods and advanced forecasting techniques. More often than not, regression models involve more than one independent variable. For example, forecasting methods are commonly applied to aggregates such as inflation rates, unemployment, exchange rates, etc., that have complex relationships with determining variables. This book introduces multivariate regression models and provides examples to help understand theory underpinning the model. The book presents the fundamentals of multivariate regression and then moves on to examine several related techniques that have application in business-orientated fields such as logistic and multinomial regression. Forecasting tools such as the Box-Jenkins approach to time series modeling are introduced, as well as exponential smoothing and naïve techniques. This part also covers hot topics such as Factor Analysis, Dis...

  1. Mathematical and Statistical Methods for Actuarial Sciences and Finance

    CERN Document Server

    Legros, Florence; Perna, Cira; Sibillo, Marilena

    2017-01-01

    This volume gathers selected peer-reviewed papers presented at the international conference “MAF 2016 – Mathematical and Statistical Methods for Actuarial Sciences and Finance”, held in Paris (France) at the Université Paris-Dauphine from March 30 to April 1, 2016. The contributions highlight new ideas on mathematical and statistical methods in actuarial sciences and finance. The cooperation between mathematicians and statisticians working in insurance and finance is a very fruitful field, one that yields unique theoretical models and practical applications, as well as new insights in the discussion of problems of national and international interest. This volume is addressed to academicians, researchers, Ph.D. students and professionals.

  2. Exact nonparametric inference for detection of nonlinear determinism

    OpenAIRE

    Luo, Xiaodong; Zhang, Jie; Small, Michael; Moroz, Irene

    2005-01-01

    We propose an exact nonparametric inference scheme for the detection of nonlinear determinism. The essential fact utilized in our scheme is that, for a linear stochastic process with jointly symmetric innovations, its ordinary least squares (OLS) linear prediction error is symmetric about zero. Based on this viewpoint, a class of linear signed rank statistics, e.g. the Wilcoxon signed rank statistic, can be derived with the known null distributions from the prediction error. Thus one of the ad...
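
    The Wilcoxon signed rank statistic named in this abstract can be computed from a list of prediction errors as sketched below. The residual values are hypothetical, and the conventions for zeros (dropped) and ties (average ranks) are common choices assumed here, not details confirmed by the paper. Under symmetry about zero, the expected value of the statistic is n(n+1)/4.

```python
def wilcoxon_signed_rank(values):
    """Return (W+, n): the rank sum of positive values and the count of nonzero values."""
    nonzero = [v for v in values if v != 0]
    order = sorted(range(len(nonzero)), key=lambda i: abs(nonzero[i]))
    ranks = [0.0] * len(nonzero)
    i = 0
    while i < len(order):
        # Extend j over the block of tied absolute values starting at position i.
        j = i
        while j + 1 < len(order) and abs(nonzero[order[j + 1]]) == abs(nonzero[order[i]]):
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, v in zip(ranks, nonzero) if v > 0)
    return w_plus, len(nonzero)


# Hypothetical OLS prediction errors from a fitted linear model.
residuals = [1.0, -2.0, 3.0, -4.0, 5.0]
w_plus, n = wilcoxon_signed_rank(residuals)
expected_under_symmetry = n * (n + 1) / 4.0
print(w_plus, expected_under_symmetry)
```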

  3. Statistical methods for longitudinal data with agricultural applications

    DEFF Research Database (Denmark)

    Anantharama Ankinakatte, Smitha

    The PhD study focuses on modeling two kinds of longitudinal data arising in agricultural applications: continuous time series data and discrete longitudinal data. Firstly, two statistical methods, neural networks and generalized additive models, are applied to predict mastitis using multivariate...... algorithm. This was found to compare favourably with the algorithm implemented in the well-known Beagle software. Finally, an R package to apply APFA models developed as part of the PhD project is described...

  4. Statistical methods in nuclear material accountancy: Past, present and future

    International Nuclear Information System (INIS)

    Pike, D.J.; Woods, A.J.

    1983-01-01

    The analysis of nuclear material inventory data is motivated by the desire to detect any loss or diversion of nuclear material, insofar as such detection may be feasible by statistical analysis of repeated inventory and throughput measurements. The early regulations, which laid down the specifications for the analysis of inventory data, were framed without acknowledging the essentially sequential nature of the data. It is the broad aim of this paper to discuss the historical nature of statistical analysis of inventory data including an evaluation of why statistical methods should be required at all. If it is accepted that statistical techniques are required, then two main areas require extensive discussion. First, it is important to assess the extent to which stated safeguards aims can be met in practice. Second, there is a vital need for reassessment of the statistical techniques which have been proposed for use in nuclear material accountancy. Part of this reassessment must involve a reconciliation of the apparent differences in philosophy shown by statisticians; but, in addition, the techniques themselves need comparative study to see to what extent they are capable of meeting realistic safeguards aims. This paper contains a brief review of techniques with an attempt to compare and contrast the approaches. It will be suggested that much current research is following closely similar lines, and that national and international bodies should encourage collaborative research and practical in-plant implementations. The techniques proposed require credibility and power; but at this point in time statisticians require credibility and a greater level of unanimity in their approach. A way ahead is proposed based on a clear specification of realistic safeguards aims, and a development of a unified statistical approach with encouragement for the performance of joint research. (author)

  5. State analysis of BOP using statistical and heuristic methods

    International Nuclear Information System (INIS)

    Heo, Gyun Young; Chang, Soon Heung

    2003-01-01

    Under the deregulation environment, performance enhancement of the BOP in nuclear power plants is being highlighted. To analyze the performance level of the BOP, we use performance test procedures provided by an authorized institution such as ASME. However, plant investigation showed that the requirements of the performance test procedures on the reliability and quantity of sensors were difficult to satisfy. As a solution, a state analysis method, which is an expanded concept of signal validation, was proposed on the basis of statistical and heuristic approaches. The authors recommended a statistical linear regression model, obtained by analyzing correlations among BOP parameters, as a reference state analysis method. Its advantages are that its derivation is not heuristic, that model uncertainty can be calculated, and that it is easy to apply to an actual plant. The error of the statistical linear regression model is below 3% under normal as well as abnormal system states. Additionally, a neural network model was recommended, since the statistical model cannot be applied to the validation of all of the sensors and is sensitive to outliers, that is, signals located outside the statistical distribution. Because many sensors need to be validated in the BOP, wavelet analysis (WA) was applied as a pre-processor to reduce the input dimension and to enhance training accuracy. The outlier localization capability of WA enhanced the robustness of the neural network. The trained neural network restored the degraded signals to values within ±3% of the true signals

  6. Nonparametric predictive pairwise comparison with competing risks

    International Nuclear Information System (INIS)

    Coolen-Maturi, Tahani

    2014-01-01

    In reliability, failure data often correspond to competing risks, where several failure modes can cause a unit to fail. This paper presents nonparametric predictive inference (NPI) for pairwise comparison with competing risks data, assuming that the failure modes are independent. These failure modes could be the same or different among the two groups, and these can be both observed and unobserved failure modes. NPI is a statistical approach based on few assumptions, with inferences strongly based on data and with uncertainty quantified via lower and upper probabilities. The focus is on the lower and upper probabilities for the event that the lifetime of a future unit from one group, say Y, is greater than the lifetime of a future unit from the second group, say X. The paper also shows how the two groups can be compared based on particular failure mode(s), and the comparison of the two groups when some of the competing risks are combined is discussed
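
    The lower and upper probabilities described here can be sketched for the simplest setting only: two fully observed samples with no ties, no censoring and no competing risks. This is the basic NPI pairwise comparison rather than the competing-risks method of the paper, and the function name `npi_pairwise_bounds` is hypothetical.

```python
def npi_pairwise_bounds(x, y):
    """Lower and upper probability that a future Y exceeds a future X.

    Assumes complete data without ties or censoring. Each future observation
    is taken to fall in any interval between consecutive ordered observations
    of its own group with equal probability.
    """
    nx, ny = len(x), len(y)
    # Pairs of observed values for which y_i certainly exceeds x_j.
    count = sum(1 for xj in x for yi in y if xj < yi)
    denominator = (nx + 1) * (ny + 1)
    lower = count / denominator
    upper = (count + nx + ny + 1) / denominator
    return lower, upper


# Group Y lifetimes all exceed group X lifetimes in this toy data set.
lower, upper = npi_pairwise_bounds([1.0, 2.0, 3.0], [4.0, 5.0])
```

    The gap between the two bounds reflects the limited information in small samples and narrows as both sample sizes grow.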

  7. Statistical benchmarking in utility regulation: Role, standards and methods

    International Nuclear Information System (INIS)

    Newton Lowry, Mark; Getachew, Lullit

    2009-01-01

    Statistical benchmarking is being used with increasing frequency around the world in utility rate regulation. We discuss how and where benchmarking is in use for this purpose and the pros and cons of regulatory benchmarking. We then discuss alternative performance standards and benchmarking methods in regulatory applications. We use these to propose guidelines for the appropriate use of benchmarking in the rate setting process. The standards, which we term the competitive market and frontier paradigms, have a bearing on method selection. These along with regulatory experience suggest that benchmarking can either be used for prudence review in regulation or to establish rates or rate setting mechanisms directly

  8. Statistical methods of spin assignment in compound nuclear reactions

    International Nuclear Information System (INIS)

    Mach, H.; Johns, M.W.

    1984-01-01

    Spin assignment to nuclear levels can be obtained from standard in-beam gamma-ray spectroscopy techniques and in the case of compound nuclear reactions can be complemented by statistical methods. These are based on a correlation pattern between level spin and gamma-ray intensities feeding low-lying levels. Three types of intensity and level spin correlations are found suitable for spin assignment: shapes of the excitation functions, ratio of intensity at two beam energies or populated in two different reactions, and feeding distributions. Various empirical attempts are examined and the range of applicability of these methods as well as the limitations associated with them are given. 12 references

  9. Statistical methods of spin assignment in compound nuclear reactions

    International Nuclear Information System (INIS)

    Mach, H.; Johns, M.W.

    1985-01-01

    Spin assignment to nuclear levels can be obtained from standard in-beam gamma-ray spectroscopy techniques and in the case of compound nuclear reactions can be complemented by statistical methods. These are based on a correlation pattern between level spin and gamma-ray intensities feeding low-lying levels. Three types of intensity and level spin correlations are found suitable for spin assignment: shapes of the excitation functions, ratio of intensity at two beam energies or populated in two different reactions, and feeding distributions. Various empirical attempts are examined and the range of applicability of these methods as well as the limitations associated with them are given

  10. On second quantization methods applied to classical statistical mechanics

    International Nuclear Information System (INIS)

    Matos Neto, A.; Vianna, J.D.M.

    1984-01-01

    A method of expressing statistical classical results in terms of mathematical entities usually associated with the quantum field theoretical treatment of many-particle systems (Fock space, commutators, field operators, state vector) is discussed. A linear response theory is developed using the 'second quantized' Liouville equation introduced by Schonberg. The relationship of this method to that of Prigogine et al. is briefly analyzed. The chain of equations and the spectral representations for the new classical Green's functions are presented. Generalized operators defined on Fock space are discussed. It is shown that the correlation functions can be obtained from Green's functions defined with generalized operators. (Author)

  11. Mathematical statistics

    CERN Document Server

    Pestman, Wiebe R

    2009-01-01

    This textbook provides a broad and solid introduction to mathematical statistics, including the classical subjects hypothesis testing, normal regression analysis, and normal analysis of variance. In addition, non-parametric statistics and vectorial statistics are considered, as well as applications of stochastic analysis in modern statistics, e.g., Kolmogorov-Smirnov testing, smoothing techniques, robustness and density estimation. For students with some elementary mathematical background. With many exercises. Prerequisites from measure theory and linear algebra are presented.

  12. Methods for estimating low-flow statistics for Massachusetts streams

    Science.gov (United States)

    Ries, Kernell G.; Friesz, Paul J.

    2000-01-01

    Methods and computer software are described in this report for determining flow duration, low-flow frequency statistics, and August median flows. These low-flow statistics can be estimated for unregulated streams in Massachusetts using different methods depending on whether the location of interest is at a streamgaging station, a low-flow partial-record station, or an ungaged site where no data are available. Low-flow statistics for streamgaging stations can be estimated using standard U.S. Geological Survey methods described in the report. The MOVE.1 mathematical method and a graphical correlation method can be used to estimate low-flow statistics for low-flow partial-record stations. The MOVE.1 method is recommended when the relation between measured flows at a partial-record station and daily mean flows at a nearby, hydrologically similar streamgaging station is linear, and the graphical method is recommended when the relation is curved. Equations are presented for computing the variance and equivalent years of record for estimates of low-flow statistics for low-flow partial-record stations when either a single or multiple index stations are used to determine the estimates. The drainage-area ratio method or regression equations can be used to estimate low-flow statistics for ungaged sites where no data are available. The drainage-area ratio method is generally as accurate as or more accurate than regression estimates when the drainage-area ratio for an ungaged site is between 0.3 and 1.5 times the drainage area of the index data-collection site. Regression equations were developed to estimate the natural, long-term 99-, 98-, 95-, 90-, 85-, 80-, 75-, 70-, 60-, and 50-percent duration flows; the 7-day, 2-year and the 7-day, 10-year low flows; and the August median flow for ungaged sites in Massachusetts. Streamflow statistics and basin characteristics for 87 to 133 streamgaging stations and low-flow partial-record stations were used to develop the equations. 
The
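
    The drainage-area ratio method mentioned in this abstract can be sketched as below. The sketch assumes simple proportional scaling (exponent of one) between the index site and the ungaged site, and it only flags whether the ratio lies in the 0.3 to 1.5 range quoted in the abstract. The function name and parameters are hypothetical, not taken from the report.

```python
def drainage_area_ratio_estimate(index_flow, index_area, ungaged_area,
                                 ratio_bounds=(0.3, 1.5)):
    """Scale a flow statistic from an index site to an ungaged site.

    Returns (estimated_flow, within_range). Proportional scaling by area
    is an assumption of this sketch; within_range reports whether the
    area ratio lies inside ratio_bounds.
    """
    ratio = ungaged_area / index_area
    within_range = ratio_bounds[0] <= ratio <= ratio_bounds[1]
    return index_flow * ratio, within_range


# Hypothetical numbers: the index site drains 50 square miles and has a
# low-flow statistic of 10 cubic feet per second; the ungaged site drains 25.
flow, ok = drainage_area_ratio_estimate(10.0, 50.0, 25.0)
```

    When `within_range` is false, the abstract suggests the regression equations may be the better choice.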

  13. Nonparametric Mixture of Regression Models.

    Science.gov (United States)

    Huang, Mian; Li, Runze; Wang, Shaoli

    2013-07-01

    Motivated by an analysis of US house price index data, we propose nonparametric finite mixture of regression models. We study the identifiability issue of the proposed models, and develop an estimation procedure by employing kernel regression. We further systematically study the sampling properties of the proposed estimators, and establish their asymptotic normality. A modified EM algorithm is proposed to carry out the estimation procedure. We show that our algorithm preserves the ascent property of the EM algorithm in an asymptotic sense. Monte Carlo simulations are conducted to examine the finite sample performance of the proposed estimation procedure. An empirical analysis of the US house price index data is illustrated for the proposed methodology.

  14. Literature in Focus: Statistical Methods in Experimental Physics

    CERN Multimedia

    2007-01-01

    Frederick James was a high-energy physicist who became the CERN "expert" on statistics and is now well-known around the world, in part for this famous text. The first edition of Statistical Methods in Experimental Physics was originally co-written with four other authors and was published in 1971 by North Holland (now an imprint of Elsevier). It became such an important text that demand for it has continued for more than 30 years. Fred has updated it and it was released in a second edition by World Scientific in 2006. It is still a top seller and there is no exaggeration in calling it «the» reference on the subject. A full review of the title appeared in the October CERN Courier. Come and meet the author to hear more about how this book has flourished during its 35-year lifetime. Frederick James, Statistical Methods in Experimental Physics: Monday, 26th of November, 4 p.m., Council Chamber (Bldg. 503-1-001). The author will be introduced...

  15. Fuel rod design by statistical methods for MOX fuel

    International Nuclear Information System (INIS)

    Heins, L.; Landskron, H.

    2000-01-01

    Statistical methods in fuel rod design have received more and more attention in recent years. One of several possible ways to use statistical methods in fuel rod design can be described as follows: Monte Carlo calculations are performed using the fuel rod code CARO. For each run with CARO, the set of input data is modified: parameters describing the design of the fuel rod (geometrical data, density etc.) and modeling parameters are randomly selected according to their individual distributions. Power histories are varied systematically in a way that each power history of the relevant core management calculation is represented in the Monte Carlo calculations with equal frequency. The frequency distributions of the results, such as rod internal pressure and cladding strain, which are generated by the Monte Carlo calculation are evaluated and compared with the design criteria. Up to now, this methodology has been applied to licensing calculations for PWRs and BWRs, UO2 and MOX fuel, in 3 countries. Especially for the insertion of MOX fuel resulting in power histories with relatively high linear heat generation rates at higher burnup, the statistical methodology is an appropriate approach to demonstrate compliance with licensing requirements. (author)

  16. Heterogeneous Rock Simulation Using DIP-Micromechanics-Statistical Methods

    Directory of Open Access Journals (Sweden)

    H. Molladavoodi

    2018-01-01

    Rock as a natural material is heterogeneous. Rock material consists of minerals, crystals, cement, grains, and microcracks. Each component of rock has a different mechanical behavior under applied loading conditions. Therefore, rock component distribution has an important effect on rock mechanical behavior, especially in the postpeak region. In this paper, the rock sample was studied by digital image processing (DIP), micromechanics, and statistical methods. Using image processing, volume fractions of the rock minerals composing the rock sample were evaluated precisely. The mechanical properties of the rock matrix were determined based on upscaling micromechanics. In order to consider the effect of rock heterogeneities on mechanical behavior, the heterogeneity index was calculated within a statistical framework. A Weibull distribution function was fitted to the Young modulus distribution of minerals. Finally, statistical and Mohr–Coulomb strain-softening models were used simultaneously as a constitutive model in DEM code. The acoustic emission, strain energy release, and the effect of rock heterogeneities on the postpeak behavior process were investigated. The numerical results are in good agreement with experimental data.

  17. THE FLUORBOARD A STATISTICALLY BASED DASHBOARD METHOD FOR IMPROVING SAFETY

    International Nuclear Information System (INIS)

    PREVETTE, S.S.

    2005-01-01

    The FluorBoard is a statistically based dashboard method for improving safety. Fluor Hanford has achieved significant safety improvements, including more than an 80% reduction in OSHA cases per 200,000 hours, during its work at the US Department of Energy's Hanford Site in Washington state. The massive project on the former nuclear materials production site is considered one of the largest environmental cleanup projects in the world. Fluor Hanford's safety improvements were achieved by a committed partnering of workers, managers, and statistical methodology. Safety achievements at the site have been due to a systematic approach to safety. This includes excellent cooperation between the field workers, the safety professionals, and management through OSHA Voluntary Protection Program principles. Fluor corporate values are centered around safety, and safety excellence is important for every manager in every project. In addition, Fluor Hanford has utilized a rigorous approach to using its safety statistics, based upon Dr. Shewhart's control charts, and Dr. Deming's management and quality methods

  18. Statistical physics and computational methods for evolutionary game theory

    CERN Document Server

    Javarone, Marco Alberto

    2018-01-01

    This book presents an introduction to Evolutionary Game Theory (EGT), which is an emerging field in the area of complex systems attracting the attention of researchers from disparate scientific communities. EGT allows one to represent and study several complex phenomena, such as the emergence of cooperation in social systems, the role of conformity in shaping the equilibrium of a population, and the dynamics in biological and ecological systems. Since EGT models belong to the area of complex systems, statistical physics constitutes a fundamental ingredient for investigating their behavior. At the same time, the complexity of some EGT models, such as those realized by means of agent-based methods, often requires the implementation of numerical simulations. Therefore, beyond providing an introduction to EGT, this book gives a brief overview of the main statistical physics tools (such as phase transitions and the Ising model) and computational strategies for simulating evolutionary games (such as Monte Carlo algor...

  19. Huffman and linear scanning methods with statistical language models.

    Science.gov (United States)

    Roark, Brian; Fried-Oken, Melanie; Gibbons, Chris

    2015-03-01

    Current scanning access methods for text generation in AAC devices are limited to relatively few options, most notably row/column variations within a matrix. We present Huffman scanning, a new method for applying statistical language models to binary-switch, static-grid typing AAC interfaces, and compare it to other scanning options under a variety of conditions. We present results for 16 adults without disabilities and one 36-year-old man with locked-in syndrome who presents with complex communication needs and uses AAC scanning devices for writing. Huffman scanning with a statistical language model yielded significant typing speedups for the 16 participants without disabilities versus any of the other methods tested, including two row/column scanning methods. A similar pattern of results was found with the individual with locked-in syndrome. Interestingly, faster typing speeds were obtained with Huffman scanning using a more leisurely scan rate than relatively fast individually calibrated scan rates. Overall, the results reported here demonstrate great promise for the usability of Huffman scanning as a faster alternative to row/column scanning.
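
    The coding step behind Huffman scanning can be sketched as follows. The sketch builds a binary prefix code from symbol probabilities, which in a real system would come from the statistical language model. Reading each code bit as one binary-switch decision is an interpretation of the abstract, and the probabilities and the function name `huffman_codes` are hypothetical.

```python
import heapq
from itertools import count


def huffman_codes(probabilities):
    """Build a binary prefix code from a {symbol: probability} mapping."""
    if len(probabilities) == 1:
        return {next(iter(probabilities)): "0"}
    tiebreak = count()  # keeps heap entries comparable when probabilities tie
    heap = [(p, next(tiebreak), {symbol: ""}) for symbol, p in probabilities.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two least probable groups, prefixing their codes with 0 and 1.
        p0, _, codes0 = heapq.heappop(heap)
        p1, _, codes1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in codes0.items()}
        merged.update({s: "1" + c for s, c in codes1.items()})
        heapq.heappush(heap, (p0 + p1, next(tiebreak), merged))
    return heap[0][2]


# Hypothetical next-letter probabilities from a language model.
codes = huffman_codes({"e": 0.5, "t": 0.25, "a": 0.125, "o": 0.125})
```

    More probable symbols receive shorter codes, so under this reading they need fewer switch decisions to select, which is consistent with the typing speedups reported in the abstract.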

  20. Statistical Method to Overcome Overfitting Issue in Rational Function Models

    Science.gov (United States)

    Alizadeh Moghaddam, S. H.; Mokhtarzade, M.; Alizadeh Naeini, A.; Alizadeh Moghaddam, S. A.

    2017-09-01

    Rational function models (RFMs) are known as one of the most appealing models which are extensively applied in geometric correction of satellite images and map production. Overfitting is a common issue, in the case of terrain dependent RFMs, that degrades the accuracy of RFMs-derived geospatial products. This issue, resulting from the high number of RFMs' parameters, leads to ill-posedness of the RFMs. To tackle this problem, in this study, a fast and robust statistical approach is proposed and compared to the Tikhonov regularization (TR) method, a frequently used solution to RFMs' overfitting. In the proposed method, a statistical test, namely a significance test, is applied to search for the RFMs' parameters that are resistant to overfitting. The performance of the proposed method was evaluated for two real data sets of Cartosat-1 satellite images. The obtained results demonstrate the efficiency of the proposed method in terms of the achievable level of accuracy. This technique, indeed, shows an improvement of 50-80% over the TR.

  1. Radiological decontamination, survey, and statistical release method for vehicles

    International Nuclear Information System (INIS)

    Goodwill, M.E.; Lively, J.W.; Morris, R.L.

    1996-06-01

    Earth-moving vehicles (e.g., dump trucks, belly dumps) commonly haul radiologically contaminated materials from a site being remediated to a disposal site. Traditionally, each vehicle must be surveyed before being released. The logistical difficulties of implementing the traditional approach on a large scale demand that an alternative be devised. A statistical method for assessing product quality from a continuous process was adapted to the vehicle decontamination process. This method produced a sampling scheme that automatically compensates and accommodates fluctuating batch sizes and changing conditions without the need to modify or rectify the sampling scheme in the field. Vehicles are randomly selected (sampled) upon completion of the decontamination process to be surveyed for residual radioactive surface contamination. The frequency of sampling is based on the expected number of vehicles passing through the decontamination process in a given period and the confidence level desired. This process has been successfully used for 1 year at the former uranium millsite in Monticello, Utah (a cleanup site regulated under the Comprehensive Environmental Response, Compensation, and Liability Act). The method forces improvement in the quality of the decontamination process and results in a lower likelihood that vehicles exceeding the surface contamination standards are offered for survey. Implementation of this statistical sampling method on Monticello projects has resulted in more efficient processing of vehicles through decontamination and radiological release, saved hundreds of hours of processing time, provided a high level of confidence that release limits are met, and improved the radiological cleanliness of vehicles leaving the controlled site

  2. Statistics

    CERN Document Server

    Hayslett, H T

    1991-01-01

    Statistics covers the basic principles of Statistics. The book starts by tackling the importance and the two kinds of statistics; the presentation of sample data; the definition, illustration and explanation of several measures of location; and the measures of variation. The text then discusses elementary probability, the normal distribution and the normal approximation to the binomial. Testing of statistical hypotheses and tests of hypotheses about the theoretical proportion of successes in a binomial population and about the theoretical mean of a normal population are explained. The text the

  3. Mathematical and statistical methods for actuarial sciences and finance

    CERN Document Server

    Sibillo, Marilena

    2014-01-01

    The interaction between mathematicians and statisticians working in the actuarial and financial fields is producing numerous meaningful scientific results. This volume, comprising a series of four-page papers, gathers new ideas relating to mathematical and statistical methods in the actuarial sciences and finance. The book covers a variety of topics of interest from both theoretical and applied perspectives, including: actuarial models; alternative testing approaches; behavioral finance; clustering techniques; coherent and non-coherent risk measures; credit-scoring approaches; data envelopment analysis; dynamic stochastic programming; financial contagion models; financial ratios; intelligent financial trading systems; mixture normality approaches; Monte Carlo-based methodologies; multicriteria methods; nonlinear parameter estimation techniques; nonlinear threshold models; particle swarm optimization; performance measures; portfolio optimization; pricing methods for structured and non-structured derivatives; r...

  4. Evolutionary Computation Methods and their applications in Statistics

    Directory of Open Access Journals (Sweden)

    Francesco Battaglia

    2013-05-01

    A brief discussion of the genesis of evolutionary computation methods, their relationship to artificial intelligence, and the contribution of genetics and Darwin’s theory of natural evolution is provided. Then, the main evolutionary computation methods are illustrated: evolution strategies, genetic algorithms, estimation of distribution algorithms, differential evolution, and a brief description of some evolutionary behavior methods such as ant colony and particle swarm optimization. We also discuss the role of the genetic algorithm for multivariate probability distribution random generation, rather than as a function optimizer. Finally, some relevant applications of genetic algorithms to statistical problems are reviewed: selection of variables in regression, time series model building, outlier identification, cluster analysis, design of experiments.

  5. A nonparametric dynamic additive regression model for longitudinal data

    DEFF Research Database (Denmark)

    Martinussen, T.; Scheike, T. H.

    2000-01-01

    dynamic linear models, estimating equations, least squares, longitudinal data, nonparametric methods, partly conditional mean models, time-varying-coefficient models...

  6. Hybrid perturbation methods based on statistical time series models

    Science.gov (United States)

    San-Juan, Juan Félix; San-Martín, Montserrat; Pérez, Iván; López, Rosario

    2016-04-01

    In this work we present a new methodology for orbit propagation, the hybrid perturbation theory, based on the combination of an integration method and a prediction technique. The former, which can be a numerical, analytical or semianalytical theory, generates an initial approximation that contains some inaccuracies derived from the fact that, in order to simplify the expressions and subsequent computations, not all the involved forces are taken into account and only low-order terms are considered, not to mention the fact that mathematical models of perturbations do not always reproduce physical phenomena with absolute precision. The prediction technique, which can be based on either statistical time series models or computational intelligence methods, is aimed at modelling and reproducing missing dynamics in the previously integrated approximation. This combination results in the precision improvement of conventional numerical, analytical and semianalytical theories for determining the position and velocity of any artificial satellite or space debris object. In order to validate this methodology, we present a family of three hybrid orbit propagators formed by the combination of three different orders of approximation of an analytical theory and a statistical time series model, and analyse their capability to process the effect produced by the flattening of the Earth. The three considered analytical components are the integration of the Kepler problem and first-order and second-order analytical theories, whereas the prediction technique is the same in the three cases, namely an additive Holt-Winters method.
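
    The additive Holt-Winters method named at the end of this abstract can be sketched as below. This is a generic textbook-style version with a simple initialisation from the first two seasons, not the authors' implementation. The smoothing constants, the seasonal period and the function name are hypothetical inputs.

```python
def holt_winters_additive(series, period, alpha, beta, gamma, horizon):
    """Forecast `horizon` steps ahead with additive Holt-Winters smoothing.

    Requires at least two full seasons of data (len(series) >= 2 * period).
    """
    # Initialise level, trend and seasonal terms from the first two seasons.
    first = series[:period]
    second = series[period:2 * period]
    level = sum(first) / period
    trend = (sum(second) - sum(first)) / period ** 2
    seasonal = [value - level for value in first]
    for t in range(period, len(series)):
        s = seasonal[t % period]
        previous_level = level
        level = alpha * (series[t] - s) + (1 - alpha) * (level + trend)
        trend = beta * (level - previous_level) + (1 - beta) * trend
        seasonal[t % period] = gamma * (series[t] - level) + (1 - gamma) * s
    n = len(series)
    return [level + (h + 1) * trend + seasonal[(n + h) % period]
            for h in range(horizon)]


# Toy periodic series with no trend; the forecast repeats the pattern.
forecast = holt_winters_additive([1.0, 2.0, 3.0, 0.0] * 3, 4, 0.3, 0.1, 0.2, 4)
```

    In the hybrid setting described above, `series` would presumably be the difference between a precise reference orbit and the analytical approximation, so that the forecast supplies the missing dynamics. That mapping is an assumption drawn from the abstract's wording.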

  7. Classification of Specialized Farms Applying Multivariate Statistical Methods

    Directory of Open Access Journals (Sweden)

    Zuzana Hloušková

    2017-01-01

    The paper is aimed at the application of advanced multivariate statistical methods to classifying cattle breeding farming enterprises by their economic size. The advantage of the model is its ability to use a few selected indicators, compared to the complex methodology of the current classification model, which requires knowledge of the detailed structure of the herd turnover and the structure of cultivated crops. The output of the paper is intended to be applied within farm structure research focused on the future development of Czech agriculture. As the data source, the farming enterprises database for 2014 from the FADN CZ system has been used. The predictive model proposed exploits knowledge of the actual size classes of the farms tested. The linear discriminant analysis multifactor classification method classified Small farms (98 % correctly) and Large and Very Large enterprises (100 % correctly) well. Medium Size farms were correctly classified in only 58.11 % of cases. Partial shortcomings of the presented process were found in discriminating between Medium and Small farms.

  8. Practical statistics in pain research.

    Science.gov (United States)

    Kim, Tae Kyun

    2017-10-01

    Pain is subjective, while statistics related to pain research are objective. This review was written to help researchers involved in pain research make statistical decisions. The main issues relate to the level of scales that are often used in pain research, the choice between parametric and nonparametric statistical methods, and problems which arise from repeated measurements. In the field of pain research, parametric statistics used to be applied in an erroneous way. This is closely related to the scales of data and repeated measurements. The level of scales includes nominal, ordinal, interval, and ratio scales. The level of scales affects the choice between parametric and nonparametric methods. In the field of pain research, the most frequently used pain assessment scale is the ordinal scale, which would include the visual analogue scale (VAS). There used to be another view, however, which considered the VAS to be an interval or ratio scale, so that the usage of parametric statistics would be accepted practically in some cases. Repeated measurements of the same subjects always complicate statistics. The measurements are inevitably correlated with each other, which precludes the application of one-way ANOVA, in which independence between the measurements is necessary. Repeated measures ANOVA (RMANOVA), however, permits comparison between the correlated measurements as long as the sphericity assumption is satisfied. In conclusion, parametric statistical methods should be used only when the assumptions of parametric statistics, such as normality and sphericity, are established.

  9. Bayesian statistic methods and their application in probabilistic simulation models

    Directory of Open Access Journals (Sweden)

    Sergio Iannazzo

    2007-03-01

    Bayesian statistical methods are attracting rapidly growing interest and acceptance in health economics. The reasons for this success probably lie in the theoretical foundations of the discipline, which make these techniques more appealing for decision analysis. A further factor is progress in IT, which has produced several flexible and powerful statistical software frameworks. Among the most notable is the BUGS language project and its standalone application for MS Windows, WinBUGS. This paper introduces the subject and shows some applications of WinBUGS in developing complex economic models based on Markov chains. The advantages of this approach lie in the elegance of the code produced and in its capacity to develop probabilistic simulations easily. In addition, an example of the integration of Bayesian inference models in a Markov model is shown. This last feature lets the analyst conduct statistical analyses on the available sources of evidence and use them directly as inputs to the economic model.

  10. Application of mathematical statistics methods to study fluorite deposits

    International Nuclear Information System (INIS)

    Chermeninov, V.B.

    1980-01-01

    The applicability of mathematical-statistical methods for increasing the reliability of sampling and for geological tasks (study of regularities of ore formation) is considered. The reliability of core sampling (with regard to the selective abrasion of fluorite) is compared with that of neutron activation logging for fluorine. The core sampling data show higher dispersion than the neutron activation logging results (mean coefficients of variation are 75% and 56% respectively). However, the hypothesis that the two sampling methods give equal means is confirmed; this fact testifies to the absence of considerable variability of ore bodies.

  11. Algebraic methods in statistical mechanics and quantum field theory

    CERN Document Server

    Emch, Dr Gérard G

    2009-01-01

    This systematic algebraic approach concerns problems involving a large number of degrees of freedom. It extends the traditional formalism of quantum mechanics, and it eliminates conceptual and mathematical difficulties common to the development of statistical mechanics and quantum field theory. Further, the approach is linked to research in applied and pure mathematics, offering a reflection of the interplay between formulation of physical motivations and self-contained descriptions of the mathematical methods. The four-part treatment begins with a survey of algebraic approaches to certain phys

  12. Statistical methods for determining the effect of mammography screening

    DEFF Research Database (Denmark)

    Lophaven, Søren

    2016-01-01

    In an overview of five randomised controlled trials from Sweden, a reduction of 29% was found in breast cancer mortality in women aged 50-69 at randomisation after a follow-up of 5-13 years. Organised, population based, mammography service screening was introduced on the basis of these results in ...... in 2007-2008. Women aged 50-69 were invited to screening every second year. Taking advantage of the registers of population and health, we present statistical methods for evaluating the effect of mammography screening on breast cancer mortality (Olsen et al. 2005, Njor et al. 2015 and Weedon-Fekjær et al...

  13. Nonparametric correlation models for portfolio allocation

    DEFF Research Database (Denmark)

    Aslanidis, Nektarios; Casas, Isabel

    2013-01-01

    This article proposes time-varying nonparametric and semiparametric estimators of the conditional cross-correlation matrix in the context of portfolio allocation. Simulation results show that the nonparametric and semiparametric models are best in DGPs with substantial variability or structural ...... currencies. Results show that the nonparametric model generally dominates the others when evaluated in-sample. However, the semiparametric model is best for out-of-sample analysis....

  14. Probability Machines: Consistent Probability Estimation Using Nonparametric Learning Machines

    Science.gov (United States)

    Malley, J. D.; Kruppa, J.; Dasgupta, A.; Malley, K. G.; Ziegler, A.

    2011-01-01

    Summary. Background: Most machine learning approaches only provide a classification for binary responses. However, probabilities are required for risk estimation using individual patient characteristics. It has been shown recently that every statistical learning machine known to be consistent for a nonparametric regression problem is a probability machine that is provably consistent for this estimation problem. Objectives: The aim of this paper is to show how random forests and nearest neighbors can be used for consistent estimation of individual probabilities. Methods: Two random forest algorithms and two nearest neighbor algorithms are described in detail for estimation of individual probabilities. We discuss the consistency of random forests, nearest neighbors and other learning machines in detail. We conduct a simulation study to illustrate the validity of the methods. We exemplify the algorithms by analyzing two well-known data sets on the diagnosis of appendicitis and the diagnosis of diabetes in Pima Indians. Results: Simulations demonstrate the validity of the method. With the real data application, we show the accuracy and practicality of this approach. We provide sample code from R packages in which the probability estimation is already available. This means that all calculations can be performed using existing software. Conclusions: Random forest algorithms as well as nearest neighbor approaches are valid machine learning methods for estimating individual probabilities for binary responses. Free implementations are available in R and may be used for applications. PMID:21915433
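
    The core idea of a probability machine, treating the 0/1 response as a regression target so that the fitted value estimates an individual probability, can be illustrated with a nearest neighbor sketch. The code below is a minimal illustration in plain Python, not the R code the authors provide; the function name `knn_probability`, the synthetic data and the choice `k=200` are assumptions made for this example.

```python
import random

def knn_probability(train_x, train_y, query, k):
    """Estimate P(y = 1 | x = query) as the mean 0/1 response of the k nearest points."""
    # Rank training points by squared Euclidean distance to the query.
    order = sorted(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], query)),
    )
    nearest = order[:k]
    # Averaging binary responses is a regression estimate, i.e. a probability.
    return sum(train_y[i] for i in nearest) / k

# Synthetic data (assumption): one feature x in [0, 1] with true P(y = 1 | x) = x.
rng = random.Random(0)
xs = [(rng.random(),) for _ in range(2000)]
ys = [1 if rng.random() < x[0] else 0 for x in xs]

p_low = knn_probability(xs, ys, (0.1,), k=200)
p_high = knn_probability(xs, ys, (0.9,), k=200)
print(p_low, p_high)
```

    With this setup the estimate near x = 0.1 should be small and the estimate near x = 0.9 large, which is the behaviour a classifier returning only labels cannot express.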

  15. A method for statistically comparing spatial distribution maps

    Directory of Open Access Journals (Sweden)

    Reynolds Mary G

    2009-01-01

    Background: Ecological niche modeling is a method for estimation of species distributions based on certain ecological parameters. Thus far, empirical determination of significant differences between independently generated distribution maps for a single species (maps which are created through equivalent processes, but with different ecological input parameters) has been challenging. Results: We describe a method for comparing model outcomes, which allows a statistical evaluation of whether the strength of prediction and breadth of predicted areas is measurably different between projected distributions. To create ecological niche models for statistical comparison, we utilized GARP (Genetic Algorithm for Rule-Set Production) software to generate ecological niche models of human monkeypox in Africa. We created several models, keeping constant the case location input records for each model but varying the ecological input data. In order to assess the relative importance of each ecological parameter included in the development of the individual predicted distributions, we performed pixel-to-pixel comparisons between model outcomes and calculated the mean difference in pixel scores. We used a two-sample Student's t-test (assuming as null hypothesis that both maps were identical to each other regardless of which input parameters were used) to examine whether the mean difference in corresponding pixel scores from one map to another was greater than would be expected by chance alone. We also utilized weighted kappa statistics, frequency distributions, and percent difference to look at the disparities in pixel scores. Multiple independent statistical tests indicated precipitation as the single most important independent ecological parameter in the niche model for human monkeypox disease. Conclusion: In addition to improving our understanding of the natural factors influencing the distribution of human monkeypox disease, such pixel-to-pixel comparison
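
    The pixel-to-pixel comparison can be sketched as follows. This is a minimal illustration, assuming each map is flattened to a list of pixel scores in the same pixel order; the score values are made up, and the pooled-variance form of the two-sample Student's t statistic is used because the abstract names that test without giving further detail.

```python
import math
from statistics import mean, variance

def pooled_t_statistic(a, b):
    """Two-sample Student's t statistic with pooled (equal) variance."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled_var * (1 / na + 1 / nb))

# Hypothetical pixel scores for two maps of the same area (same pixel order).
map_a = [0, 2, 5, 9, 10, 7, 3, 1]
map_b = [1, 2, 4, 6, 8, 7, 5, 1]

# Mean difference in corresponding pixel scores.
mean_diff = mean(x - y for x, y in zip(map_a, map_b))
t_stat = pooled_t_statistic(map_a, map_b)
print(mean_diff, t_stat)
```

    A t statistic near zero is consistent with the null hypothesis that the two maps are identical; converting it to a p-value needs the t distribution with `len(map_a) + len(map_b) - 2` degrees of freedom, which is left out here to keep the sketch dependency-free.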

  16. Statistical methods in the mechanical design of fuel assemblies

    Energy Technology Data Exchange (ETDEWEB)

    Radsak, C.; Streit, D.; Muench, C.J. [AREVA NP GmbH, Erlangen (Germany)

    2013-07-01

    The mechanical design of a fuel assembly is still mainly performed in a deterministic way. This conservative approach is, however, not suitable for providing a realistic quantification of the design margins with respect to licensing criteria for more and more demanding operating conditions (power upgrades, burnup increase, ...). This quantification can be provided by statistical methods utilizing all available information (e.g. from manufacturing, experience feedback etc.) on the topic under consideration. During optimization, e.g. of the holddown system, certain objectives in the mechanical design of a fuel assembly (FA) can contradict each other, such as holddown forces sufficient to prevent fuel assembly lift-off versus reduced holddown forces to minimize axial loads on the fuel assembly structure and ensure no negative effect on control rod movement. By using a statistical method, the fuel assembly design can be optimized much better with respect to these objectives than would be possible with a deterministic approach. This leads to a more realistic assessment and a safer way of operating fuel assemblies. Statistical models are defined on the one hand by the quantile that has to be maintained concerning the design limit requirements (e.g. one FA quantile) and on the other hand by the confidence level which has to be met. Using the above example of the holddown force, a feasible quantile can be defined based on the requirement that less than one fuel assembly (quantile > 192/193 = 99.5 %) in the core violates the holddown force limit with a confidence of 95%. (orig.)
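
    The "one FA" quantile is plain arithmetic: in a core of 193 assemblies, fewer than one violating assembly means a quantile above 192/193. The abstract does not say how the quantile and confidence pair is demonstrated. As an illustration only, the sketch below assumes a first-order, one-sided nonparametric tolerance limit (the Wilks-type argument), for which the number of independent runs n must satisfy 1 - q^n >= confidence. The function name is hypothetical.

```python
def min_runs_for_one_sided_limit(quantile, confidence):
    """Smallest n with 1 - quantile**n >= confidence.

    Assumption: the extreme value of n independent random runs is used as a
    one-sided tolerance limit (first-order nonparametric argument).
    """
    n = 1
    while 1 - quantile ** n < confidence:
        n += 1
    return n

n_assemblies = 193                                # core size implied by 192/193
quantile = (n_assemblies - 1) / n_assemblies      # about 0.9948, i.e. 99.5 %
runs = min_runs_for_one_sided_limit(quantile, 0.95)
print(round(100 * quantile, 2), runs)
```

    For comparison, the same function gives 59 runs for the familiar 95 % quantile at 95 % confidence, which shows how quickly the required sample grows as the quantile approaches one.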

  17. Statistics

    Science.gov (United States)

    Links to sources of cancer-related statistics, including the Surveillance, Epidemiology and End Results (SEER) Program, SEER-Medicare datasets, cancer survivor prevalence data, and the Cancer Trends Progress Report.

  18. A novel statistical method for classifying habitat generalists and specialists

    DEFF Research Database (Denmark)

    Chazdon, Robin L; Chao, Anne; Colwell, Robert K

    2011-01-01

    in second-growth (SG) and old-growth (OG) rain forests in the Caribbean lowlands of northeastern Costa Rica. We evaluate the multinomial model in detail for the tree data set. Our results for birds were highly concordant with a previous nonstatistical classification, but our method classified a higher......: (1) generalist; (2) habitat A specialist; (3) habitat B specialist; and (4) too rare to classify with confidence. We illustrate our multinomial classification method using two contrasting data sets: (1) bird abundance in woodland and heath habitats in southeastern Australia and (2) tree abundance...... fraction (57.7%) of bird species with statistical confidence. Based on a conservative specialization threshold and adjustment for multiple comparisons, 64.4% of tree species in the full sample were too rare to classify with confidence. Among the species classified, OG specialists constituted the largest...

  19. Statistically Consistent k-mer Methods for Phylogenetic Tree Reconstruction.

    Science.gov (United States)

    Allman, Elizabeth S; Rhodes, John A; Sullivant, Seth

    2017-02-01

    Frequencies of k-mers in sequences are sometimes used as a basis for inferring phylogenetic trees without first obtaining a multiple sequence alignment. We show that a standard approach of using the squared Euclidean distance between k-mer vectors to approximate a tree metric can be statistically inconsistent. To remedy this, we derive model-based distance corrections for orthologous sequences without gaps, which lead to consistent tree inference. The identifiability of model parameters from k-mer frequencies is also studied. Finally, we report simulations showing that the corrected distance outperforms many other k-mer methods, even when sequences are generated with an insertion and deletion process. These results have implications for multiple sequence alignment as well since k-mer methods are usually the first step in constructing a guide tree for such algorithms.
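
    The "standard approach" the abstract refers to can be sketched in a few lines: build a k-mer frequency vector per sequence and take the squared Euclidean distance between vectors. This is the uncorrected distance that the authors show can be statistically inconsistent, not their model-based correction. The sequences and function names are made up for illustration.

```python
from collections import Counter
from itertools import product

def kmer_frequencies(seq, k, alphabet="ACGT"):
    """Relative frequency of every k-mer over the alphabet, in a fixed order."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    # Counter returns 0 for k-mers that never occur in the sequence.
    return [counts["".join(p)] / total for p in product(alphabet, repeat=k)]

def squared_euclidean(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

# Hypothetical gap-free sequences differing at one site.
seq_a = "ACGTACGTAC"
seq_b = "ACGTTCGTAC"
d = squared_euclidean(kmer_frequencies(seq_a, 2), kmer_frequencies(seq_b, 2))
print(d)
```

    A full alignment-free pipeline would compute this distance for every pair of sequences and pass the matrix to a distance-based tree method; the sketch assumes sequences contain only characters from the given alphabet.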

  20. Bayesian nonparametric meta-analysis using Polya tree mixture models.

    Science.gov (United States)

    Branscum, Adam J; Hanson, Timothy E

    2008-09-01

    Summary. A common goal in meta-analysis is estimation of a single effect measure using data from several studies that are each designed to address the same scientific inquiry. Because studies are typically conducted in geographically disperse locations, recent developments in the statistical analysis of meta-analytic data involve the use of random effects models that account for study-to-study variability attributable to differences in environments, demographics, genetics, and other sources that lead to heterogeneity in populations. Stemming from asymptotic theory, study-specific summary statistics are modeled according to normal distributions with means representing latent true effect measures. A parametric approach subsequently models these latent measures using a normal distribution, which is strictly a convenient modeling assumption absent of theoretical justification. To eliminate the influence of overly restrictive parametric models on inferences, we consider a broader class of random effects distributions. We develop a novel hierarchical Bayesian nonparametric Polya tree mixture (PTM) model. We present methodology for testing the PTM versus a normal random effects model. These methods provide researchers a straightforward approach for conducting a sensitivity analysis of the normality assumption for random effects. An application involving meta-analysis of epidemiologic studies designed to characterize the association between alcohol consumption and breast cancer is presented, which together with results from simulated data highlight the performance of PTMs in the presence of nonnormality of effect measures in the source population.

  1. Safety by statistics? A critical view on statistical methods applied in health physics

    International Nuclear Information System (INIS)

    Kraut, W.

    2016-01-01

    The only proper way to describe uncertainties in health physics is by statistical means. But statistics can never replace your personal evaluation of effect, nor can statistics transmute randomness into certainty like an "uncertainty laundry". The paper discusses these problems in routine practical work.

  2. A Statistic-Based Calibration Method for TIADC System

    Directory of Open Access Journals (Sweden)

    Kuojun Yang

    2015-01-01

    The time-interleaved technique is widely used to increase the sampling rate of analog-to-digital converters (ADCs). However, channel mismatches degrade the performance of a time-interleaved ADC (TIADC). Therefore, a statistic-based calibration method for TIADC is proposed in this paper. The average value of the sampling points is used to calculate the offset error, and the summation of the sampling points is used to calculate the gain error. After the offset and gain errors are obtained, they are calibrated by the offset and gain adjustment elements in the ADC. Timing skew is calibrated by an iterative method, with the product of sampling points of two adjacent subchannels used as the calibration metric. The proposed method is employed to calibrate mismatches in a four-channel 5 GS/s TIADC system. Simulation results show that the proposed method can estimate mismatches accurately over a wide frequency range. It is also shown that an accurate estimate can be obtained even if the signal-to-noise ratio (SNR) of the input signal is 20 dB. Furthermore, the results obtained from a real four-channel 5 GS/s TIADC system demonstrate the effectiveness of the proposed method: the spectral spurs due to mismatches are effectively eliminated after calibration.
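
    The offset and gain steps can be sketched under simple assumptions. The code below assumes that sample n belongs to subchannel n mod M, that each subchannel outputs gain * input + offset, that the input is zero-mean over the record, and that "summation of sampling points" means the sum of absolute values after offset removal, taken relative to subchannel 0. These are assumptions for illustration, not details given in the abstract, and timing skew is not covered.

```python
import math

def estimate_offset_gain(samples, n_channels):
    """Estimate per-subchannel offset and relative gain of an interleaved record."""
    # De-interleave: subchannel m holds samples m, m + M, m + 2M, ...
    channels = [samples[m::n_channels] for m in range(n_channels)]
    # Offset: average of the subchannel samples (input assumed zero-mean).
    offsets = [sum(ch) / len(ch) for ch in channels]
    # Gain: summed magnitude after offset removal, relative to subchannel 0.
    magnitudes = [sum(abs(v - off) for v in ch) for ch, off in zip(channels, offsets)]
    gains = [mag / magnitudes[0] for mag in magnitudes]
    return offsets, gains

# Simulated four-channel record with hypothetical mismatches and a sine input.
true_offsets = [0.00, 0.05, -0.03, 0.02]
true_gains = [1.00, 1.04, 0.97, 1.01]
samples = [
    true_gains[n % 4] * math.sin(2 * math.pi * 0.0371 * n) + true_offsets[n % 4]
    for n in range(4000)
]
offsets, gains = estimate_offset_gain(samples, 4)
print(offsets, gains)
```

    The estimates rely on every subchannel seeing the same input statistics, which holds here because the test tone is not synchronous with the interleaving; a tone at an exact multiple of the subchannel rate would defeat this simple estimator.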

  3. Nonparametric Bayesian inference in biostatistics

    CERN Document Server

    Müller, Peter

    2015-01-01

    As chapters in this book demonstrate, BNP has important uses in clinical sciences and inference for issues like unknown partitions in genomics. Nonparametric Bayesian approaches (BNP) play an ever expanding role in biostatistical inference from use in proteomics to clinical trials. Many research problems involve an abundance of data and require flexible and complex probability models beyond the traditional parametric approaches. As this book's expert contributors show, BNP approaches can be the answer. Survival Analysis, in particular survival regression, has traditionally used BNP, but BNP's potential is now very broad. This applies to important tasks like arrangement of patients into clinically meaningful subpopulations and segmenting the genome into functionally distinct regions. This book is designed to both review and introduce application areas for BNP. While existing books provide theoretical foundations, this book connects theory to practice through engaging examples and research questions. Chapters c...

  4. Are Statistics Labs Worth the Effort?--Comparison of Introductory Statistics Courses Using Different Teaching Methods

    Directory of Open Access Journals (Sweden)

    Jose H. Guardiola

    2010-01-01

    This paper compares the academic performance of students in three similar elementary statistics courses taught by the same instructor, but with the lab component differing among the three. One course is traditionally taught without a lab component; the second with a lab component using scenarios and an extensive use of technology, but without explicit coordination between lab and lecture; and the third using a lab component with an extensive use of technology that carefully coordinates the lab with the lecture. Extensive use of technology means, in this context, using Minitab software in the lab section, doing homework and quizzes using MyMathlab ©, and emphasizing interpretation of computer output during lectures. Initially, an online instrument based on Gardner’s multiple intelligences theory is given to students to try to identify students’ learning styles and intelligence types as covariates. An analysis of covariance is performed in order to compare differences in achievement. In this study there is no attempt to measure difference in student performance across the different treatments. The purpose of this study is to find indications of associations among variables that support the claim that statistics labs could be associated with superior academic achievement in one of these three instructional environments. Also, this study tries to identify individual student characteristics that could be associated with superior academic performance. This study did not find evidence of any individual student characteristics that could be associated with superior achievement. The response variable was computed as the percentage of correct answers for the three exams during the semester added together. The results of this study indicate a significant difference across these three different instructional methods, showing significantly higher mean scores for the response variable on students taking the lab component that was carefully coordinated with

  5. Nonparametric e-Mixture Estimation.

    Science.gov (United States)

    Takano, Ken; Hino, Hideitsu; Akaho, Shotaro; Murata, Noboru

    2016-12-01

    This study considers the common situation in data analysis when there are few observations of the distribution of interest or the target distribution, while abundant observations are available from auxiliary distributions. In this situation, it is natural to compensate for the lack of data from the target distribution by using data sets from these auxiliary distributions; in other words, approximating the target distribution in a subspace spanned by a set of auxiliary distributions. Mixture modeling is one of the simplest ways to integrate information from the target and auxiliary distributions in order to express the target distribution as accurately as possible. There are two typical mixtures in the context of information geometry: the m- and e-mixtures. The m-mixture is applied in a variety of research fields because of the presence of the well-known expectation-maximization algorithm for parameter estimation, whereas the e-mixture is rarely used because of its difficulty of estimation, particularly for nonparametric models. The e-mixture, however, is a well-tempered distribution that satisfies the principle of maximum entropy. To model a target distribution with scarce observations accurately, this letter proposes a novel framework for a nonparametric modeling of the e-mixture and a geometrically inspired estimation algorithm. As a numerical example of the proposed framework, a transfer learning setup is considered. The experimental results show that this framework works well for three types of synthetic data sets, as well as an EEG real-world data set.
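
    In information geometry the two mixtures are usually called the m-mixture (a weighted arithmetic mean of the component densities) and the e-mixture (a normalized weighted geometric mean, i.e. a mixture of log densities). The sketch below shows the difference on small discrete distributions; it only illustrates the two definitions and is not the nonparametric estimator proposed in the paper. The distributions and weights are made up, and all probabilities are assumed strictly positive so that the logarithm is defined.

```python
import math

def m_mixture(dists, weights):
    """Weighted arithmetic mean of the component probabilities."""
    return [sum(w * p[i] for w, p in zip(weights, dists)) for i in range(len(dists[0]))]

def e_mixture(dists, weights):
    """Weighted geometric mean of the components, renormalized to sum to one."""
    unnormalized = [
        math.exp(sum(w * math.log(p[i]) for w, p in zip(weights, dists)))
        for i in range(len(dists[0]))
    ]
    z = sum(unnormalized)  # normalizing constant
    return [u / z for u in unnormalized]

# Hypothetical auxiliary distributions on three outcomes.
p1 = [0.7, 0.2, 0.1]
p2 = [0.1, 0.2, 0.7]
weights = [0.5, 0.5]

pm = m_mixture([p1, p2], weights)
pe = e_mixture([p1, p2], weights)
print(pm, pe)
```

    In this example the m-mixture keeps mass on both extremes where either component is large, while the e-mixture shifts mass toward the middle outcome, on which the two components agree.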

  6. Statistics

    International Nuclear Information System (INIS)

    2005-01-01

    For the years 2004 and 2005 the figures shown in the tables of Energy Review are partly preliminary. The annual statistics published in Energy Review are presented in more detail in a publication called Energy Statistics that comes out yearly. Energy Statistics also includes historical time-series over a longer period of time (see e.g. Energy Statistics, Statistics Finland, Helsinki 2004.) The applied energy units and conversion coefficients are shown in the back cover of the Review. Explanatory notes to the statistical tables can be found after tables and figures. The figures present: Changes in GDP, energy consumption and electricity consumption, Carbon dioxide emissions from fossil fuel use, Coal consumption, Consumption of natural gas, Peat consumption, Domestic oil deliveries, Import prices of oil, Consumer prices of principal oil products, Fuel prices in heat production, Fuel prices in electricity production, Price of electricity by type of consumer, Average monthly spot prices at the Nord Pool power exchange, Total energy consumption by source and CO2 emissions, Supplies and total consumption of electricity GWh, Energy imports by country of origin in January-June 2003, Energy exports by recipient country in January-June 2003, Consumer prices of liquid fuels, Consumer prices of hard coal, natural gas and indigenous fuels, Price of natural gas by type of consumer, Price of electricity by type of consumer, Price of district heating by type of consumer, Excise taxes, value added taxes and fiscal charges and fees included in consumer prices of some energy sources and Energy taxes, precautionary stock fees and oil pollution fees

  7. Statistical methods for mechanistic model validation: Salt Repository Project

    International Nuclear Information System (INIS)

    Eggett, D.L.

    1988-07-01

    As part of the Department of Energy's Salt Repository Program, Pacific Northwest Laboratory (PNL) is studying the emplacement of nuclear waste containers in a salt repository. One objective of the SRP program is to develop an overall waste package component model which adequately describes such phenomena as container corrosion, waste form leaching, spent fuel degradation, etc., which are possible in the salt repository environment. The form of this model will be proposed, based on scientific principles and relevant salt repository conditions with supporting data. The model will be used to predict the future characteristics of the near field environment. This involves several different submodels such as the amount of time it takes a brine solution to contact a canister in the repository, how long it takes a canister to corrode and expose its contents to the brine, the leach rate of the contents of the canister, etc. These submodels are often tested in a laboratory and should be statistically validated (in this context, validate means to demonstrate that the model adequately describes the data) before they can be incorporated into the waste package component model. This report describes statistical methods for validating these models. 13 refs., 1 fig., 3 tabs

  8. Statistical methods of evaluating and comparing imaging techniques

    International Nuclear Information System (INIS)

    Freedman, L.S.

    1987-01-01

    Over the past 20 years several new methods of generating images of internal organs and the anatomy of the body have been developed and used to enhance the accuracy of diagnosis and treatment. These include ultrasonic scanning, radioisotope scanning, computerised X-ray tomography (CT) and magnetic resonance imaging (MRI). The new techniques have made a considerable impact on radiological practice in hospital departments, not least on the investigational process for patients suspected or known to have malignant disease. As a consequence of the increased range of imaging techniques now available, there has developed a need to evaluate and compare their usefulness. Over the past 10 years formal studies of the application of imaging technology have been conducted and many reports have appeared in the literature. These studies cover a range of clinical situations. Likewise, the methodologies employed for evaluating and comparing the techniques in question have differed widely. While not attempting an exhaustive review of the clinical studies which have been reported, this paper aims to examine the statistical designs and analyses which have been used. First a brief review of the different types of study is given. Examples of each type are then chosen to illustrate statistical issues related to their design and analysis. In the final sections it is argued that a form of classification for these different types of study might be helpful in clarifying relationships between them and bringing a perspective to the field. A classification based upon a limited analogy with clinical trials is suggested

  9. A general approach to posterior contraction in nonparametric inverse problems

    NARCIS (Netherlands)

    Knapik, Bartek; Salomond, Jean Bernard

    In this paper, we propose a general method to derive an upper bound for the contraction rate of the posterior distribution for nonparametric inverse problems. We present a general theorem that allows us to derive contraction rates for the parameter of interest from contraction rates of the related

  10. A semi-nonparametric mixture model for selecting functionally consistent proteins.

    Science.gov (United States)

    Yu, Lianbo; Doerge, Rw

    2010-09-28

    High-throughput technologies have led to a new era of proteomics. Although protein microarray experiments are becoming more commonplace, there are a variety of experimental and statistical issues that have yet to be addressed, and that will carry over to new high-throughput technologies unless they are investigated. One of the largest of these challenges is the selection of functionally consistent proteins. We present a novel semi-nonparametric mixture model for classifying proteins as consistent or inconsistent while controlling the false discovery rate and the false non-discovery rate. The performance of the proposed approach is compared to current methods via simulation under a variety of experimental conditions. We provide a statistical method for selecting functionally consistent proteins in the context of protein microarray experiments, but the proposed semi-nonparametric mixture model method can certainly be generalized to solve other mixture data problems. The main advantage of this approach is that it provides the posterior probability of consistency for each protein.
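
    Once a mixture model supplies a posterior probability of consistency for each protein, one common way to turn those probabilities into a selection with a controlled false discovery rate is to admit proteins in decreasing order of posterior probability while the average posterior error of the selected set stays below a target level. The sketch below illustrates that generic rule; it is an assumption for illustration, not necessarily the selection procedure used in the paper, and the posterior values are made up.

```python
def select_with_posterior_fdr(posteriors, alpha):
    """Indices selected so that the mean of (1 - posterior) over the set is <= alpha."""
    # Most confidently consistent proteins first.
    order = sorted(range(len(posteriors)), key=lambda i: posteriors[i], reverse=True)
    selected, error_sum = [], 0.0
    for i in order:
        new_sum = error_sum + (1.0 - posteriors[i])
        # Stop when adding the next protein pushes the estimated FDR above alpha.
        if new_sum / (len(selected) + 1) > alpha:
            break
        selected.append(i)
        error_sum = new_sum
    return selected

# Hypothetical posterior probabilities of consistency for six proteins.
posteriors = [0.99, 0.40, 0.95, 0.10, 0.90, 0.97]
chosen = select_with_posterior_fdr(posteriors, alpha=0.05)
print(sorted(chosen))
```

    Here the four proteins with posteriors of 0.90 or more are selected, since their average posterior error is 0.0475; adding the protein with posterior 0.40 would raise the estimated false discovery rate well above 5 %.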

  11. Development and testing of improved statistical wind power forecasting methods.

    Energy Technology Data Exchange (ETDEWEB)

    Mendes, J.; Bessa, R.J.; Keko, H.; Sumaili, J.; Miranda, V.; Ferreira, C.; Gama, J.; Botterud, A.; Zhou, Z.; Wang, J. (Decision and Information Sciences); (INESC Porto)

    2011-12-06

    (with spatial and/or temporal dependence). Statistical approaches to uncertainty forecasting basically consist of estimating the uncertainty based on observed forecasting errors. Quantile regression (QR) is currently a commonly used approach in uncertainty forecasting. In Chapter 3, we propose new statistical approaches to the uncertainty estimation problem by employing kernel density forecast (KDF) methods. We use two estimators in both offline and time-adaptive modes, namely, the Nadaraya-Watson (NW) and Quantile-copula (QC) estimators. We conduct detailed tests of the new approaches using QR as a benchmark. One of the major issues in wind power generation is sudden and large changes of wind power output over a short period of time, namely ramping events. In Chapter 4, we perform a comparative study of existing definitions and methodologies for ramp forecasting. We also introduce a new probabilistic method for ramp event detection. The method starts with a stochastic algorithm that generates wind power scenarios, which are passed through a high-pass filter for ramp detection and estimation of the likelihood of ramp events. The report is organized as follows: Chapter 2 presents the results of the application of ITL training criteria to deterministic WPF; Chapter 3 reports the study on probabilistic WPF, including new contributions to wind power uncertainty forecasting; Chapter 4 presents a new method to predict and visualize ramp events, comparing it with state-of-the-art methodologies; Chapter 5 briefly summarizes the main findings and contributions of this report.
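
    A Nadaraya-Watson estimator weights past observations by how close their explanatory value is to the current one. The sketch below applies that idea to the conditional distribution of observed power given a point forecast, using a Gaussian kernel. It is a generic textbook form given as an illustration, not the offline or time-adaptive implementation tested in the report; the data, bandwidth and function name are assumptions.

```python
import math

def nw_conditional_cdf(x_obs, y_obs, x0, y, bandwidth):
    """Nadaraya-Watson estimate of P(Y <= y | X = x0) with a Gaussian kernel."""
    # Kernel weight of each past case, based on its distance to x0.
    weights = [math.exp(-0.5 * ((x0 - xi) / bandwidth) ** 2) for xi in x_obs]
    total = sum(weights)
    # Weighted share of past cases whose observed value did not exceed y.
    return sum(w for w, yi in zip(weights, y_obs) if yi <= y) / total

# Hypothetical history: point forecasts and observed power (both normalized).
forecast = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
observed = [0.05, 0.25, 0.2, 0.45, 0.4, 0.7, 0.65, 0.9]

p = nw_conditional_cdf(forecast, observed, x0=0.5, y=0.5, bandwidth=0.1)
print(p)
```

    Evaluating the function on a grid of `y` values gives a full predictive distribution for the forecast `x0`, from which quantiles can be read off and compared with a quantile regression benchmark.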

  12. Data and statistical methods for analysis of trends and patterns

    International Nuclear Information System (INIS)

    Atwood, C.L.; Gentillon, C.D.; Wilson, G.E.

    1992-11-01

    This report summarizes topics considered at a working meeting on data and statistical methods for analysis of trends and patterns in US commercial nuclear power plants. This meeting was sponsored by the Office of Analysis and Evaluation of Operational Data (AEOD) of the Nuclear Regulatory Commission (NRC). Three data sets are briefly described: Nuclear Plant Reliability Data System (NPRDS), Licensee Event Report (LER) data, and Performance Indicator data. Two types of study are emphasized: screening studies, to see if any trends or patterns appear to be present; and detailed studies, which are more concerned with checking the analysis assumptions, modeling any patterns that are present, and searching for causes. A prescription is given for a screening study, and ideas are suggested for a detailed study, when the data take any of three forms: counts of events per time, counts of events per demand, and non-event data

  13. Nonparametric estimation of location and scale parameters

    KAUST Repository

    Potgieter, C.J.

    2012-12-01

    Two random variables X and Y belong to the same location-scale family if there are constants μ and σ such that Y and μ+σX have the same distribution. In this paper we consider non-parametric estimation of the parameters μ and σ under minimal assumptions regarding the form of the distribution functions of X and Y. We discuss an approach to the estimation problem that is based on asymptotic likelihood considerations. Our results enable us to provide a methodology that can be implemented easily and which yields estimators that are often near optimal when compared to fully parametric methods. We evaluate the performance of the estimators in a series of Monte Carlo simulations. © 2012 Elsevier B.V. All rights reserved.
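
    The location-scale relation can be made concrete with a naive quantile-matching estimator: if Y and μ+σX have the same distribution and σ > 0, then the interquartile range of Y is σ times that of X, and the median of Y is μ plus σ times the median of X. The sketch below uses this simple idea only to illustrate the problem setup; it is not the asymptotic-likelihood method of the paper, and the data are made up.

```python
from statistics import median, quantiles

def estimate_location_scale(x, y):
    """Naive estimates of (mu, sigma) with Y ~ mu + sigma * X, assuming sigma > 0."""
    qx = quantiles(x, n=4)  # quartiles of the X sample
    qy = quantiles(y, n=4)  # quartiles of the Y sample
    sigma = (qy[2] - qy[0]) / (qx[2] - qx[0])  # ratio of interquartile ranges
    mu = median(y) - sigma * median(x)         # match the medians
    return mu, sigma

# Sanity check on an exact transformation; in practice X and Y are separate samples.
x = [0.3, -1.2, 0.8, 2.1, -0.5, 1.4, -2.0, 0.0, 0.9, -0.7]
y = [2.0 + 3.0 * v for v in x]
mu_hat, sigma_hat = estimate_location_scale(x, y)
print(mu_hat, sigma_hat)
```

    Because it uses only three quantiles of each sample, this estimator discards most of the information in the data, which is the kind of inefficiency that more refined nonparametric methods aim to reduce.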

  14. Statistics

    International Nuclear Information System (INIS)

    2001-01-01

    For the year 2000, part of the figures shown in the tables of the Energy Review are preliminary or estimated. The annual statistics of the Energy Review appear in more detail in the publication Energiatilastot - Energy Statistics issued annually, which also includes historical time series over a longer period (see e.g. Energiatilastot 1999, Statistics Finland, Helsinki 2000, ISSN 0785-3165). The inside of the Review's back cover shows the energy units and the conversion coefficients used for them. Explanatory notes to the statistical tables can be found after tables and figures. The figures present: Changes in the volume of GNP and energy consumption, Changes in the volume of GNP and electricity, Coal consumption, Natural gas consumption, Peat consumption, Domestic oil deliveries, Import prices of oil, Consumer prices of principal oil products, Fuel prices for heat production, Fuel prices for electricity production, Carbon dioxide emissions from the use of fossil fuels, Total energy consumption by source and CO2 emissions, Electricity supply, Energy imports by country of origin in 2000, Energy exports by recipient country in 2000, Consumer prices of liquid fuels, Consumer prices of hard coal, natural gas and indigenous fuels, Average electricity price by type of consumer, Price of district heating by type of consumer, Excise taxes, value added taxes and fiscal charges and fees included in consumer prices of some energy sources and Energy taxes and precautionary stock fees on oil products

  15. Statistics

    International Nuclear Information System (INIS)

    2000-01-01

    For the years 1999 and 2000, part of the figures shown in the tables of the Energy Review are preliminary or estimated. The annual statistics of the Energy Review appear in more detail in the publication Energiatilastot - Energy Statistics issued annually, which also includes historical time series over a longer period (see e.g., Energiatilastot 1998, Statistics Finland, Helsinki 1999, ISSN 0785-3165). The inside of the Review's back cover shows the energy units and the conversion coefficients used for them. Explanatory notes to the statistical tables can be found after tables and figures. The figures present: Changes in the volume of GNP and energy consumption, Changes in the volume of GNP and electricity, Coal consumption, Natural gas consumption, Peat consumption, Domestic oil deliveries, Import prices of oil, Consumer prices of principal oil products, Fuel prices for heat production, Fuel prices for electricity production, Carbon dioxide emissions, Total energy consumption by source and CO2 emissions, Electricity supply, Energy imports by country of origin in January-March 2000, Energy exports by recipient country in January-March 2000, Consumer prices of liquid fuels, Consumer prices of hard coal, natural gas and indigenous fuels, Average electricity price by type of consumer, Price of district heating by type of consumer, Excise taxes, value added taxes and fiscal charges and fees included in consumer prices of some energy sources and Energy taxes and precautionary stock fees on oil products

  16. Statistics

    International Nuclear Information System (INIS)

    1999-01-01

    For the years 1998 and 1999, part of the figures shown in the tables of the Energy Review are preliminary or estimated. The annual statistics of the Energy Review appear in more detail in the publication Energiatilastot - Energy Statistics issued annually, which also includes historical time series over a longer period (see e.g. Energiatilastot 1998, Statistics Finland, Helsinki 1999, ISSN 0785-3165). The inside of the Review's back cover shows the energy units and the conversion coefficients used for them. Explanatory notes to the statistical tables can be found after tables and figures. The figures present: Changes in the volume of GNP and energy consumption, Changes in the volume of GNP and electricity, Coal consumption, Natural gas consumption, Peat consumption, Domestic oil deliveries, Import prices of oil, Consumer prices of principal oil products, Fuel prices for heat production, Fuel prices for electricity production, Carbon dioxide emissions, Total energy consumption by source and CO2 emissions, Electricity supply, Energy imports by country of origin in January-June 1999, Energy exports by recipient country in January-June 1999, Consumer prices of liquid fuels, Consumer prices of hard coal, natural gas and indigenous fuels, Average electricity price by type of consumer, Price of district heating by type of consumer, Excise taxes, value added taxes and fiscal charges and fees included in consumer prices of some energy sources and Energy taxes and precautionary stock fees on oil products

  17. Implementation of statistical analysis methods for medical physics data

    International Nuclear Information System (INIS)

    Teixeira, Marilia S.; Pinto, Nivia G.P.; Barroso, Regina C.; Oliveira, Luis F.

    2009-01-01

    The objective of biomedical research with different types of radiation is to contribute to the understanding of the basic physics and biochemistry of biological systems, to disease diagnosis, and to the development of therapeutic techniques. The main benefits are the cure of tumors through therapy, the early detection of diseases through diagnosis, use as a prophylactic measure in blood transfusion, etc. Therefore, a better understanding of the biological interactions occurring after exposure to radiation is necessary for the optimization of therapeutic procedures and of strategies for reducing radioinduced effects. The applied physics group of the Physics Institute of UERJ has been working on the characterization of biological samples (human tissues, teeth, saliva, soil, plants, sediments, air, water, organic matrixes, ceramics, fossil material, among others) using X-ray diffraction and X-ray fluorescence. The application of these techniques to the measurement, analysis and interpretation of biological tissue characteristics is attracting considerable interest in Medical and Environmental Physics. All quantitative data analysis must begin with the calculation of descriptive statistics (means and standard deviations) in order to obtain a preliminary notion of what the analysis will reveal. It is well known that the high values of standard deviation found in experimental measurements of biological samples can be attributed to biological factors, due to the specific characteristics of each individual (age, gender, environment, dietary habits, etc.). The main objective of this work is the development of a program that applies specific statistical methods to optimize the analysis of experimental data. Because the specialized programs for this analysis are proprietary, another objective of this work is the implementation of a code which is free and can be shared with other research groups. As the program developed since the

  18. Statistically qualified neuro-analytic failure detection method and system

    Science.gov (United States)

    Vilim, Richard B.; Garcia, Humberto E.; Chen, Frederick W.

    2002-03-02

    An apparatus and method for monitoring a process involve development and application of a statistically qualified neuro-analytic (SQNA) model to accurately and reliably identify process change. The development of the SQNA model is accomplished in two stages: deterministic model adaption and stochastic model modification of the deterministic model adaptation. Deterministic model adaption involves formulating an analytic model of the process representing known process characteristics, augmenting the analytic model with a neural network that captures unknown process characteristics, and training the resulting neuro-analytic model by adjusting the neural network weights according to a unique scaled equation error minimization technique. Stochastic model modification involves qualifying any remaining uncertainty in the trained neuro-analytic model by formulating a likelihood function, given an error propagation equation, for computing the probability that the neuro-analytic model generates measured process output. Preferably, the developed SQNA model is validated using known sequential probability ratio tests and applied to the process as an on-line monitoring system. Illustrative of the method and apparatus, the method is applied to a peristaltic pump system.
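
The abstract above mentions validation with sequential probability ratio tests (SPRT). As a rough illustration only, the following is a minimal sketch of Wald's SPRT applied to model residuals (measured output minus model prediction). It assumes Gaussian residuals with known standard deviation and a hypothesized mean shift; the function name `sprt`, the parameters `mu1`, `sigma`, `alpha` and `beta`, and the decision labels are illustrative assumptions, not taken from the record.

```python
import math


def sprt(residuals, mu1, sigma, alpha=0.01, beta=0.01):
    """Wald SPRT of H0 (residual mean 0) against H1 (residual mean mu1).

    Assumes independent Gaussian residuals with known standard deviation sigma.
    Returns (decision, number_of_samples_used), where decision is
    "H0", "H1", or "undecided" if the data run out first.
    """
    upper = math.log((1.0 - beta) / alpha)   # crossing this accepts H1
    lower = math.log(beta / (1.0 - alpha))   # crossing this accepts H0
    llr = 0.0                                # cumulative log-likelihood ratio
    for n, r in enumerate(residuals, start=1):
        # Log-likelihood ratio increment for a Gaussian mean shift.
        llr += (mu1 / sigma ** 2) * (r - mu1 / 2.0)
        if llr >= upper:
            return "H1", n
        if llr <= lower:
            return "H0", n
    return "undecided", len(residuals)
```

In a monitoring setting, a decision of "H1" would indicate that the residuals are more consistent with a shifted process than with the nominal model, while "H0" would typically reset the test for the next window.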

  19. Improved Statistical Method For Hydrographic Climatic Records Quality Control

    Science.gov (United States)

    Gourrion, J.; Szekely, T.

    2016-02-01

    Climate research benefits from the continuous development of global in-situ hydrographic networks in the last decades. Apart from the increasing volume of observations available on a large range of temporal and spatial scales, a critical aspect concerns the ability to constantly improve the quality of the datasets. In the context of the Coriolis Dataset for ReAnalysis (CORA) version 4.2, a new quality control method based on a local comparison to historical extreme values ever observed is developed, implemented and validated. Temperature, salinity and potential density validity intervals are directly estimated from minimum and maximum values from an historical reference dataset, rather than from traditional mean and standard deviation estimates. Such an approach avoids strong statistical assumptions on the data distributions such as unimodality, absence of skewness and spatially homogeneous kurtosis. As a new feature, it also allows addressing simultaneously the two main objectives of a quality control strategy, i.e. maximizing the number of good detections while minimizing the number of false alarms. The reference dataset is presently built from the fusion of 1) all ARGO profiles up to early 2014, 2) 3 historical CTD datasets and 3) the Sea Mammals CTD profiles from the MEOP database. All datasets are extensively and manually quality controlled. In this communication, the latest method validation results are also presented. The method has been implemented in the latest version of the CORA dataset and will benefit the next version of the Copernicus CMEMS dataset.
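
The idea of validity intervals built from historical minimum and maximum values, rather than from mean and standard deviation, can be sketched as follows. This is a simplified illustration, not the CORA implementation: the notion of a "cell" (a hypothetical key standing for a local spatial and depth bin), the function names, and the optional `margin` tolerance are all assumptions made for the example.

```python
def build_validity_intervals(reference):
    """Build per-cell (min, max) intervals from a reference dataset.

    `reference` is an iterable of (cell, value) pairs, where `cell` is any
    hashable key identifying a local bin (hypothetical in this sketch).
    """
    bounds = {}
    for cell, value in reference:
        low, high = bounds.get(cell, (value, value))
        bounds[cell] = (min(low, value), max(high, value))
    return bounds


def is_valid(cell, value, bounds, margin=0.0):
    """Check a new observation against the historical extremes of its cell.

    Returns True or False, or None when the cell has no reference data.
    `margin` widens the interval and is an illustrative tuning knob.
    """
    if cell not in bounds:
        return None
    low, high = bounds[cell]
    return low - margin <= value <= high + margin
```

Because the interval comes straight from observed extremes, no assumption about the shape of the local distribution is needed; the quality of the result depends on how well the reference dataset has itself been quality controlled.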

  20. Extending the linear model with R generalized linear, mixed effects and nonparametric regression models

    CERN Document Server

    Faraway, Julian J

    2005-01-01

    Linear models are central to the practice of statistics and form the foundation of a vast range of statistical methodologies. Julian J. Faraway's critically acclaimed Linear Models with R examined regression and analysis of variance, demonstrated the different methods available, and showed in which situations each one applies. Following in those footsteps, Extending the Linear Model with R surveys the techniques that grow from the regression model, presenting three extensions to that framework: generalized linear models (GLMs), mixed effect models, and nonparametric regression models. The author's treatment is thoroughly modern and covers topics that include GLM diagnostics, generalized linear mixed models, trees, and even the use of neural networks in statistics. To demonstrate the interplay of theory and practice, throughout the book the author weaves the use of the R software environment to analyze the data of real examples, providing all of the R commands necessary to reproduce the analyses. All of the ...

  1. A new method to determine the number of experimental data using statistical modeling methods

    Energy Technology Data Exchange (ETDEWEB)

    Jung, Jung-Ho; Kang, Young-Jin; Lim, O-Kaung; Noh, Yoojeong [Pusan National University, Busan (Korea, Republic of)

    2017-06-15

    For analyzing the statistical performance of physical systems, statistical characteristics of physical parameters such as material properties need to be estimated by collecting experimental data. For accurate statistical modeling, many such experiments may be required, but data are usually quite limited owing to the cost and time constraints of experiments. In this study, a new method for determining a reasonable number of experimental data is proposed using an area metric, after obtaining statistical models using the information on the underlying distribution, the Sequential statistical modeling (SSM) approach, and the Kernel density estimation (KDE) approach. The area metric is used as a convergence criterion to determine the necessary and sufficient number of experimental data to be acquired. The proposed method is validated in simulations, using different statistical modeling methods, different true models, and different convergence criteria. An example data set with 29 data describing the fatigue strength coefficient of SAE 950X is used for demonstrating the performance of the obtained statistical models that use a pre-determined number of experimental data in predicting the probability of failure for a target fatigue life.
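
The abstract does not give the exact definition of the area metric, so the following is only a sketch under a stated assumption: the metric is taken to be the area between two cumulative distribution functions, here the empirical CDFs of two samples (for example, the data before and after adding new experiments). The function name `area_metric` is illustrative.

```python
import bisect


def area_metric(sample_a, sample_b):
    """Area between the empirical CDFs of two samples.

    Integrates |F_a(x) - F_b(x)| over x. Both empirical CDFs are step
    functions, so the integral is an exact sum over the merged grid.
    """
    a = sorted(sample_a)
    b = sorted(sample_b)
    grid = sorted(set(a) | set(b))
    area = 0.0
    for left, right in zip(grid, grid[1:]):
        # Value of each empirical CDF on the interval [left, right).
        f_a = bisect.bisect_right(a, left) / len(a)
        f_b = bisect.bisect_right(b, left) / len(b)
        area += abs(f_a - f_b) * (right - left)
    return area
```

Used as a convergence check, one would compute this value each time new data are added and stop collecting once it falls below a chosen threshold; the threshold itself is a modeling decision not specified here.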

  2. Assessment Methods in Statistical Education An International Perspective

    CERN Document Server

    Bidgood, Penelope; Jolliffe, Flavia

    2010-01-01

    This book is a collaboration from leading figures in statistical education and is designed primarily for academic audiences involved in teaching statistics and mathematics. The book is divided into four sections: (1) Assessment using real-world problems, (2) Assessment statistical thinking, (3) Individual assessment, (4) Successful assessment strategies.

  3. Statistical analysis and interpretation of prenatal diagnostic imaging studies, Part 2: descriptive and inferential statistical methods.

    Science.gov (United States)

    Tuuli, Methodius G; Odibo, Anthony O

    2011-08-01

    The objective of this article is to discuss the rationale for common statistical tests used for the analysis and interpretation of prenatal diagnostic imaging studies. Examples from the literature are used to illustrate descriptive and inferential statistics. The uses and limitations of linear and logistic regression analyses are discussed in detail.

  4. Quality in statistics education : Determinants of course outcomes in methods & statistics education at universities and colleges

    NARCIS (Netherlands)

    Verhoeven, P.S.

    2009-01-01

    Although Statistics is not a very popular course according to most students, a majority of students still take it, as it is mandatory at most Social Science departments. Therefore it takes special teacher’s skills to teach statistics. In order to do so it is essential for teachers to know what

  5. A Normalization-Free and Nonparametric Method Sharpens Large-Scale Transcriptome Analysis and Reveals Common Gene Alteration Patterns in Cancers.

    Science.gov (United States)

    Li, Qi-Gang; He, Yong-Han; Wu, Huan; Yang, Cui-Ping; Pu, Shao-Yan; Fan, Song-Qing; Jiang, Li-Ping; Shen, Qiu-Shuo; Wang, Xiao-Xiong; Chen, Xiao-Qiong; Yu, Qin; Li, Ying; Sun, Chang; Wang, Xiangting; Zhou, Jumin; Li, Hai-Peng; Chen, Yong-Bin; Kong, Qing-Peng

    2017-01-01

    Heterogeneity in transcriptional data hampers the identification of differentially expressed genes (DEGs) and understanding of cancer, essentially because current methods rely on cross-sample normalization and/or distribution assumptions, both sensitive to heterogeneous values. Here, we developed a new method, Cross-Value Association Analysis (CVAA), which overcomes the limitation and is more robust to heterogeneous data than the other methods. Applying CVAA to a more complex pan-cancer dataset containing 5,540 transcriptomes discovered numerous new DEGs and many previously rarely explored pathways/processes; some of them were validated, both in vitro and in vivo, to be crucial in tumorigenesis, e.g., alcohol metabolism (ADH1B), chromosome remodeling (NCAPH) and complement system (Adipsin). Together, we present a sharper tool to navigate large-scale expression data and gain new mechanistic insights into tumorigenesis.

  6. Statistics

    International Nuclear Information System (INIS)

    2003-01-01

    For the year 2002, part of the figures shown in the tables of the Energy Review are preliminary. The annual statistics of the Energy Review also include historical time-series over a longer period (see e.g. Energiatilastot 2001, Statistics Finland, Helsinki 2002). The applied energy units and conversion coefficients are shown in the inside back cover of the Review. Explanatory notes to the statistical tables can be found after tables and figures. The figures present: Changes in GDP, energy consumption and electricity consumption, Carbon dioxide emissions from fossil fuels use, Coal consumption, Consumption of natural gas, Peat consumption, Domestic oil deliveries, Import prices of oil, Consumer prices of principal oil products, Fuel prices in heat production, Fuel prices in electricity production, Price of electricity by type of consumer, Average monthly spot prices at the Nord pool power exchange, Total energy consumption by source and CO2 emissions, Supply and total consumption of electricity GWh, Energy imports by country of origin in January-June 2003, Energy exports by recipient country in January-June 2003, Consumer prices of liquid fuels, Consumer prices of hard coal, natural gas and indigenous fuels, Price of natural gas by type of consumer, Price of electricity by type of consumer, Price of district heating by type of consumer, Excise taxes, value added taxes and fiscal charges and fees included in consumer prices of some energy sources and Excise taxes, precautionary stock fees on oil pollution fees on energy products

  7. Statistics

    International Nuclear Information System (INIS)

    2004-01-01

    For the years 2003 and 2004, the figures shown in the tables of the Energy Review are partly preliminary. The annual statistics of the Energy Review also include historical time-series over a longer period (see e.g. Energiatilastot, Statistics Finland, Helsinki 2003, ISSN 0785-3165). The applied energy units and conversion coefficients are shown in the inside back cover of the Review. Explanatory notes to the statistical tables can be found after tables and figures. The figures present: Changes in GDP, energy consumption and electricity consumption, Carbon dioxide emissions from fossil fuels use, Coal consumption, Consumption of natural gas, Peat consumption, Domestic oil deliveries, Import prices of oil, Consumer prices of principal oil products, Fuel prices in heat production, Fuel prices in electricity production, Price of electricity by type of consumer, Average monthly spot prices at the Nord pool power exchange, Total energy consumption by source and CO2 emissions, Supplies and total consumption of electricity GWh, Energy imports by country of origin in January-March 2004, Energy exports by recipient country in January-March 2004, Consumer prices of liquid fuels, Consumer prices of hard coal, natural gas and indigenous fuels, Price of natural gas by type of consumer, Price of electricity by type of consumer, Price of district heating by type of consumer, Excise taxes, value added taxes and fiscal charges and fees included in consumer prices of some energy sources and Excise taxes, precautionary stock fees on oil pollution fees

  8. Statistics

    International Nuclear Information System (INIS)

    2000-01-01

    For the years 1999 and 2000, part of the figures shown in the tables of the Energy Review are preliminary or estimated. The annual statistics of the Energy Review also include historical time series over a longer period (see e.g., Energiatilastot 1999, Statistics Finland, Helsinki 2000, ISSN 0785-3165). The inside of the Review's back cover shows the energy units and the conversion coefficients used for them. Explanatory notes to the statistical tables can be found after tables and figures. The figures present: Changes in the volume of GNP and energy consumption, Changes in the volume of GNP and electricity, Coal consumption, Natural gas consumption, Peat consumption, Domestic oil deliveries, Import prices of oil, Consumer prices of principal oil products, Fuel prices for heat production, Fuel prices for electricity production, Carbon dioxide emissions, Total energy consumption by source and CO2 emissions, Electricity supply, Energy imports by country of origin in January-June 2000, Energy exports by recipient country in January-June 2000, Consumer prices of liquid fuels, Consumer prices of hard coal, natural gas and indigenous fuels, Average electricity price by type of consumer, Price of district heating by type of consumer, Excise taxes, value added taxes and fiscal charges and fees included in consumer prices of some energy sources and Energy taxes and precautionary stock fees on oil products

  9. Statistical method to compare massive parallel sequencing pipelines.

    Science.gov (United States)

    Elsensohn, M H; Leblay, N; Dimassi, S; Campan-Fournier, A; Labalme, A; Roucher-Boulez, F; Sanlaville, D; Lesca, G; Bardel, C; Roy, P

    2017-03-01

    Today, sequencing is frequently carried out by Massive Parallel Sequencing (MPS), which drastically cuts sequencing time and expenses. Nevertheless, Sanger sequencing remains the main validation method to confirm the presence of variants. The analysis of MPS data involves the development of several bioinformatic tools, academic or commercial. We present here a statistical method to compare MPS pipelines and test it in a comparison between an academic (BWA-GATK) and a commercial pipeline (TMAP-NextGENe®), with and without reference to a gold standard (here, Sanger sequencing), on a panel of 41 genes in 43 epileptic patients. This method used the number of variants to fit log-linear models for pairwise agreements between pipelines. To assess the heterogeneity of the margins and the odds ratios of agreement, four log-linear models were used: a full model, a homogeneous-margin model, a model with single odds ratio for all patients, and a model with single intercept. Then a log-linear mixed model was fitted considering the biological variability as a random effect. Among the 390,339 base-pairs sequenced, TMAP-NextGENe® and BWA-GATK found, on average, 2253.49 and 1857.14 variants (single nucleotide variants and indels), respectively. Against the gold standard, the pipelines had similar sensitivities (63.47% vs. 63.42%) and close but significantly different specificities (99.57% vs. 99.65%; p < 0.001). Same-trend results were obtained when only single nucleotide variants were considered (99.98% specificity and 76.81% sensitivity for both pipelines). The method thus allows pipeline comparison and selection. It is generalizable to all types of MPS data and all pipelines.

  10. CATDAT : A Program for Parametric and Nonparametric Categorical Data Analysis : User's Manual Version 1.0, 1998-1999 Progress Report.

    Energy Technology Data Exchange (ETDEWEB)

    Peterson, James T.

    1999-12-01

    Natural resource professionals are increasingly required to develop rigorous statistical models that relate environmental data to categorical response data. Recent advances in the statistical and computing sciences have led to the development of sophisticated methods for parametric and nonparametric analysis of data with categorical responses. The statistical software package CATDAT was designed to make some of these relatively new and powerful techniques available to scientists. The CATDAT statistical package includes 4 analytical techniques: generalized logit modeling; binary classification tree; extended K-nearest neighbor classification; and modular neural network.

  11. kruX: matrix-based non-parametric eQTL discovery.

    Science.gov (United States)

    Qi, Jianlong; Asl, Hassan Foroughi; Björkegren, Johan; Michoel, Tom

    2014-01-14

    The Kruskal-Wallis test is a popular non-parametric statistical test for identifying expression quantitative trait loci (eQTLs) from genome-wide data due to its robustness against variations in the underlying genetic model and expression trait distribution, but testing billions of marker-trait combinations one-by-one can become computationally prohibitive. We developed kruX, an algorithm implemented in Matlab, Python and R that uses matrix multiplications to simultaneously calculate the Kruskal-Wallis test statistic for several millions of marker-trait combinations at once. KruX is more than ten thousand times faster than computing associations one-by-one on a typical human dataset. We used kruX and a dataset of more than 500k SNPs and 20k expression traits measured in 102 human blood samples to compare eQTLs detected by the Kruskal-Wallis test to eQTLs detected by the parametric ANOVA and linear model methods. We found that the Kruskal-Wallis test is more robust against data outliers and heterogeneous genotype group sizes and detects a higher proportion of non-linear associations, but is more conservative for calling additive linear associations. kruX enables the use of robust non-parametric methods for massive eQTL mapping without the need for a high-performance computing infrastructure and is freely available from http://krux.googlecode.com.
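
To illustrate how matrix multiplication can produce Kruskal-Wallis statistics for many marker-trait pairs at once, here is a minimal NumPy sketch. It is not the kruX code: the function name `kruskal_wallis_matrix` and the input layout (markers by samples, traits by samples) are assumptions, and the sketch assumes no ties and no missing values in the expression data, so no tie correction is applied.

```python
import numpy as np


def kruskal_wallis_matrix(genotypes, expression):
    """Kruskal-Wallis H statistic for every marker-trait pair.

    genotypes:  (n_markers, n_samples) array of genotype group labels.
    expression: (n_traits, n_samples) array of expression values,
                assumed free of ties and missing values.
    Returns an (n_markers, n_traits) array of H statistics.
    """
    genotypes = np.asarray(genotypes)
    expression = np.asarray(expression, dtype=float)
    n_samples = expression.shape[1]

    # Rank each trait across samples (ranks 1..N; valid when there are no ties).
    ranks = expression.argsort(axis=1).argsort(axis=1) + 1.0

    total = np.zeros((genotypes.shape[0], expression.shape[0]))
    for label in np.unique(genotypes):
        indicator = (genotypes == label).astype(float)  # markers x samples
        sizes = indicator.sum(axis=1)                   # group size per marker
        rank_sums = indicator @ ranks.T                 # markers x traits
        # Empty groups have zero rank sum, so dividing by 1 contributes 0.
        safe_sizes = np.where(sizes > 0, sizes, 1.0)
        total += rank_sums ** 2 / safe_sizes[:, None]

    return 12.0 / (n_samples * (n_samples + 1)) * total - 3.0 * (n_samples + 1)
```

The single product `indicator @ ranks.T` replaces a loop over all marker-trait pairs: one multiplication per genotype group yields the rank sums for every pair, which is the part of the computation that benefits most from optimized linear algebra routines.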

  12. Surface Estimation, Variable Selection, and the Nonparametric Oracle Property.

    Science.gov (United States)

    Storlie, Curtis B; Bondell, Howard D; Reich, Brian J; Zhang, Hao Helen

    2011-04-01

    Variable selection for multivariate nonparametric regression is an important, yet challenging, problem due, in part, to the infinite dimensionality of the function space. An ideal selection procedure should be automatic, stable, easy to use, and have desirable asymptotic properties. In particular, we define a selection procedure to be nonparametric oracle (np-oracle) if it consistently selects the correct subset of predictors and at the same time estimates the smooth surface at the optimal nonparametric rate, as the sample size goes to infinity. In this paper, we propose a model selection procedure for nonparametric models, and explore the conditions under which the new method enjoys the aforementioned properties. Developed in the framework of smoothing spline ANOVA, our estimator is obtained via solving a regularization problem with a novel adaptive penalty on the sum of functional component norms. Theoretical properties of the new estimator are established. Additionally, numerous simulated and real examples further demonstrate that the new approach substantially outperforms other existing methods in the finite sample setting.

  13. Hydrologic extremes - an intercomparison of multiple gridded statistical downscaling methods

    Science.gov (United States)

    Werner, Arelia T.; Cannon, Alex J.

    2016-04-01

    Gridded statistical downscaling methods are the main means of preparing climate model data to drive distributed hydrological models. Past work on the validation of climate downscaling methods has focused on temperature and precipitation, with less attention paid to the ultimate outputs from hydrological models. Also, as attention shifts towards projections of extreme events, downscaling comparisons now commonly assess methods in terms of climate extremes, but hydrologic extremes are less well explored. Here, we test the ability of gridded downscaling models to replicate historical properties of climate and hydrologic extremes, as measured in terms of temporal sequencing (i.e. correlation tests) and distributional properties (i.e. tests for equality of probability distributions). Outputs from seven downscaling methods - bias correction constructed analogues (BCCA), double BCCA (DBCCA), BCCA with quantile mapping reordering (BCCAQ), bias correction spatial disaggregation (BCSD), BCSD using minimum/maximum temperature (BCSDX), the climate imprint delta method (CI), and bias corrected CI (BCCI) - are used to drive the Variable Infiltration Capacity (VIC) model over the snow-dominated Peace River basin, British Columbia. Outputs are tested using split-sample validation on 26 climate extremes indices (ClimDEX) and two hydrologic extremes indices (3-day peak flow and 7-day peak flow). To characterize observational uncertainty, four atmospheric reanalyses are used as climate model surrogates and two gridded observational data sets are used as downscaling target data. The skill of the downscaling methods generally depended on reanalysis and gridded observational data set. However, CI failed to reproduce the distribution and BCSD and BCSDX the timing of winter 7-day low-flow events, regardless of reanalysis or observational data set. Overall, DBCCA passed the greatest number of tests for the ClimDEX indices, while BCCAQ, which is designed to more accurately resolve event

  14. Multiresolution, Geometric, and Learning Methods in Statistical Image Processing, Object Recognition, and Sensor Fusion

    National Research Council Canada - National Science Library

    Willsky, Alan

    2004-01-01

    .... Our research blends methods from several fields: statistics and probability, signal and image processing, mathematical physics, scientific computing, statistical learning theory, and differential...

  15. Improved statistical method for temperature and salinity quality control

    Science.gov (United States)

    Gourrion, Jérôme; Szekely, Tanguy

    2017-04-01

    Climate research and Ocean monitoring benefit from the continuous development of global in-situ hydrographic networks in the last decades. Apart from the increasing volume of observations available on a large range of temporal and spatial scales, a critical aspect concerns the ability to constantly improve the quality of the datasets. In the context of the Coriolis Dataset for ReAnalysis (CORA) version 4.2, a new quality control method based on a local comparison to historical extreme values ever observed is developed, implemented and validated. Temperature, salinity and potential density validity intervals are directly estimated from minimum and maximum values from an historical reference dataset, rather than from traditional mean and standard deviation estimates. Such an approach avoids strong statistical assumptions on the data distributions such as unimodality, absence of skewness and spatially homogeneous kurtosis. As a new feature, it also allows addressing simultaneously the two main objectives of an automatic quality control strategy, i.e. maximizing the number of good detections while minimizing the number of false alarms. The reference dataset is presently built from the fusion of 1) all ARGO profiles up to late 2015, 2) 3 historical CTD datasets and 3) the Sea Mammals CTD profiles from the MEOP database. All datasets are extensively and manually quality controlled. In this communication, the latest method validation results are also presented. The method has already been implemented in the latest version of the delayed-time CMEMS in-situ dataset and will be deployed soon in the equivalent near-real time products.

  16. Statistical error estimation of the Feynman-α method using the bootstrap method

    International Nuclear Information System (INIS)

    Endo, Tomohiro; Yamamoto, Akio; Yagi, Takahiro; Pyeon, Cheol Ho

    2016-01-01

    The applicability of the bootstrap method to estimating the statistical error of the Feynman-α method, a subcritical measurement technique based on reactor noise analysis, is investigated. In the Feynman-α method, the statistical error can be estimated simply from multiple measurements of reactor noise; however, repeating the measurement requires additional measurement time. Using a resampling technique called the 'bootstrap method', the standard deviation and confidence interval of measurement results obtained by the Feynman-α method can be estimated as the statistical error from only a single measurement of reactor noise. In order to validate our proposed technique, we carried out a passive measurement of reactor noise without any external source, i.e. with only the inherent neutron source from spontaneous fission and (α,n) reactions in nuclear fuels, at the Kyoto University Criticality Assembly. Through the actual measurement, it is confirmed that the bootstrap method is applicable to approximately estimate the statistical error of measurement results obtained by the Feynman-α method. (author)
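
The resampling idea can be sketched in a few lines. This is an illustration under simplifying assumptions, not the authors' procedure: it treats the neutron counts per time gate as independent and resamples them with replacement, whereas real reactor noise data are correlated in time and may need a more careful resampling scheme. The names `feynman_y` and `bootstrap_error`, and the use of a percentile interval, are choices made for this example.

```python
import random
import statistics


def feynman_y(counts):
    """Variance-to-mean ratio minus one for counts in a fixed time gate."""
    return statistics.variance(counts) / statistics.mean(counts) - 1.0


def bootstrap_error(counts, n_resamples=1000, seed=0):
    """Bootstrap standard deviation and 95% percentile interval of feynman_y.

    Assumes the gate counts can be resampled independently, which is a
    simplification for time-correlated reactor noise.
    """
    rng = random.Random(seed)
    n = len(counts)
    replicates = []
    for _ in range(n_resamples):
        resample = rng.choices(counts, k=n)   # draw n counts with replacement
        replicates.append(feynman_y(resample))
    replicates.sort()
    std = statistics.stdev(replicates)
    low = replicates[int(0.025 * n_resamples)]
    high = replicates[int(0.975 * n_resamples) - 1]
    return std, (low, high)
```

The point of the approach is that all replicates come from one recorded measurement, so the error estimate costs computing time rather than additional measurement time.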

  17. Screen Wars, Star Wars, and Sequels: Nonparametric Reanalysis of Movie Profitability

    OpenAIRE

    W. D. Walls

    2012-01-01

    In this paper we use nonparametric statistical tools to quantify motion-picture profit. We quantify the unconditional distribution of profit, the distribution of profit conditional on stars and sequels, and we also model the conditional expectation of movie profits using a non-parametric data-driven regression model. The flexibility of the non-parametric approach accommodates the full range of possible relationships among the variables without prior specification of a functional form, thereb...

  18. A statistical method for draft tube pressure pulsation analysis

    International Nuclear Information System (INIS)

    Doerfler, P K; Ruchonnet, N

    2012-01-01

    Draft tube pressure pulsation (DTPP) in Francis turbines is composed of various components originating from different physical phenomena. These components may be separated because they differ by their spatial relationships and by their propagation mechanism. The first step for such an analysis was to distinguish between so-called synchronous and asynchronous pulsations; only approximately periodic phenomena could be described in this manner. However, less regular pulsations are always present, and these become important when turbines have to operate in the far off-design range, in particular at very low load. The statistical method described here makes it possible to separate the stochastic (random) component from the two traditional 'regular' components. It works in connection with the standard technique of model testing with several pressure signals measured in the draft tube cone. The difference between the individual signals and the averaged pressure signal, together with the coherence between the individual pressure signals, is used for analysis. An example reveals that a generalized, non-periodic version of the asynchronous pulsation is important at low load.

  19. Information Geometry, Inference Methods and Chaotic Energy Levels Statistics

    OpenAIRE

    Cafaro, Carlo

    2008-01-01

    In this Letter, we propose a novel information-geometric characterization of chaotic (integrable) energy level statistics of a quantum antiferromagnetic Ising spin chain in a tilted (transverse) external magnetic field. Finally, we conjecture our results might find some potential physical applications in quantum energy level statistics.

  20. Statistical methods for decision making in mine action

    DEFF Research Database (Denmark)

    Larsen, Jan

    The lecture discusses the basics of statistical decision making in connection with humanitarian mine action. There is special focus on: 1) requirements for mine detection; 2) design and evaluation of mine equipment; 3) performance improvement by statistical learning and information fusion; 4...

  1. Effect on Prediction when Modeling Covariates in Bayesian Nonparametric Models.

    Science.gov (United States)

    Cruz-Marcelo, Alejandro; Rosner, Gary L; Müller, Peter; Stewart, Clinton F

    2013-04-01

    In biomedical research, it is often of interest to characterize biologic processes giving rise to observations and to make predictions of future observations. Bayesian nonparametric methods provide a means for carrying out Bayesian inference making as few assumptions about restrictive parametric models as possible. There are several proposals in the literature for extending Bayesian nonparametric models to include dependence on covariates. Limited attention, however, has been directed to the following two aspects. In this article, we examine the effect on fitting and predictive performance of incorporating covariates in a class of Bayesian nonparametric models by one of two primary ways: either in the weights or in the locations of a discrete random probability measure. We show that different strategies for incorporating continuous covariates in Bayesian nonparametric models can result in big differences when used for prediction, even though they lead to otherwise similar posterior inferences. When one needs the predictive density, as in optimal design, and this density is a mixture, it is better to make the weights depend on the covariates. We demonstrate these points via a simulated data example and in an application in which one wants to determine the optimal dose of an anticancer drug used in pediatric oncology.

  2. Nonparametric Bayesian Modeling of Complex Networks

    DEFF Research Database (Denmark)

    Schmidt, Mikkel Nørgaard; Mørup, Morten

    2013-01-01

    Modeling structure in complex networks using Bayesian nonparametrics makes it possible to specify flexible model structures and infer the adequate model complexity from the observed data. This article provides a gentle introduction to nonparametric Bayesian modeling of complex networks: Using an infinite mixture model as running example, we go through the steps of deriving the model as an infinite limit of a finite parametric model, inferring the model parameters by Markov chain Monte Carlo, and checking the model's fit and predictive performance. We explain how advanced nonparametric models...

  3. Nonparametric tests for equality of psychometric functions.

    Science.gov (United States)

    García-Pérez, Miguel A; Núñez-Antón, Vicente

    2017-12-07

    Many empirical studies measure psychometric functions (curves describing how observers' performance varies with stimulus magnitude) because these functions capture the effects of experimental conditions. To assess these effects, parametric curves are often fitted to the data and comparisons are carried out by testing for equality of mean parameter estimates across conditions. This approach is parametric and, thus, vulnerable to violations of the implied assumptions. Furthermore, testing for equality of means of parameters may be misleading: Psychometric functions may vary meaningfully across conditions on an observer-by-observer basis with no effect on the mean values of the estimated parameters. Alternative approaches to assess equality of psychometric functions per se are thus needed. This paper compares three nonparametric tests that are applicable in all situations of interest: The existing generalized Mantel-Haenszel test, a generalization of the Berry-Mielke test that was developed here, and a split variant of the generalized Mantel-Haenszel test also developed here. Their statistical properties (accuracy and power) are studied via simulation and the results show that all tests are indistinguishable as to accuracy but they differ non-uniformly as to power. Empirical use of the tests is illustrated via analyses of published data sets and practical recommendations are given. The computer code in MATLAB and R to conduct these tests is available as Electronic Supplemental Material.

  4. Statistics a guide to the use of statistical methods in the physical sciences

    CERN Document Server

    Barlow, Roger J

    1989-01-01

    The Manchester Physics Series General Editors: D. J. Sandiford; F. Mandl; A. C. Phillips Department of Physics and Astronomy, University of Manchester Properties of Matter B. H. Flowers and E. Mendoza Optics Second Edition F. G. Smith and J. H. Thomson Statistical Physics Second Edition F. Mandl Electromagnetism Second Edition I. S. Grant and W. R. Phillips Statistics R. J. Barlow Solid State Physics Second Edition J. R. Hook and H. E. Hall Quantum Mechanics F. Mandl Particle Physics Second Edition B. R. Martin and G. Shaw The Physics of Stars Second Edition A.C. Phillips Computing for Scienti

  5. Seismic Signal Compression Using Nonparametric Bayesian Dictionary Learning via Clustering

    Directory of Open Access Journals (Sweden)

    Xin Tian

    2017-06-01

    We introduce a seismic signal compression method based on nonparametric Bayesian dictionary learning via clustering. The seismic data is compressed patch by patch, and the dictionary is learned online. Clustering is introduced for dictionary learning. A set of dictionaries could be generated, and each dictionary is used for one cluster’s sparse coding. In this way, the signals in one cluster could be well represented by their corresponding dictionaries. A nonparametric Bayesian dictionary learning method is used to learn the dictionaries, which naturally infers an appropriate dictionary size for each cluster. A uniform quantizer and an adaptive arithmetic coding algorithm are adopted to code the sparse coefficients. With comparisons to other state-of-the-art approaches, the effectiveness of the proposed method could be validated in the experiments.

  6. Robust Control Methods for On-Line Statistical Learning

    Directory of Open Access Journals (Sweden)

    Capobianco Enrico

    2001-01-01

    Ensuring that data processing in an experiment is not affected by the presence of outliers is relevant for statistical control and learning studies. Learning schemes should thus be tested for their capacity to handle outliers in the observed training set so as to achieve reliable estimates with respect to the crucial bias and variance aspects. We describe possible ways of endowing neural networks with statistically robust properties by defining feasible error criteria. It is convenient to cast neural nets in state space representations and apply both Kalman filter and stochastic approximation procedures in order to suggest statistically robustified solutions for on-line learning.

  7. Statistics and scientific method: an introduction for students and researchers

    National Research Council Canada - National Science Library

    Diggle, Peter; Chetwynd, Amanda

    2011-01-01

    "Most introductory statistics text-books are written either in a highly mathematical style for an intended readership of mathematics undergraduate students, or in a recipe-book style for an intended...

  8. Using Statistical Process Control Methods to Classify Pilot Mental Workloads

    National Research Council Canada - National Science Library

    Kudo, Terence

    2001-01-01

    .... These include cardiac, ocular, respiratory, and brain activity measures. The focus of this effort is to apply statistical process control methodology on different psychophysiological features in an attempt to classify pilot mental workload...

  9. Sensory evaluation of food: statistical methods and procedures

    National Research Council Canada - National Science Library

    O'Mahony, Michael

    1986-01-01

    The aim of this book is to provide basic knowledge of the logic and computation of statistics for the sensory evaluation of food, or for other forms of sensory measurement encountered in, say, psychophysics...

  10. Essays on nonparametric econometrics of stochastic volatility

    NARCIS (Netherlands)

    Zu, Y.

    2012-01-01

    Volatility is a concept that describes the variation of financial returns. Measuring and modelling volatility dynamics is an important aspect of financial econometrics. This thesis is concerned with nonparametric approaches to volatility measurement and volatility model validation.

  11. Highly Robust Statistical Methods in Medical Image Analysis

    Czech Academy of Sciences Publication Activity Database

    Kalina, Jan

    2012-01-01

    Vol. 32, No. 2 (2012), pp. 3-16 ISSN 0208-5216 R&D Projects: GA MŠk(CZ) 1M06014 Institutional research plan: CEZ:AV0Z10300504 Keywords: robust statistics * classification * faces * robust image analysis * forensic science Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 0.208, year: 2012 http://www.ibib.waw.pl/bbe/bbefulltext/BBE_32_2_003_FT.pdf

  12. Nonparametric Efficiency Testing of Asian Stock Markets Using Weekly Data

    OpenAIRE

    CORNELIS A. LOS

    2004-01-01

    The efficiency of speculative markets, as represented by Fama's 1970 fair game model, is tested on weekly price index data of six Asian stock markets - Hong Kong, Indonesia, Malaysia, Singapore, Taiwan and Thailand - using Sherry's (1992) non-parametric methods. These scientific testing methods were originally developed to analyze the information processing efficiency of nervous systems. In particular, the stationarity and independence of the price innovations are tested over ten years, from ...

  13. Teaching biology through statistics: application of statistical methods in genetics and zoology courses.

    Science.gov (United States)

    Colon-Berlingeri, Migdalisel; Burrowes, Patricia A

    2011-01-01

    Incorporation of mathematics into biology curricula is critical to underscore for undergraduate students the relevance of mathematics to most fields of biology and the usefulness of developing quantitative process skills demanded in modern biology. At our institution, we have made significant changes to better integrate mathematics into the undergraduate biology curriculum. The curricular revision included changes in the suggested course sequence, addition of statistics and precalculus as prerequisites to core science courses, and incorporating interdisciplinary (math-biology) learning activities in genetics and zoology courses. In this article, we describe the activities developed for these two courses and the assessment tools used to measure the learning that took place with respect to biology and statistics. We distinguished the effectiveness of these learning opportunities in helping students improve their understanding of the math and statistical concepts addressed and, more importantly, their ability to apply them to solve a biological problem. We also identified areas that need emphasis in both biology and mathematics courses. In light of our observations, we recommend best practices that biology and mathematics academic departments can implement to train undergraduates for the demands of modern biology.

  14. Cluster size statistic and cluster mass statistic: two novel methods for identifying changes in functional connectivity between groups or conditions.

    Science.gov (United States)

    Ing, Alex; Schwarzbauer, Christian

    2014-01-01

    Functional connectivity has become an increasingly important area of research in recent years. At a typical spatial resolution, approximately 300 million connections link each voxel in the brain with every other. This pattern of connectivity is known as the functional connectome. Connectivity is often compared between experimental groups and conditions. Standard methods used to control the type 1 error rate are likely to be insensitive when comparisons are carried out across the whole connectome, due to the huge number of statistical tests involved. To address this problem, two new cluster based methods--the cluster size statistic (CSS) and cluster mass statistic (CMS)--are introduced to control the family wise error rate across all connectivity values. These methods operate within a statistical framework similar to the cluster based methods used in conventional task based fMRI. Both methods are data driven, permutation based and require minimal statistical assumptions. Here, the performance of each procedure is evaluated in a receiver operator characteristic (ROC) analysis, utilising a simulated dataset. The relative sensitivity of each method is also tested on real data: BOLD (blood oxygen level dependent) fMRI scans were carried out on twelve subjects under normal conditions and during the hypercapnic state (induced through the inhalation of 6% CO2 in 21% O2 and 73% N2). Both CSS and CMS detected significant changes in connectivity between normal and hypercapnic states. A family wise error correction carried out at the individual connection level exhibited no significant changes in connectivity.
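
    The permutation-based control of the family wise error rate mentioned here can be illustrated with a simpler relative of the cluster statistics: a sign-flip permutation test that compares each connection against the null distribution of the maximum statistic. This is not the CSS or CMS procedure itself; the function name, the paired-difference layout and the simulated data are assumptions made for the sketch.

```python
import numpy as np

def max_stat_permutation(diff, n_perm=2000, seed=0):
    """Family-wise-error-corrected p-values from a max-statistic sign-flip test.

    diff: array of shape (n_subjects, n_connections) holding paired differences
    (condition A minus condition B) for each subject and connection.
    """
    rng = np.random.default_rng(seed)
    d = np.asarray(diff, dtype=float)
    n = d.shape[0]

    def abs_t(x):
        # One-sample t statistic per connection, in absolute value.
        return np.abs(x.mean(axis=0) / (x.std(axis=0, ddof=1) / np.sqrt(n)))

    observed = abs_t(d)
    null_max = np.empty(n_perm)
    for i in range(n_perm):
        # Under the null, the sign of each subject's difference is exchangeable.
        signs = rng.choice([-1.0, 1.0], size=(n, 1))
        null_max[i] = abs_t(d * signs).max()

    # Corrected p-value: share of permutations whose maximum reaches the observed value.
    exceed = (null_max[None, :] >= observed[:, None]).sum(axis=1)
    return observed, (1 + exceed) / (n_perm + 1)

# Toy usage: 12 subjects, 50 connections, a real effect only in connection 0.
rng = np.random.default_rng(42)
diff = rng.normal(size=(12, 50))
diff[:, 0] += 3.0
t_obs, p_fwe = max_stat_permutation(diff)
```

    Comparing against the maximum over all connections is what controls the family wise error; cluster-level statistics such as those in the abstract replace the per-connection statistic with a summary of connected suprathreshold sets.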

  15. Statistical methods and applications from a historical perspective selected issues

    CERN Document Server

    Mignani, Stefania

    2014-01-01

    The book showcases a selection of peer-reviewed papers, the preliminary versions of which were presented at a conference held 11-13 June 2011 in Bologna and organized jointly by the Italian Statistical Society (SIS), the National Institute of Statistics (ISTAT) and the Bank of Italy. The theme of the conference was "Statistics in the 150 years of the Unification of Italy." The celebration of the anniversary of Italian unification provided the opportunity to examine and discuss the methodological aspects and applications from a historical perspective and both from a national and international point of view. The critical discussion on the issues of the past has made it possible to focus on recent advances, considering the studies of socio-economic and demographic changes in European countries.

  16. Statistical methods to evaluate thermoluminescence ionizing radiation dosimetry data

    International Nuclear Information System (INIS)

    Segre, Nadia; Matoso, Erika; Fagundes, Rosane Correa

    2011-01-01

    Ionizing radiation levels, evaluated through the exposure of CaF2:Dy thermoluminescence dosimeters (TLD-200), have been monitored at Centro Experimental Aramar (CEA), located at Ipero in Sao Paulo state, Brazil, since 1991 resulting in a large amount of measurements until 2009 (more than 2,000). The data amount associated with measurements dispersion, since every process has deviation, reinforces the utilization of statistical tools to evaluate the results, procedure also imposed by the Brazilian Standard CNEN-NN-3.01/PR-3.01-008 which regulates the radiometric environmental monitoring. Thermoluminescence ionizing radiation dosimetry data are statistically compared in order to evaluate potential CEA's activities environmental impact. The statistical tools discussed in this work are box plots, control charts and analysis of variance. (author)

  17. Statistical methods for quantitative mass spectrometry proteomic experiments with labeling

    Directory of Open Access Journals (Sweden)

    Oberg Ann L

    2012-11-01

    Mass Spectrometry utilizing labeling allows multiple specimens to be subjected to mass spectrometry simultaneously. As a result, between-experiment variability is reduced. Here we describe use of fundamental concepts of statistical experimental design in the labeling framework in order to minimize variability and avoid biases. We demonstrate how to export data in the format that is most efficient for statistical analysis. We demonstrate how to assess the need for normalization, perform normalization, and check whether it worked. We describe how to build a model explaining the observed values and test for differential protein abundance along with descriptive statistics and measures of reliability of the findings. Concepts are illustrated through the use of three case studies utilizing the iTRAQ 4-plex labeling protocol.

  18. Statistical methods for quantitative mass spectrometry proteomic experiments with labeling.

    Science.gov (United States)

    Oberg, Ann L; Mahoney, Douglas W

    2012-01-01

    Mass Spectrometry utilizing labeling allows multiple specimens to be subjected to mass spectrometry simultaneously. As a result, between-experiment variability is reduced. Here we describe use of fundamental concepts of statistical experimental design in the labeling framework in order to minimize variability and avoid biases. We demonstrate how to export data in the format that is most efficient for statistical analysis. We demonstrate how to assess the need for normalization, perform normalization, and check whether it worked. We describe how to build a model explaining the observed values and test for differential protein abundance along with descriptive statistics and measures of reliability of the findings. Concepts are illustrated through the use of three case studies utilizing the iTRAQ 4-plex labeling protocol.

  19. Adaptive Maneuvering Frequency Method of Current Statistical Model

    Institute of Scientific and Technical Information of China (English)

    Wei Sun; Yongjian Yang

    2017-01-01

    The current statistical model (CSM) performs well in maneuvering target tracking. However, a fixed maneuvering frequency degrades the tracking results, causing a serious dynamic delay, slow convergence and limited precision when the Kalman filter (KF) algorithm is used. In this study, a new current statistical model and a new Kalman filter are proposed to improve the performance of maneuvering target tracking. The new model, which employs an innovation-dominated subjection function to adaptively adjust the maneuvering frequency, performs better in step maneuvering target tracking, although a fluctuation phenomenon appears. To address this problem, a new adaptive fading Kalman filter is also proposed. In the new Kalman filter, the prediction values are amended in time by setting judgment and amendment rules, so that the tracking precision and the fluctuation phenomenon of the new current statistical model are improved. The results of simulation indicate the effectiveness of the new algorithm and its practical guiding significance.

  20. Bayesian nonparametric adaptive control using Gaussian processes.

    Science.gov (United States)

    Chowdhary, Girish; Kingravi, Hassan A; How, Jonathan P; Vela, Patricio A

    2015-03-01

    Most current model reference adaptive control (MRAC) methods rely on parametric adaptive elements, in which the number of parameters of the adaptive element is fixed a priori, often through expert judgment. An example of such an adaptive element is radial basis function networks (RBFNs), with RBF centers preallocated based on the expected operating domain. If the system operates outside of the expected operating domain, this adaptive element can become noneffective in capturing and canceling the uncertainty, thus rendering the adaptive controller only semiglobal in nature. This paper investigates a Gaussian process-based Bayesian MRAC architecture (GP-MRAC), which leverages the power and flexibility of GP Bayesian nonparametric models of uncertainty. The GP-MRAC does not require the centers to be preallocated, can inherently handle measurement noise, and enables MRAC to handle a broader set of uncertainties, including those that are defined as distributions over functions. We use stochastic stability arguments to show that GP-MRAC guarantees good closed-loop performance with no prior domain knowledge of the uncertainty. Online implementable GP inference methods are compared in numerical simulations against RBFN-MRAC with preallocated centers and are shown to provide better tracking and improved long-term learning.

  1. Appropriate statistical methods are required to assess diagnostic tests for replacement, add-on, and triage

    NARCIS (Netherlands)

    Hayen, Andrew; Macaskill, Petra; Irwig, Les; Bossuyt, Patrick

    2010-01-01

    To explain which measures of accuracy and which statistical methods should be used in studies to assess the value of a new binary test as a replacement test, an add-on test, or a triage test. Selection and explanation of statistical methods, illustrated with examples. Statistical methods for

  2. Debating Curricular Strategies for Teaching Statistics and Research Methods: What Does the Current Evidence Suggest?

    Science.gov (United States)

    Barron, Kenneth E.; Apple, Kevin J.

    2014-01-01

    Coursework in statistics and research methods is a core requirement in most undergraduate psychology programs. However, is there an optimal way to structure and sequence methodology courses to facilitate student learning? For example, should statistics be required before research methods, should research methods be required before statistics, or…

  3. Fuzzy comprehensive evaluation method of F statistics weighting in ...

    African Journals Online (AJOL)

    In order to rapidly identify the source of water inrush in coal mine, and provide the theoretical basis for mine water damage prevention and control, fuzzy comprehensive evaluation model was established. The F statistics of water samples was normalized as the weight of fuzzy comprehensive evaluation for determining the ...

  4. Statistical methods for decision making in mine action

    DEFF Research Database (Denmark)

    Larsen, Jan

    The design and evaluation of mine clearance equipment – the problem of reliability * Detection probability – tossing a coin * Requirements in mine action * Detection probability and confidence in MA * Using statistics in area reduction Improving performance by information fusion and combination...

  5. Statistical methods of combining information: Applications to sensor data fusion

    Energy Technology Data Exchange (ETDEWEB)

    Burr, T.

    1996-12-31

    This paper reviews some statistical approaches to combining information from multiple sources. Promising new approaches will be described, and potential applications to combining not-so-different data sources such as sensor data will be discussed. Experiences with one real data set are described.

  6. An Introduction to Modern Statistical Methods in HCI

    NARCIS (Netherlands)

    Robertson, Judy; Kaptein, Maurits; Robertson, J; Kaptein, M

    2016-01-01

    This chapter explains why we think statistical methodology matters so much to the HCI community and why we should attempt to improve it. It introduces some flaws in the well-accepted methodology of Null Hypothesis Significance Testing and briefly introduces some alternatives. Throughout the book we

  7. An introduction to modern statistical methods in HCI

    NARCIS (Netherlands)

    Robertson, J.; Kaptein, M.C.; Robertson, J.; Kaptein, M.C.

    2016-01-01

    This chapter explains why we think statistical methodology matters so much to the HCI community and why we should attempt to improve it. It introduces some flaws in the well-accepted methodology of Null Hypothesis Significance Testing and briefly introduces some alternatives. Throughout the book we

  8. Effective viscosity of dispersions approached by a statistical continuum method

    NARCIS (Netherlands)

    Mellema, J.; Willemse, M.W.M.

    1983-01-01

    The problem of the determination of the effective viscosity of disperse systems (emulsions, suspensions) is considered. On the basis of the formal solution of the equations governing creeping flow in a statistically homogeneous dispersion, the effective viscosity is expressed in a series expansion

  9. Grassmann methods in lattice field theory and statistical mechanics

    International Nuclear Information System (INIS)

    Bilgici, E.; Gattringer, C.; Huber, P.

    2006-01-01

    In two dimensions models of loops can be represented as simple Grassmann integrals. In our work we explore the generalization of these techniques to lattice field theories and statistical mechanics systems in three and four dimensions. We discuss possible strategies and applications for representations of loop and surface models as Grassmann integrals. (author)

  10. Critical Realism and Statistical Methods--A Response to Nash

    Science.gov (United States)

    Scott, David

    2007-01-01

    This article offers a defence of critical realism in the face of objections Nash (2005) makes to it in a recent edition of this journal. It is argued that critical and scientific realisms are closely related and that both are opposed to statistical positivism. However, the suggestion is made that scientific realism retains (from statistical…

  11. Nonparametric Regression Estimation for Multivariate Null Recurrent Processes

    Directory of Open Access Journals (Sweden)

    Biqing Cai

    2015-04-01

    This paper discusses nonparametric kernel regression with the regressor being a \(d\)-dimensional \(\beta\)-null recurrent process in presence of conditional heteroscedasticity. We show that the mean function estimator is consistent with convergence rate \(\sqrt{n(T)h^{d}}\), where \(n(T)\) is the number of regenerations for a \(\beta\)-null recurrent process and the limiting distribution (with proper normalization) is normal. Furthermore, we show that the two-step estimator for the volatility function is consistent. The finite sample performance of the estimate is quite reasonable when the leave-one-out cross validation method is used for bandwidth selection. We apply the proposed method to study the relationship of Federal funds rate with 3-month and 5-year T-bill rates and discover the existence of nonlinearity of the relationship. Furthermore, the in-sample and out-of-sample performance of the nonparametric model is far better than the linear model.
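
    Kernel regression with leave-one-out cross validation for the bandwidth, as mentioned in this abstract, can be sketched in its simplest form. The sketch assumes a univariate regressor, independent observations and a Gaussian kernel (a Nadaraya-Watson estimator); it does not reproduce the paper's null recurrent setting or its two-step volatility estimator, and all names and the bandwidth grid are illustrative.

```python
import numpy as np

def nw_estimate(x_train, y_train, x_eval, h):
    """Nadaraya-Watson estimate of E[y | x] at x_eval with a Gaussian kernel."""
    u = (x_eval[:, None] - x_train[None, :]) / h
    w = np.exp(-0.5 * u**2)
    return (w @ y_train) / w.sum(axis=1)

def loo_cv_score(x, y, h):
    """Mean squared leave-one-out prediction error for bandwidth h."""
    u = (x[:, None] - x[None, :]) / h
    w = np.exp(-0.5 * u**2)
    np.fill_diagonal(w, 0.0)  # each point is predicted without itself
    pred = (w @ y) / w.sum(axis=1)
    return np.mean((y - pred) ** 2)

def select_bandwidth(x, y, grid):
    """Pick the bandwidth in `grid` with the smallest leave-one-out error."""
    scores = [loo_cv_score(x, y, h) for h in grid]
    return grid[int(np.argmin(scores))]

# Toy usage on simulated data with a nonlinear mean function.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=200)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=200)
grid = np.linspace(0.02, 0.5, 25)
h_best = select_bandwidth(x, y, grid)
fit = nw_estimate(x, y, np.array([0.25]), h_best)
```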

  12. Non-parametric estimation of the individual's utility map

    OpenAIRE

    Noguchi, Takao; Sanborn, Adam N.; Stewart, Neil

    2013-01-01

    Models of risky choice have attracted much attention in behavioural economics. Previous research has repeatedly demonstrated that individuals' choices are not well explained by expected utility theory, and a number of alternative models have been examined using carefully selected sets of choice alternatives. The model performance however, can depend on which choice alternatives are being tested. Here we develop a non-parametric method for estimating the utility map over the wide range of choi...

  13. Statistical methods for data analysis in particle physics

    CERN Document Server

    Lista, Luca

    2017-01-01

    This concise set of course-based notes provides the reader with the main concepts and tools needed to perform statistical analyses of experimental data, in particular in the field of high-energy physics (HEP). First, the book provides an introduction to probability theory and basic statistics, mainly intended as a refresher from readers’ advanced undergraduate studies, but also to help them clearly distinguish between the Frequentist and Bayesian approaches and interpretations in subsequent applications. More advanced concepts and applications are gradually introduced, culminating in the chapter on both discoveries and upper limits, as many applications in HEP concern hypothesis testing, where the main goal is often to provide better and better limits so as to eventually be able to distinguish between competing hypotheses, or to rule out some of them altogether. Many worked-out examples will help newcomers to the field and graduate students alike understand the pitfalls involved in applying theoretical co...

  14. Reactor noise analysis by statistical pattern recognition methods

    International Nuclear Information System (INIS)

    Howington, L.C.; Gonzalez, R.C.

    1976-01-01

    A multivariate statistical pattern recognition system for reactor noise analysis is presented. The basis of the system is a transformation for decoupling correlated variables and algorithms for inferring probability density functions. The system is adaptable to a variety of statistical properties of the data, and it has learning, tracking, updating, and data compacting capabilities. System design emphasizes control of the false-alarm rate. Its abilities to learn normal patterns, to recognize deviations from these patterns, and to reduce the dimensionality of data with minimum error were evaluated by experiments at the Oak Ridge National Laboratory (ORNL) High-Flux Isotope Reactor. Power perturbations of less than 0.1 percent of the mean value in selected frequency ranges were detected by the pattern recognition system

  15. Statistical methods for data analysis in particle physics

    CERN Document Server

    AUTHOR|(CDS)2070643

    2015-01-01

    This concise set of course-based notes provides the reader with the main concepts and tools to perform statistical analysis of experimental data, in particular in the field of high-energy physics (HEP). First, an introduction to probability theory and basic statistics is given, mainly as reminder from advanced undergraduate studies, yet also in view to clearly distinguish the Frequentist versus Bayesian approaches and interpretations in subsequent applications. More advanced concepts and applications are gradually introduced, culminating in the chapter on upper limits as many applications in HEP concern hypothesis testing, where often the main goal is to provide better and better limits so as to be able to distinguish eventually between competing hypotheses or to rule out some of them altogether. Many worked examples will help newcomers to the field and graduate students to understand the pitfalls in applying theoretical concepts to actual data

  16. Statistics and finance an introduction

    CERN Document Server

    Ruppert, David

    2004-01-01

    This textbook emphasizes the applications of statistics and probability to finance. Students are assumed to have had a prior course in statistics, but no background in finance or economics. The basics of probability and statistics are reviewed and more advanced topics in statistics, such as regression, ARMA and GARCH models, the bootstrap, and nonparametric regression using splines, are introduced as needed. The book covers the classical methods of finance such as portfolio theory, CAPM, and the Black-Scholes formula, and it introduces the somewhat newer area of behavioral finance. Applications and use of MATLAB and SAS software are stressed. The book will serve as a text in courses aimed at advanced undergraduates and masters students in statistics, engineering, and applied mathematics as well as quantitatively oriented MBA students. Those in the finance industry wishing to know more statistics could also use it for self-study. David Ruppert is the Andrew Schultz, Jr. Professor of Engineering, School of Oper...

  17. Modern applied U-statistics

    CERN Document Server

    Kowalski, Jeanne

    2008-01-01

    A timely and applied approach to the newly discovered methods and applications of U-statistics. Built on years of collaborative research and academic experience, Modern Applied U-Statistics successfully presents a thorough introduction to the theory of U-statistics using in-depth examples and applications that address contemporary areas of study including biomedical and psychosocial research. Utilizing a "learn by example" approach, this book provides an accessible, yet in-depth, treatment of U-statistics, as well as addresses key concepts in asymptotic theory by integrating translational and cross-disciplinary research. The authors begin with an introduction of the essential and theoretical foundations of U-statistics such as the notion of convergence in probability and distribution, basic convergence results, stochastic Os, inference theory, generalized estimating equations, as well as the definition and asymptotic properties of U-statistics. With an emphasis on nonparametric applications when and where applic...

  18. METHODOLOGICAL PRINCIPLES AND METHODS OF TERMS OF TRADE STATISTICAL EVALUATION

    Directory of Open Access Journals (Sweden)

    N. Kovtun

    2014-09-01

    Full Text Available The paper studies the methodological principles and guidance of the statistical evaluation of terms of trade for the United Nations classification model, the Harmonized Commodity Description and Coding System (HS). The proposed three-stage model of index analysis and estimation of terms of trade is implemented in practice for Ukraine's commodity-members for the period 2011-2012.

  19. How to analyze germination of species with empty seeds using contemporary statistical methods?

    Directory of Open Access Journals (Sweden)

    Denise Garcia de Santana

    2018-02-01

    Full Text Available Statistical analysis is considered an important tool for scientific studies, including those on seeds. However, seed scientists and statisticians often disagree on the nature of variables addressed in germination experiments. Statisticians consider the number of germinated seeds to be a binomially distributed variable, whereas seed scientists convert it into a percentage and often analyze it as a normally distributed variable. The requirement for normal adjustment restricts the models of analysis of variance that can be used. Lack of fit requires nonparametric tests, but they are known for their inferential problems. Generalized Linear Models (GLM) can provide better fit to germination variables for any species, including Lychnophora ericoides Mart., because they allow wider probability distributions with fewer requirements. Here we suggest the use of relative germination besides absolute germination for species with seed development problems, such as L. ericoides and others from the campos rupestres. This paper introduces the most current statistical advancements and increases the possibilities for their application in seed science research.

  20. Statistical signal processing for gamma spectrometry: application for a pileup correction method

    International Nuclear Information System (INIS)

    Trigano, T.

    2005-12-01

    The main objective of gamma spectrometry is to characterize the radioactive elements of an unknown source by studying the energy of the emitted photons. When a photon interacts with a detector, its energy is converted into an electrical pulse. The histogram obtained by collecting the energies can be used to identify radioactive elements and measure their activity. However, at high counting rates, perturbations which are due to the stochastic aspect of the temporal signal can cripple the identification of the radioactive elements. More specifically, since the detector has a finite resolution, close arrival times of photons, which can be modeled as a homogeneous Poisson process, cause pile-ups of individual pulses. This phenomenon distorts energy spectra by introducing multiple fake spikes and artificially prolonging the Compton continuum, which can mask spikes of low intensity. The objective of this thesis is to correct the distortion caused by the pile-up phenomenon in the energy spectra. Since the shape of photonic pulses depends on many physical parameters, we consider this problem in a nonparametric framework. By introducing an adapted model based on two marked point processes, we establish a nonlinear relation between the probability measure associated with the observations and the probability density function we wish to estimate. This relation is derived both for continuous and for discrete time signals, and therefore can be used on a large set of detectors and from an analog or digital point of view. It also provides a framework for this problem, which can be considered as a problem of nonlinear density deconvolution and nonparametric density estimation from indirect measurements. Using these considerations, we propose an estimator obtained by direct inversion. We show that this estimator is consistent and almost achieves the usual rate of convergence obtained in classical nonparametric density estimation in the L2 sense.
We have applied our method to a set of

  1. Spatial Analysis Along Networks Statistical and Computational Methods

    CERN Document Server

    Okabe, Atsuyuki

    2012-01-01

    In the real world, there are numerous and various events that occur on and alongside networks, including the occurrence of traffic accidents on highways, the location of stores alongside roads, the incidence of crime on streets and the contamination along rivers. In order to carry out analyses of those events, the researcher needs to be familiar with a range of specific techniques. Spatial Analysis Along Networks provides a practical guide to the necessary statistical techniques and their computational implementation. Each chapter illustrates a specific technique, from Stochastic Point Process

  2. Statistical methods for segmentation and classification of images

    DEFF Research Database (Denmark)

    Rosholm, Anders

    1997-01-01

    The central matter of the present thesis is Bayesian statistical inference applied to classification of images. An initial review of Markov Random Fields relates to the modeling aspect of the indicated main subject. In that connection, emphasis is put on the relatively unknown sub-class of Pickard...... with a Pickard Random Field modeling of a considered (categorical) image phenomenon. An extension of the fast PRF based classification technique is presented. The modification introduces auto-correlation into the model of an involved noise process, which previously has been assumed independent. The suitability...... of the extended model is documented by tests on controlled image data containing auto-correlated noise....

  3. Method of statistical estimation of temperature minimums in binary systems

    International Nuclear Information System (INIS)

    Mireev, V.A.; Safonov, V.V.

    1985-01-01

    On the basis of statistical processing of literature data, a technique is developed for evaluating temperature minima on liquidus curves in binary systems with a common ion, taking chloride systems as an example. The systems are formed by 48 chlorides of 45 chemical elements including alkali, alkaline earth, rare earth and transition metals as well as Cd, In, Th. It is shown that the calculation error in determining minimum melting points depends on the topology of the phase diagram. A comparison of calculated and experimental data for several previously unstudied systems is given

  4. Trends in statistical methods in articles published in Archives of Plastic Surgery between 2012 and 2017.

    Science.gov (United States)

    Han, Kyunghwa; Jung, Inkyung

    2018-05-01

    This review article presents an assessment of trends in statistical methods and an evaluation of their appropriateness in articles published in the Archives of Plastic Surgery (APS) from 2012 to 2017. We reviewed 388 original articles published in APS between 2012 and 2017. We categorized the articles that used statistical methods according to the type of statistical method, the number of statistical methods, and the type of statistical software used. We checked whether there were errors in the description of statistical methods and results. A total of 230 articles (59.3%) published in APS between 2012 and 2017 used one or more statistical methods. Within these articles, there were 261 applications of statistical methods with continuous or ordinal outcomes, and 139 applications of statistical methods with categorical outcomes. The Pearson chi-square test (17.4%) and the Mann-Whitney U test (14.4%) were the most frequently used methods. Errors in describing statistical methods and results were found in 133 of the 230 articles (57.8%). Inadequate description of P-values was the most common error (39.1%). Among the 230 articles that used statistical methods, 71.7% provided details about the statistical software programs used for the analyses. SPSS was predominantly used in the articles that presented statistical analyses. We found that the use of statistical methods in APS has increased over the last 6 years. It seems that researchers have been paying more attention to the proper use of statistics in recent years. It is expected that these positive trends will continue in APS.

  5. SOME ASPECTS OF THE USE OF MATHEMATICAL-STATISTICAL METHODS IN THE ANALYSIS OF SOCIO-HUMANISTIC TEXTS Humanities and social text, mathematics, method, statistics, probability

    Directory of Open Access Journals (Sweden)

    Zaira M Alieva

    2016-01-01

    Full Text Available The article analyzes the application of mathematical and statistical methods in the analysis of socio-humanistic texts. It describes the essence of mathematical and statistical methods and presents examples of their use in the study of humanities and social phenomena. It considers the key issues faced by the expert in applying mathematical-statistical methods in the socio-humanitarian sphere, including the persistent opposition between the socio-humanitarian sciences and mathematics; the difficulty of isolating the object that is the bearer of the problem; and the need to use a probabilistic approach. Conclusions are drawn from the results of the study.

  6. A statistical comparison of accelerated concrete testing methods

    OpenAIRE

    Denny Meyer

    1997-01-01

    Accelerated curing results, obtained after only 24 hours, are used to predict the 28 day strength of concrete. Various accelerated curing methods are available. Two of these methods are compared in relation to the accuracy of their predictions and the stability of the relationship between their 24 hour and 28 day concrete strength. The results suggest that Warm Water accelerated curing is preferable to Hot Water accelerated curing of concrete. In addition, some other methods for improving the...

  7. Evaluation of local corrosion life by statistical method

    International Nuclear Information System (INIS)

    Kato, Shunji; Kurosawa, Tatsuo; Takaku, Hiroshi; Kusanagi, Hideo; Hirano, Hideo; Kimura, Hideo; Hide, Koichiro; Kawasaki, Masayuki

    1987-01-01

    In this paper, with the aim of extending the life of light water reactors, we examined the evaluation of local corrosion by a statistical method and its application to nuclear power plant components. There are many examples in which the maximum cracking depth of local corrosion has been evaluated using the doubly exponential distribution, and this evaluation method is established. However, the evaluation of the service lives of construction materials by statistical methods has not yet been established. To establish service life evaluation by statistical methods, local corrosion data must be collected and analyzed. (author)

  8. Nonparametric statistics: a step-by-step approach

    National Research Council Canada - National Science Library

    Corder, Gregory W; Foreman, Dale I

    2014-01-01

    .... The book continues to follow the same format in all chapters to aid in reader comprehension, and each chapter begins with a general introduction and a list of the chapter's main learning objectives...

  9. Statistical methods for analysing responses of wildlife to human disturbance.

    Science.gov (United States)

    Haiganoush K. Preisler; Alan A. Ager; Michael J. Wisdom

    2006-01-01

    1. Off-road recreation is increasing rapidly in many areas of the world, and effects on wildlife can be highly detrimental. Consequently, we have developed methods for studying wildlife responses to off-road recreation with the use of new technologies that allow frequent and accurate monitoring of human-wildlife interactions. To illustrate these methods, we studied the...

  10. Introducing Students to the Application of Statistics and Investigative Methods in Political Science

    Science.gov (United States)

    Wells, Dominic D.; Nemire, Nathan A.

    2017-01-01

    This exercise introduces students to the application of statistics and its investigative methods in political science. It helps students gain a better understanding and a greater appreciation of statistics through a real world application.

  11. Use of Mathematical Methods of Statistics for Analyzing Engine Characteristics

    Directory of Open Access Journals (Sweden)

    Aivaras Jasilionis

    2012-11-01

    Full Text Available For the development of new models, automobile manufacturers are trying to come up with optimal software for engine control in all movement modes. However, in this case, a vehicle cannot reach outstanding characteristics in any of them. This is the main reason why modifications in engine control software used for adapting the vehicle to the driver's needs are becoming more and more popular. The article presents a short analysis of development trends in engine control software. Also, models of mathematical statistics for engine power and torque growth are created. The introduced models give an opportunity to predict the probabilities of engine power or torque growth after individual reprogramming of engine control software.

  12. Statistical Methods and Tools for Hanford Staged Feed Tank Sampling

    Energy Technology Data Exchange (ETDEWEB)

    Fountain, Matthew S. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Brigantic, Robert T. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Peterson, Reid A. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States)

    2013-10-01

    This report summarizes work conducted by Pacific Northwest National Laboratory to technically evaluate the current approach to staged feed sampling of high-level waste (HLW) sludge to meet waste acceptance criteria (WAC) for transfer from tank farms to the Hanford Waste Treatment and Immobilization Plant (WTP). The current sampling and analysis approach is detailed in the document titled Initial Data Quality Objectives for WTP Feed Acceptance Criteria, 24590-WTP-RPT-MGT-11-014, Revision 0 (Arakali et al. 2011). The goal of this current work is to evaluate and provide recommendations to support a defensible, technical and statistical basis for the staged feed sampling approach that meets WAC data quality objectives (DQOs).

  13. A new quantum statistical evaluation method for time correlation functions

    International Nuclear Information System (INIS)

    Loss, D.; Schoeller, H.

    1989-01-01

    Considering a system of N identical interacting particles, which obey Fermi-Dirac or Bose-Einstein statistics, the authors derive new formulas for correlation functions of the type $C(t) = \langle \sum_{i=1}^{N} A_i(t) \sum_{j=1}^{N} B_j \rangle$ (where $B_j$ is diagonal in the free-particle states) in the thermodynamic limit. Thereby they apply and extend a superoperator formalism, recently developed for the derivation of long-time tails in semiclassical systems. As an illustrative application, the Boltzmann equation value of the time-integrated correlation function C(t) is derived in a straight-forward manner. Due to exchange effects, the obtained t-matrix and the resulting scattering cross section, which occurs in the Boltzmann collision operator, are now functionals of the Fermi-Dirac or Bose-Einstein distribution

  14. A limited area model intercomparison on the 'Montserrat-2000' flash-flood event using statistical and deterministic methods

    Directory of Open Access Journals (Sweden)

    S. Mariani

    2005-01-01

    Full Text Available In the scope of the European project Hydroptimet, INTERREG IIIB-MEDOCC programme, a limited area model (LAM) intercomparison of intense events that caused extensive damage to people and territory is performed. As the comparison is limited to single case studies, the work is not meant to provide a measure of the different models' skill, but to identify the key model factors useful to give a good forecast of this kind of meteorological phenomenon. This work focuses on the Spanish flash-flood event, also known as the 'Montserrat-2000' event. The study is performed using forecast data from seven operational LAMs, placed at partners' disposal via the Hydroptimet ftp site, and observed data from the Catalonia rain gauge network. To improve the event analysis, satellite rainfall estimates have also been considered. For statistical evaluation of quantitative precipitation forecasts (QPFs), several non-parametric skill scores based on contingency tables have been used. Furthermore, for each model run it has been possible to identify Catalonia regions affected by misses and false alarms using contingency table elements. Moreover, the standard 'eyeball' analysis of forecast and observed precipitation fields has been supported by the use of a state-of-the-art diagnostic method, the contiguous rain area (CRA) analysis. This method allows quantification of the spatial shift forecast error and identification of the error sources that affected each model's forecasts. High-resolution modelling and domain size seem to have a key role in providing a skillful forecast. Further work is needed to support this statement, including verification using a wider observational data set.
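
    The contingency-table scores mentioned in this abstract can be illustrated with a minimal Python sketch. The abstract does not say which scores were used, so the ones below (probability of detection, false alarm ratio, critical success index, frequency bias, Hanssen-Kuipers score) are an assumption, as are the function names, the 10 mm threshold and the rainfall values, which are made up for illustration.

```python
def contingency_table(forecast, observed, threshold):
    """Count hits, false alarms, misses and correct negatives for one rain threshold."""
    hits = false_alarms = misses = correct_negatives = 0
    for f, o in zip(forecast, observed):
        forecast_yes = f >= threshold
        observed_yes = o >= threshold
        if forecast_yes and observed_yes:
            hits += 1
        elif forecast_yes:
            false_alarms += 1
        elif observed_yes:
            misses += 1
        else:
            correct_negatives += 1
    return hits, false_alarms, misses, correct_negatives


def skill_scores(hits, false_alarms, misses, correct_negatives):
    """Compute common categorical scores; undefined ratios are returned as NaN."""
    def ratio(num, den):
        return num / den if den else float("nan")

    pod = ratio(hits, hits + misses)                              # probability of detection
    far = ratio(false_alarms, hits + false_alarms)                # false alarm ratio
    pofd = ratio(false_alarms, false_alarms + correct_negatives)  # probability of false detection
    return {
        "POD": pod,
        "FAR": far,
        "CSI": ratio(hits, hits + misses + false_alarms),   # critical success index
        "BIAS": ratio(hits + false_alarms, hits + misses),  # frequency bias
        "HK": pod - pofd,                                   # Hanssen-Kuipers score
    }


# Hypothetical 24 h rainfall totals (mm) at six gauges: model forecast vs. observation.
forecast_mm = [12.0, 0.0, 25.0, 3.0, 40.0, 1.0]
observed_mm = [15.0, 2.0, 5.0, 30.0, 50.0, 0.0]
table = contingency_table(forecast_mm, observed_mm, threshold=10.0)
print(table, skill_scores(*table))
```

    The gauges counted as misses or false alarms are the "contingency table elements" that, per the abstract, allow affected regions to be located on a map.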

  15. Statistical Bayesian method for reliability evaluation based on ADT data

    Science.gov (United States)

    Lu, Dawei; Wang, Lizhi; Sun, Yusheng; Wang, Xiaohong

    2018-05-01

    Accelerated degradation testing (ADT) is frequently conducted in the laboratory to predict the products’ reliability under normal operating conditions. Two kinds of methods, degradation path models and stochastic process models, are utilized to analyze degradation data, and the latter is the more popular. However, some limitations, such as an imprecise solution process and imprecise estimates of the degradation ratio, still exist, which may affect the accuracy of the acceleration model and the extrapolated value. Moreover, the existing solution to this problem, the Bayesian method, loses key information when unifying the degradation data. In this paper, a new data processing and parameter inference method based on the Bayesian method is proposed to handle degradation data and solve the problems above. First, a Wiener process and an acceleration model are chosen. Second, the initial values of the degradation model and the parameters of the prior and posterior distributions under each level are calculated by updating and iterating the estimated values. Third, the lifetime and reliability values are estimated on the basis of the estimated parameters. Finally, a case study is provided to demonstrate the validity of the proposed method. The results illustrate that the proposed method is quite effective and accurate in estimating the lifetime and reliability of a product.

  16. A simple statistical method for catch comparison studies

    DEFF Research Database (Denmark)

    Holst, René; Revill, Andrew

    2009-01-01

    For analysing catch comparison data, we propose a simple method based on Generalised Linear Mixed Models (GLMM) and use polynomial approximations to fit the proportions caught in the test codend. The method provides comparisons of fish catch at length by the two gears through a continuous curve...... with a realistic confidence band. We demonstrate the versatility of this method, on field data obtained from the first known testing in European waters of the Rhode Island (USA) 'Eliminator' trawl. These data are interesting as they include a range of species with different selective patterns. Crown Copyright (C...

  17. A statistical comparison of accelerated concrete testing methods

    Directory of Open Access Journals (Sweden)

    Denny Meyer

    1997-01-01

    Full Text Available Accelerated curing results, obtained after only 24 hours, are used to predict the 28 day strength of concrete. Various accelerated curing methods are available. Two of these methods are compared in relation to the accuracy of their predictions and the stability of the relationship between their 24 hour and 28 day concrete strength. The results suggest that Warm Water accelerated curing is preferable to Hot Water accelerated curing of concrete. In addition, some other methods for improving the accuracy of predictions of 28 day strengths are suggested. In particular the frequency at which it is necessary to recalibrate the prediction equation is considered.

  18. Rationalizing method of replacement intervals by using Bayesian statistics

    International Nuclear Information System (INIS)

    Kasai, Masao; Notoya, Junichi; Kusakari, Yoshiyuki

    2007-01-01

    This study presents formulations for rationalizing the replacement intervals of equipment and/or parts, taking into account the probability density functions (PDFs) of the parameters of failure distribution functions (FDFs), and compares the intervals optimized by our formulations with those from conventional formulations, which use only representative values of the FDF parameters instead of these PDFs. The failure data are generated by Monte Carlo simulation because real failure data were not available to us. The PDFs of the FDF parameters are obtained by the Bayesian method, and the representative values are obtained by likelihood estimation and the Bayesian method. We found that the method using PDFs obtained by the Bayesian method yields longer replacement intervals than the one using representative values of the parameters. (author)

  19. Comparative Analysis of Kernel Methods for Statistical Shape Learning

    National Research Council Canada - National Science Library

    Rathi, Yogesh; Dambreville, Samuel; Tannenbaum, Allen

    2006-01-01

    .... In this work, we perform a comparative analysis of shape learning techniques such as linear PCA, kernel PCA, locally linear embedding and propose a new method, kernelized locally linear embedding...

  20. Statistical Genetics Methods for Localizing Multiple Breast Cancer Genes

    National Research Council Canada - National Science Library

    Ott, Jurg

    1998-01-01

    .... For a number of variables measured on a trait, a method, principal components of heritability, was developed that combines these variables in such a way that the resulting linear combination has highest heritability...

  1. Statistics in science the foundations of statistical methods in biology, physics and economics

    CERN Document Server

    Costantini, Domenico

    1990-01-01

    An inference may be defined as a passage of thought according to some method. In the theory of knowledge it is customary to distinguish deductive and non-deductive inferences. Deductive inferences are truth preserving, that is, the truth of the premises is preserved in the conclusion. As a result, the conclusion of a deductive inference is already 'contained' in the premises, although we may not know this fact until the inference is performed. Standard examples of deductive inferences are taken from logic and mathematics. Non-deductive inferences need not preserve truth, that is, 'thought may pass' from true premises to false conclusions. Such inferences can be expansive, or, ampliative in the sense that the performance of such inferences actually increases our putative knowledge. Standard non-deductive inferences do not really exist, but one may think of elementary inductive inferences in which conclusions regarding the future are drawn from knowledge of the past. Since the body of scientific knowledge i...

  2. Groundwater vulnerability assessment: from overlay methods to statistical methods in the Lombardy Plain area

    Directory of Open Access Journals (Sweden)

    Stefania Stevenazzi

    2017-06-01

    Full Text Available Groundwater is among the most important freshwater resources. Worldwide, aquifers are experiencing an increasing threat of pollution from urbanization, industrial development, agricultural activities and mining enterprise. Thus, practical actions, strategies and solutions to protect groundwater from these anthropogenic sources are widely required. The most efficient tool for supporting land use planning while protecting groundwater from contamination is groundwater vulnerability assessment. Over the years, several methods for assessing groundwater vulnerability have been developed: overlay and index methods, statistical and process-based methods. All methods are means to synthesize complex hydrogeological information into a unique document, a groundwater vulnerability map, usable by planners, decision and policy makers, geoscientists and the public. Although it is not possible to identify an approach which could be the best one for all situations, the final product should always be scientifically defensible, meaningful and reliable. Nevertheless, various methods may produce very different results at any given site. Thus, reasons for similarities and differences need to be deeply investigated. This study demonstrates the reliability and flexibility of a spatial statistical method to assess groundwater vulnerability to contamination at a regional scale. The Lombardy Plain case study is particularly interesting for its long history of groundwater monitoring (quality and quantity), availability of hydrogeological data, and combined presence of various anthropogenic sources of contamination. Recent updates of the regional water protection plan have raised the necessity of realizing more flexible, reliable and accurate groundwater vulnerability maps. A comparison of groundwater vulnerability maps obtained through different approaches and developed in a time span of several years has demonstrated the relevance of the

  3. A non-parametric meta-analysis approach for combining independent microarray datasets: application using two microarray datasets pertaining to chronic allograft nephropathy

    Directory of Open Access Journals (Sweden)

    Archer Kellie J

    2008-02-01

    Full Text Available Background: With the popularity of DNA microarray technology, multiple groups of researchers have studied the gene expression of similar biological conditions. Different methods have been developed to integrate the results from various microarray studies, though most of them rely on distributional assumptions, such as the t-statistic based, mixed-effects model, or Bayesian model methods. However, often the sample size for each individual microarray experiment is small. Therefore, in this paper we present a non-parametric meta-analysis approach for combining data from independent microarray studies, and illustrate its application on two independent Affymetrix GeneChip studies that compared the gene expression of biopsies from kidney transplant recipients with chronic allograft nephropathy (CAN) to those with normal functioning allograft. Results: The simulation study comparing the non-parametric meta-analysis approach to a commonly used t-statistic based approach shows that the non-parametric approach has better sensitivity and specificity. For the application on the two CAN studies, we identified 309 distinct genes that expressed differently in CAN. By applying Fisher's exact test to identify enriched KEGG pathways among those genes called differentially expressed, we found 6 KEGG pathways to be over-represented among the identified genes. We used the expression measurements of the identified genes as predictors to predict the class labels for 6 additional biopsy samples, and the predicted results all conformed to their pathologist diagnosed class labels. Conclusion: We present a new approach for combining data from multiple independent microarray studies. This approach is non-parametric and does not rely on any distributional assumptions. The rationale behind the approach is logically intuitive and can be easily understood by researchers not having advanced training in statistics. Some of the identified genes and pathways have been
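
    The pathway-enrichment step described here can be sketched as a one-sided Fisher's exact test, which is equivalent to an upper tail of the hypergeometric distribution. This is a generic illustration and not the authors' code: the function name is hypothetical, and apart from the 309 differentially expressed genes reported in the abstract, the gene counts below are invented.

```python
from math import comb


def enrichment_pvalue(overlap, n_selected, pathway_size, n_total):
    """One-sided Fisher's exact test: P(X >= overlap) when n_selected genes are
    drawn without replacement from n_total genes, pathway_size of which belong
    to the pathway (hypergeometric upper tail)."""
    max_overlap = min(n_selected, pathway_size)
    total_ways = comb(n_total, n_selected)
    tail_ways = sum(
        comb(pathway_size, i) * comb(n_total - pathway_size, n_selected - i)
        for i in range(overlap, max_overlap + 1)
    )
    return tail_ways / total_ways


# Hypothetical numbers: 10000 genes on the array, 309 called differentially
# expressed (from the abstract), a pathway of 50 genes, 8 of them in the list.
p = enrichment_pvalue(overlap=8, n_selected=309, pathway_size=50, n_total=10000)
print(f"enrichment p-value: {p:.3g}")
```

    When many pathways are tested, as with KEGG, the resulting p-values would normally be adjusted for multiple comparisons; the abstract does not state which adjustment, if any, was applied.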

  4. Instrumental and statistical methods for the comparison of class evidence

    Science.gov (United States)

    Liszewski, Elisa Anne

    Trace evidence is a major field within forensic science. Association of trace evidence samples can be problematic due to sample heterogeneity and a lack of quantitative criteria for comparing spectra or chromatograms. The aim of this study is to evaluate different types of instrumentation for their ability to discriminate among samples of various types of trace evidence. Chemometric analysis, including techniques such as Agglomerative Hierarchical Clustering, Principal Components Analysis, and Discriminant Analysis, was employed to evaluate instrumental data. First, automotive clear coats were analyzed by using microspectrophotometry to collect UV absorption data. In total, 71 samples were analyzed with classification accuracy of 91.61%. An external validation was performed, resulting in a prediction accuracy of 81.11%. Next, fiber dyes were analyzed using UV-Visible microspectrophotometry. While several physical characteristics of cotton fiber can be identified and compared, fiber color is considered to be an excellent source of variation, and thus was examined in this study. Twelve dyes were employed, some being visually indistinguishable. Several different analyses and comparisons were done, including an inter-laboratory comparison and external validations. Lastly, common plastic samples and other polymers were analyzed using pyrolysis-gas chromatography/mass spectrometry, and their pyrolysis products were then analyzed using multivariate statistics. The classification accuracy varied dependent upon the number of classes chosen, but the plastics were grouped based on composition. The polymers were used as an external validation and misclassifications occurred with chlorinated samples all being placed into the category containing PVC.

  5. Non-Statistical Methods of Analysing of Bankruptcy Risk

    Directory of Open Access Journals (Sweden)

    Pisula Tomasz

    2015-06-01

    The article focuses on assessing the effectiveness of a non-statistical approach to bankruptcy modelling in enterprises operating in the logistics sector. In order to describe the issue more comprehensively, the aforementioned prediction of the possible negative results of business operations was carried out for companies functioning in the Polish region of Podkarpacie, and in Slovakia. The bankruptcy predictors selected for the assessment of companies operating in the logistics sector included 28 financial indicators characterizing these enterprises in terms of their financial standing and management effectiveness. The purpose of the study was to identify factors (models) describing the bankruptcy risk in enterprises in the context of their forecasting effectiveness in a one-year and two-year time horizon. In order to assess their practical applicability the models were carefully analysed and validated. The usefulness of the models was assessed in terms of their classification properties, their capacity to accurately identify both enterprises at risk of bankruptcy and healthy companies, and their proper calibration to the data from training sample sets.

  6. Comparison of Statistical Methods for Detector Testing Programs

    Energy Technology Data Exchange (ETDEWEB)

    Rennie, John Alan [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Abhold, Mark [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)

    2016-10-14

    A typical goal for any detector testing program is to ascertain not only the performance of the detector systems under test, but also the confidence that systems accepted using that testing program’s acceptance criteria will exceed a minimum acceptable performance (which is usually expressed as the minimum acceptable success probability, p). A similar problem often arises in statistics, where we would like to ascertain the fraction, p, of a population of items that possess a property that may take one of two possible values. Typically, the problem is approached by drawing a fixed sample of size n, with the number of items out of n that possess the desired property, x, being termed successes. The sample mean gives an estimate of the population mean p ≈ x/n, although usually it is desirable to accompany such an estimate with a statement concerning the range within which p may fall and the confidence associated with that range. Procedures for establishing such ranges and confidence limits are described in detail by Clopper, Brown, and Agresti for two-sided symmetric confidence intervals.
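
    As a rough illustration of the interval estimates discussed in this abstract, the sketch below computes the exact (Clopper-Pearson) two-sided interval for a success probability p from x successes in n trials. It is a minimal standard-library version that finds the bounds by bisection on the binomial distribution function; the function names and the sample counts are assumptions made for this example, not taken from the report.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _root_increasing(g, iters=200):
    """Bisection on [0, 1] for a function g that increases through zero."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(x, n, conf=0.95):
    """Two-sided exact interval for p given x successes in n trials."""
    alpha = 1 - conf
    # Lower bound: the p at which P(X >= x) equals alpha / 2.
    lower = 0.0 if x == 0 else _root_increasing(
        lambda p: (1 - binom_cdf(x - 1, n, p)) - alpha / 2
    )
    # Upper bound: the p at which P(X <= x) equals alpha / 2.
    upper = 1.0 if x == n else _root_increasing(
        lambda p: alpha / 2 - binom_cdf(x, n, p)
    )
    return lower, upper


# Hypothetical test campaign: 48 detections in 50 trials.
low, high = clopper_pearson(48, 50)
print(f"point estimate {48 / 50:.3f}, 95% interval ({low:.3f}, {high:.3f})")
```

    The exact interval is conservative; approximate alternatives such as the Wilson or Agresti-Coull intervals are shorter on average, and which one suits a given acceptance test is a design decision this sketch does not settle.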

  7. Data Analysis & Statistical Methods for Command File Errors

    Science.gov (United States)

    Meshkat, Leila; Waggoner, Bruce; Bryant, Larry

    2014-01-01

    This paper explains current work on modeling for managing the risk of command file errors. It is focused on analyzing actual data from a JPL spaceflight mission to build models for evaluating and predicting error rates as a function of several key variables. We constructed a rich dataset by considering the number of errors, the number of files radiated, including the number of commands and blocks in each file, as well as subjective estimates of workload and operational novelty. We have assessed these data using different curve fitting and distribution fitting techniques, such as multiple regression analysis and maximum likelihood estimation, to see how much of the variability in the error rates can be explained with these. We have also used goodness of fit testing strategies and principal component analysis to further assess our data. Finally, we constructed a model of expected error rates based on what these statistics bore out as critical drivers of the error rate. This model allows project management to evaluate the error rate against a theoretically expected rate as well as anticipate future error rates.

  8. Modeling critical episodes of air pollution by PM10 in Santiago, Chile: Comparison of the predictive efficiency of parametric and non-parametric statistical models

    Directory of Open Access Journals (Sweden)

    Sergio A. Alvarado

    2010-12-01

    Objective: To evaluate the predictive efficiency of two statistical models (one parametric and the other non-parametric) to predict critical episodes of air pollution exceeding daily air quality standards in Santiago, Chile by using the next day PM10 maximum 24h value. Accurate prediction of such episodes would allow restrictive measures to be applied by health authorities to reduce their seriousness and protect the community's health. Methods: We used the PM10 concentrations registered by a station of the MACAM-2 Air Quality Monitoring Network (152 daily observations of 14 variables) and meteorological information gathered from 2001 to 2004. To construct predictive models, we fitted a parametric Gamma model using STATA v11 software and a non-parametric MARS model by using a demo version of the MARS v 2.0 software distributed by Salford Systems. Results: Both modeling methods show a high correlation between observed and predicted values. The Gamma models give better hit rates than MARS for PM10 concentrations with values ...

  9. Statistically Efficient Methods for Pitch and DOA Estimation

    DEFF Research Database (Denmark)

    Jensen, Jesper Rindom; Christensen, Mads Græsbøll; Jensen, Søren Holdt

    2013-01-01

    Traditionally, direction-of-arrival (DOA) and pitch estimation of multichannel, periodic sources have been considered as two separate problems. Separate estimation may render the task of resolving sources with similar DOA or pitch impossible, and it may decrease the estimation accuracy. Therefore, it was recently considered to estimate the DOA and pitch jointly. In this paper, we propose two novel methods for DOA and pitch estimation. They both yield maximum-likelihood estimates in white Gaussian noise scenarios, where the SNR may be different across channels, as opposed to state-of-the-art methods...

  10. Statistical methods for mass spectrometry-based clinical proteomics

    NARCIS (Netherlands)

    Kakourou, A.

    2018-01-01

    The work presented in this thesis focuses on methods for the construction of diagnostic rules based on clinical mass spectrometry proteomic data. Mass spectrometry has become one of the key technologies for jointly measuring the expression of thousands of proteins in biological samples.

  11. Statistical comparison of excystation methods in Cryptosporidium parvum oocysts

    Czech Academy of Sciences Publication Activity Database

    Pecková, R.; Stuart, P. D.; Sak, Bohumil; Květoňová, Dana; Kváč, Martin; Foitová, I.

    2016-01-01

    Vol. 230, 30 October (2016), pp. 1-5. ISSN 0304-4017. R&D Projects: GA ČR(CZ) GAP505/11/1163. Institutional support: RVO:60077344. Keywords: Cryptosporidium parvum; excystation methods; in vitro cultivation; sodium hypochlorite; trypsin. Subject RIV: EG - Zoology. Impact factor: 2.356 (2016)

  12. Application of few-body methods to statistical mechanics

    International Nuclear Information System (INIS)

    Bolle, D.

    1981-01-01

    This paper reviews some of the methods to study the thermodynamic properties of a macroscopic system in terms of the scattering processes between the constituent particles in the system. In particular, we discuss the time delay approach to the virial expansion and the use of the arrangement channel quantum mechanics formulation in kinetic theory. (orig.)

  13. Oxygen Abundance Methods in SDSS: View from Modern Statistics ...

    Indian Academy of Sciences (India)

    Occam's razor is a principle attributed to the 14th century English logician and ... knowledge of a galaxy's metallicity in order to locate it on the appropriate branch of ... These methods try to balance the log likelihood term (lack of fit) with a.

  14. CAPABILITY ASSESSMENT OF MEASURING EQUIPMENT USING STATISTIC METHOD

    Directory of Open Access Journals (Sweden)

    Pavel POLÁK

    2014-10-01

    Capability assessment of the measurement device is one of the methods of process quality control. Only if the measurement device is capable can the capability of the measurement, and consequently of the production process, be assessed. This paper deals with the assessment of the capability of the measuring device using the indices Cg and Cgk.
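
    For orientation, the indices Cg and Cgk can be computed from repeated measurements of a reference part as sketched below. Definitions differ between company standards; this sketch assumes one widely used convention (20% of the tolerance compared against 6 standard deviations for Cg, and half of each for Cgk with the bias subtracted). The abstract does not say which convention the paper uses, so the defaults, the function name and the sample data are assumptions.

```python
from statistics import mean, stdev


def gauge_capability(measurements, reference, tolerance,
                     fraction=0.2, spread=6.0):
    """Return (Cg, Cgk) for repeated measurements of a reference part.

    fraction: share of the tolerance allotted to the gauge (assumed 20%).
    spread:   number of standard deviations covered (assumed 6).
    """
    x_bar = mean(measurements)
    s = stdev(measurements)          # sample standard deviation
    bias = abs(x_bar - reference)    # systematic offset from the reference
    cg = (fraction * tolerance) / (spread * s)
    cgk = (fraction / 2 * tolerance - bias) / (spread / 2 * s)
    return cg, cgk


# Made-up repeated readings of a 10.000 mm reference, tolerance width 0.5 mm.
readings = [10.001, 9.999, 10.002, 10.000, 9.998, 10.001, 10.000, 9.999]
cg, cgk = gauge_capability(readings, reference=10.000, tolerance=0.5)
print(f"Cg = {cg:.2f}, Cgk = {cgk:.2f}")
```

    Cg reflects repeatability only, while Cgk also penalizes bias, so Cgk never exceeds Cg under this convention.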

  15. Impact of statistical learning methods on the predictive power of multivariate normal tissue complication probability models

    NARCIS (Netherlands)

    Xu, Cheng-Jian; van der Schaaf, Arjen; Schilstra, Cornelis; Langendijk, Johannes A.; van t Veld, Aart A.

    2012-01-01

    PURPOSE: To study the impact of different statistical learning methods on the prediction performance of multivariate normal tissue complication probability (NTCP) models. METHODS AND MATERIALS: In this study, three learning methods, stepwise selection, least absolute shrinkage and selection operator

  16. Comparison between statistical and optimization methods in accessing unmixing of spectrally similar materials

    CSIR Research Space (South Africa)

    Debba, Pravesh

    2010-11-01

    This paper reports on the results from ordinary least squares and ridge regression as statistical methods, and compares them to numerical optimization methods such as the stochastic method for global optimization, simulated annealing, particle swarm...

  17. Statistical Analysis Methods for the fMRI Data

    Directory of Open Access Journals (Sweden)

    Huseyin Boyaci

    2011-08-01

    Functional magnetic resonance imaging (fMRI) is a safe and non-invasive way to assess brain functions by using signal changes associated with brain activity. The technique has become a ubiquitous tool in basic, clinical and cognitive neuroscience. This method can measure small metabolic changes that occur in the active part of the brain. We process the fMRI data to find the parts of the brain that are involved in a mechanism, or to determine the changes that occur in brain activity due to a brain lesion. In this study we give an overview of the methods that are used for the analysis of fMRI data.

  18. Nonparametric conditional predictive regions for time series

    NARCIS (Netherlands)

    de Gooijer, J.G.; Zerom Godefay, D.

    2000-01-01

    Several nonparametric predictors based on the Nadaraya-Watson kernel regression estimator have been proposed in the literature. They include the conditional mean, the conditional median, and the conditional mode. In this paper, we consider three types of predictive regions for these predictors — the
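
    The Nadaraya-Watson estimator named in this abstract is a kernel-weighted average of observed responses. The sketch below shows the conditional-mean predictor only, with a Gaussian kernel and a fixed bandwidth chosen by hand; both choices, the function name and the toy series are assumptions for illustration and do not reproduce the predictive regions studied in the paper.

```python
from math import exp


def nadaraya_watson(x0, xs, ys, bandwidth):
    """Kernel-weighted average of ys, weighted by closeness of xs to x0.

    Uses a Gaussian kernel. If x0 is so far from every xs value that all
    weights underflow to zero, the division below raises ZeroDivisionError.
    """
    weights = [exp(-0.5 * ((x0 - x) / bandwidth) ** 2) for x in xs]
    total = sum(weights)
    return sum(w * y for w, y in zip(weights, ys)) / total


# One-step-ahead prediction for a made-up series: regress y[t] on y[t-1].
series = [1.0, 1.2, 0.9, 1.1, 1.3, 1.0, 1.2]
lagged, following = series[:-1], series[1:]
prediction = nadaraya_watson(series[-1], lagged, following, bandwidth=0.2)
print(f"predicted next value: {prediction:.3f}")
```

    The conditional median and mode mentioned in the abstract would reuse the same kernel weights but replace the weighted average with a weighted quantile or a density maximization.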

  19. Nonparametric estimation in models for unobservable heterogeneity

    OpenAIRE

    Hohmann, Daniel

    2014-01-01

    Nonparametric models which allow for data with unobservable heterogeneity are studied. The first publication introduces new estimators and their asymptotic properties for conditional mixture models. The second publication considers estimation of a function from noisy observations of its Radon transform in a Gaussian white noise model.

  20. Nonparametric estimation of location and scale parameters

    KAUST Repository

    Potgieter, C.J.; Lombard, F.

    2012-01-01

    Two random variables X and Y belong to the same location-scale family if there are constants μ and σ such that Y and μ+σX have the same distribution. In this paper we consider non-parametric estimation of the parameters μ and σ under minimal
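
    To make the location-scale relation concrete: if Y and μ+σX have the same distribution with σ > 0, then every quantile of Y equals μ plus σ times the matching quantile of X. The sketch below uses that fact for a simple quantile-matching estimate (ratio of interquartile ranges for σ, medians for μ). This is an illustrative estimator only, not the one proposed by the authors, and the function name and data are made up.

```python
from statistics import median, quantiles


def location_scale_estimate(x, y):
    """Estimate (mu, sigma) such that Y is distributed like mu + sigma * X.

    Assumes sigma > 0 and that the interquartile range of x is non-zero.
    """
    q1_x, _, q3_x = quantiles(x, n=4)
    q1_y, _, q3_y = quantiles(y, n=4)
    sigma = (q3_y - q1_y) / (q3_x - q1_x)   # ratio of interquartile ranges
    mu = median(y) - sigma * median(x)      # align the medians
    return mu, sigma


# Made-up samples; y is an exact affine image of x, so the estimate is exact.
x = [float(i) for i in range(1, 10)]
y = [3.0 + 2.0 * xi for xi in x]
mu_hat, sigma_hat = location_scale_estimate(x, y)
print(f"mu = {mu_hat:.3f}, sigma = {sigma_hat:.3f}")
```

    With two independent samples the estimate is only approximate, and its accuracy depends on how well the sample quartiles are determined.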

  1. Panel data specifications in nonparametric kernel regression

    DEFF Research Database (Denmark)

    Czekaj, Tomasz Gerard; Henningsen, Arne

    parametric panel data estimators to analyse the production technology of Polish crop farms. The results of our nonparametric kernel regressions generally differ from the estimates of the parametric models but they only slightly depend on the choice of the kernel functions. Based on economic reasoning, we...

  2. Development of infill drilling recovery models for carbonates reservoirs using neural networks and multivariate statistical as a novel method

    International Nuclear Information System (INIS)

    Soto, R; Wu, Ch. H; Bubela, A M

    1999-01-01

    This work introduces a novel methodology to improve reservoir characterization models. In this methodology we integrated multivariate statistical analyses and neural network models for forecasting the infill drilling ultimate oil recovery from reservoirs in San Andres and Clearfork carbonate formations in west Texas. Development of the oil recovery forecast models helps us to understand the relative importance of dominant reservoir characteristics and operational variables, reproduce recoveries for units included in the database, forecast recoveries for possible new units in similar geological settings, and make operational (infill drilling) decisions. The variety of applications demands the creation of multiple recovery forecast models. We have developed intelligent software (Soto, 1998), oilfield intelligence (OI), as an engineering tool to improve the characterization of oil and gas reservoirs. OI integrates neural networks and multivariate statistical analysis. It is composed of five main subsystems: data input, preprocessing, architecture design, graphic design, and inference engine modules. One of the challenges in this research was to identify the dominant and the optimum number of independent variables. The variables include porosity, permeability, water saturation, depth, area, net thickness, gross thickness, formation volume factor, pressure, viscosity, API gravity, number of wells in initial water flooding, number of wells for primary recovery, number of infill wells over the initial water flooding, PRUR, IWUR, and IDUR. Multivariate principal component analysis is used to identify the dominant and the optimum number of independent variables. We compared the results from neural network models with the non-parametric approach. The advantage of the non-parametric regression is that it is easy to use. The disadvantage is that it retains a large variance of forecast results for a particular data set. We also used neural network concepts to develop recovery

  3. Nonparametric Inference for Periodic Sequences

    KAUST Repository

    Sun, Ying; Hart, Jeffrey D.; Genton, Marc G.

    2012-01-01

    the periodogram, a widely used tool for period estimation. The CV method is computationally simple and implicitly penalizes multiples of the smallest period, leading to a "virtually" consistent estimator of integer periods. This estimator is investigated both

  4. Zero- vs. one-dimensional, parametric vs. non-parametric, and confidence interval vs. hypothesis testing procedures in one-dimensional biomechanical trajectory analysis.

    Science.gov (United States)

    Pataky, Todd C; Vanrenterghem, Jos; Robinson, Mark A

    2015-05-01

    Biomechanical processes are often manifested as one-dimensional (1D) trajectories. It has been shown that 1D confidence intervals (CIs) are biased when based on 0D statistical procedures, and the non-parametric 1D bootstrap CI has emerged in the Biomechanics literature as a viable solution. The primary purpose of this paper was to clarify that, for 1D biomechanics datasets, the distinction between 0D and 1D methods is much more important than the distinction between parametric and non-parametric procedures. A secondary purpose was to demonstrate that a parametric equivalent to the 1D bootstrap exists in the form of a random field theory (RFT) correction for multiple comparisons. To emphasize these points we analyzed six datasets consisting of force and kinematic trajectories in one-sample, paired, two-sample and regression designs. Results showed, first, that the 1D bootstrap and other 1D non-parametric CIs were qualitatively identical to RFT CIs, and all were very different from 0D CIs. Second, 1D parametric and 1D non-parametric hypothesis testing results were qualitatively identical for all six datasets. Last, we highlight the limitations of 1D CIs by demonstrating that they are complex, design-dependent, and thus non-generalizable. These results suggest that (i) analyses of 1D data based on 0D models of randomness are generally biased unless one explicitly identifies 0D variables before the experiment, and (ii) parametric and non-parametric 1D hypothesis testing provide an unambiguous framework for analysis when one's hypothesis explicitly or implicitly pertains to whole 1D trajectories. Copyright © 2015 Elsevier Ltd. All rights reserved.

  5. A method for the statistical interpretation of friction ridge skin impression evidence: Method development and validation.

    Science.gov (United States)

    Swofford, H J; Koertner, A J; Zemp, F; Ausdemore, M; Liu, A; Salyards, M J

    2018-04-03

    The forensic fingerprint community has faced increasing amounts of criticism by scientific and legal commentators, challenging the validity and reliability of fingerprint evidence due to the lack of an empirically demonstrable basis to evaluate and report the strength of the evidence in a given case. This paper presents a method, developed as a stand-alone software application, FRStat, which provides a statistical assessment of the strength of fingerprint evidence. The performance was evaluated using a variety of mated and non-mated datasets. The results show strong performance characteristics, often with values supporting specificity rates greater than 99%. This method provides fingerprint experts the capability to demonstrate the validity and reliability of fingerprint evidence in a given case and report the findings in a more transparent and standardized fashion with clearly defined criteria for conclusions and known error rate information thereby responding to concerns raised by the scientific and legal communities. Published by Elsevier B.V.

  6. Statistics of electron multiplication in multiplier phototube: iterative method

    International Nuclear Information System (INIS)

    Grau Malonda, A.; Ortiz Sanchez, J.F.

    1985-01-01

    An iterative method is applied to study the variation of dynode response in the multiplier phototube. Three different situations are considered that correspond to the following ways of electronic incidence on the first dynode: incidence of exactly one electron, incidence of exactly r electrons and incidence of an average of r electrons. The responses are given for a number of steps between 1 and 5, and for values of the multiplication factor of 2.1, 2.5, 3 and 5. We also study the variance, the skewness and the excess kurtosis for different multiplication factors. (author)
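
    A stage-by-stage iteration of the kind described in this record can be sketched for the first two moments. The example treats the dynode chain as a branching process and assumes that each electron releases a Poisson-distributed number of secondaries with mean equal to the multiplication factor; the abstract does not state the emission law, so that assumption, the function name and the restriction to mean and variance are all choices made for this illustration.

```python
def cascade_moments(gain, stages, n_incident=1):
    """Mean and variance of the electron count after each dynode stage.

    Assumes exactly `n_incident` electrons reach the first dynode and that
    secondary emission is Poisson with mean `gain` (so its variance is also
    `gain`). Returns a list of (mean, variance) pairs, one per stage.
    """
    mean, var = float(n_incident), 0.0
    history = []
    for _ in range(stages):
        # Random-sum identities for Z_next = sum of Z independent emissions:
        #   E[Z_next]   = gain * E[Z]
        #   Var[Z_next] = E[Z] * gain + gain**2 * Var[Z]
        mean, var = gain * mean, mean * gain + gain**2 * var
        history.append((mean, var))
    return history


# Multiplication factors and stage counts quoted in the abstract.
for m in (2.1, 2.5, 3.0, 5.0):
    final_mean, final_var = cascade_moments(m, stages=5)[-1]
    print(f"gain {m}: mean {final_mean:.1f}, variance {final_var:.1f}")
```

    Skewness and excess kurtosis would need the third and fourth cumulants to be propagated in the same way, which this sketch leaves out.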

  7. Statistics of electron multiplication in a multiplier phototube; Iterative method

    International Nuclear Information System (INIS)

    Ortiz, J. F.; Grau, A.

    1985-01-01

    In the present paper an iterative method is applied to study the variation of dynode response in the multiplier phototube. Three different situations are considered that correspond to the following ways of electronic incidence on the first dynode: incidence of exactly one electron, incidence of exactly r electrons and incidence of an average of r electrons. The responses are given for a number of steps between 1 and 5, and for values of the multiplication factor of 2.1, 2.5, 3 and 5. We also study the variance, the skewness and the excess kurtosis for different multiplication factors. (Author) 11 refs

  8. Statistical inference methods for two crossing survival curves: a comparison of methods.

    Science.gov (United States)

    Li, Huimin; Han, Dong; Hou, Yawen; Chen, Huilin; Chen, Zheng

    2015-01-01

    A common problem that is encountered in medical applications is the overall homogeneity of survival distributions when two survival curves cross each other. A survey demonstrated that under this condition, which was an obvious violation of the assumption of proportional hazard rates, the log-rank test was still used in 70% of studies. Several statistical methods have been proposed to solve this problem. However, in many applications, it is difficult to specify the types of survival differences and choose an appropriate method prior to analysis. Thus, we conducted an extensive series of Monte Carlo simulations to investigate the power and type I error rate of these procedures under various patterns of crossing survival curves with different censoring rates and distribution parameters. Our objective was to evaluate the strengths and weaknesses of tests in different situations and for various censoring rates and to recommend an appropriate test that will not fail for a wide range of applications. Simulation studies demonstrated that adaptive Neyman's smooth tests and the two-stage procedure offer higher power and greater stability than other methods when the survival distributions cross at early, middle or late times. Even for proportional hazards, both methods maintain acceptable power compared with the log-rank test. In terms of the type I error rate, Renyi and Cramér-von Mises tests are relatively conservative, whereas the statistics of the Lin-Xu test exhibit apparent inflation as the censoring rate increases. Other tests produce results close to the nominal 0.05 level. In conclusion, adaptive Neyman's smooth tests and the two-stage procedure are found to be the most stable and feasible approaches for a variety of situations and censoring rates. Therefore, they are applicable to a wider spectrum of alternatives compared with other tests.

  9. Students' Attitudes toward Statistics across the Disciplines: A Mixed-Methods Approach

    Science.gov (United States)

    Griffith, James D.; Adams, Lea T.; Gu, Lucy L.; Hart, Christian L.; Nichols-Whitehead, Penney

    2012-01-01

    Students' attitudes toward statistics were investigated using a mixed-methods approach including a discovery-oriented qualitative methodology among 684 undergraduate students across business, criminal justice, and psychology majors where at least one course in statistics was required. Students were asked about their attitudes toward statistics and…

  10. Counting Better? An Examination of the Impact of Quantitative Method Teaching on Statistical Anxiety and Confidence

    Science.gov (United States)

    Chamberlain, John Martyn; Hillier, John; Signoretta, Paola

    2015-01-01

    This article reports the results of research concerned with students' statistical anxiety and confidence to both complete and learn to complete statistical tasks. Data were collected at the beginning and end of a quantitative method statistics module. Students recognised the value of numeracy skills but felt they were not necessarily relevant for…

  11. Multivariate statistical methods and data mining in particle physics (4/4)

    CERN Multimedia

    CERN. Geneva

    2008-01-01

    The lectures will cover multivariate statistical methods and their applications in High Energy Physics. The methods will be viewed in the framework of a statistical test, as used e.g. to discriminate between signal and background events. Topics will include an introduction to the relevant statistical formalism, linear test variables, neural networks, probability density estimation (PDE) methods, kernel-based PDE, decision trees and support vector machines. The methods will be evaluated with respect to criteria relevant to HEP analyses such as statistical power, ease of computation and sensitivity to systematic effects. Simple computer examples that can be extended to more complex analyses will be presented.

  12. Multivariate statistical methods and data mining in particle physics (2/4)

    CERN Multimedia

    CERN. Geneva

    2008-01-01

    The lectures will cover multivariate statistical methods and their applications in High Energy Physics. The methods will be viewed in the framework of a statistical test, as used e.g. to discriminate between signal and background events. Topics will include an introduction to the relevant statistical formalism, linear test variables, neural networks, probability density estimation (PDE) methods, kernel-based PDE, decision trees and support vector machines. The methods will be evaluated with respect to criteria relevant to HEP analyses such as statistical power, ease of computation and sensitivity to systematic effects. Simple computer examples that can be extended to more complex analyses will be presented.

  13. Multivariate statistical methods and data mining in particle physics (1/4)

    CERN Multimedia

    CERN. Geneva

    2008-01-01

    The lectures will cover multivariate statistical methods and their applications in High Energy Physics. The methods will be viewed in the framework of a statistical test, as used e.g. to discriminate between signal and background events. Topics will include an introduction to the relevant statistical formalism, linear test variables, neural networks, probability density estimation (PDE) methods, kernel-based PDE, decision trees and support vector machines. The methods will be evaluated with respect to criteria relevant to HEP analyses such as statistical power, ease of computation and sensitivity to systematic effects. Simple computer examples that can be extended to more complex analyses will be presented.

  14. Refining developmental coordination disorder subtyping with multivariate statistical methods

    Directory of Open Access Journals (Sweden)

    Lalanne Christophe

    2012-07-01

    Background: With a large number of potentially relevant clinical indicators, penalization and ensemble learning methods are thought to provide better predictive performance than usual linear predictors. However, little is known about how they perform in clinical studies where few cases are available. We used Random Forests and Partial Least Squares Discriminant Analysis to select the most salient impairments in Developmental Coordination Disorder (DCD) and assess patients' similarity. Methods: We considered a wide-ranging testing battery for various neuropsychological and visuo-motor impairments which aimed at characterizing subtypes of DCD in a sample of 63 children. Classifiers were optimized on a training sample, and they were used subsequently to rank the 49 items according to a permuted measure of variable importance. In addition, subtyping consistency was assessed with cluster analysis on the training sample. Clustering fitness and predictive accuracy were evaluated on the validation sample. Results: Both classifiers yielded a relevant subset of item impairments that altogether accounted for a sharp discrimination between three DCD subtypes: ideomotor, visual-spatial and constructional, and mixed dyspraxia. The main impairments that were found to characterize the three subtypes were: digital perception, imitations of gestures, digital praxia, lego blocks, visual spatial structuration, visual motor integration, coordination between upper and lower limbs. Classification accuracy was above 90% for all classifiers, and clustering fitness was found to be satisfactory. Conclusions: Random Forests and Partial Least Squares Discriminant Analysis are useful tools to extract salient features from a large pool of correlated binary predictors, but also provide a way to assess individuals' proximities in a reduced factor space. Less than 15 neuro-visual, neuro-psychomotor and neuro-psychological tests might be required to provide a sensitive and

  15. Energy demand forecasting method based on international statistical data

    International Nuclear Information System (INIS)

    Glanc, Z.; Kerner, A.

    1997-01-01

    Poland is in a transition phase from a centrally planned to a market economy; data collected under former economic conditions do not reflect a market economy. Final energy demand forecasts are based on the assumption that the economic transformation in Poland will gradually lead the Polish economy, technologies and modes of energy use, to the same conditions as mature market economy countries. The starting point has a significant influence on the future energy demand and supply structure: final energy consumption per capita in 1992 was almost half the average of OECD countries; energy intensity, based on Purchasing Power Parities (PPP) and referred to GDP, is more than 3 times higher in Poland. A method of final energy demand forecasting based on regression analysis is described in this paper. The input data are: output of macroeconomic and population growth forecast; time series 1970-1992 of OECD countries concerning both macroeconomic characteristics and energy consumption; and energy balance of Poland for the base year of the forecast horizon. (author). 1 ref., 19 figs, 4 tabs

  16. Energy demand forecasting method based on international statistical data

    Energy Technology Data Exchange (ETDEWEB)

    Glanc, Z; Kerner, A [Energy Information Centre, Warsaw (Poland)

    1997-09-01

    Poland is in a transition phase from a centrally planned to a market economy; data collected under former economic conditions do not reflect a market economy. Final energy demand forecasts are based on the assumption that the economic transformation in Poland will gradually lead the Polish economy, technologies and modes of energy use, to the same conditions as mature market economy countries. The starting point has a significant influence on the future energy demand and supply structure: final energy consumption per capita in 1992 was almost half the average of OECD countries; energy intensity, based on Purchasing Power Parities (PPP) and referred to GDP, is more than 3 times higher in Poland. A method of final energy demand forecasting based on regression analysis is described in this paper. The input data are: output of macroeconomic and population growth forecast; time series 1970-1992 of OECD countries concerning both macroeconomic characteristics and energy consumption; and energy balance of Poland for the base year of the forecast horizon. (author). 1 ref., 19 figs, 4 tabs.

  17. Statistical methods for the forensic analysis of striated tool marks

    Energy Technology Data Exchange (ETDEWEB)

    Hoeksema, Amy Beth [Iowa State Univ., Ames, IA (United States)

    2013-01-01

    In forensics, fingerprints can be used to uniquely identify suspects in a crime. Similarly, a tool mark left at a crime scene can be used to identify the tool that was used. However, the current practice of identifying matching tool marks involves visual inspection of marks by forensic experts which can be a very subjective process. As a result, declared matches are often successfully challenged in court, so law enforcement agencies are particularly interested in encouraging research in more objective approaches. Our analysis is based on comparisons of profilometry data, essentially depth contours of a tool mark surface taken along a linear path. In current practice, for stronger support of a match or non-match, multiple marks are made in the lab under the same conditions by the suspect tool. We propose the use of a likelihood ratio test to analyze the difference between a sample of comparisons of lab tool marks to a field tool mark, against a sample of comparisons of two lab tool marks. Chumbley et al. (2010) point out that the angle of incidence between the tool and the marked surface can have a substantial impact on the tool mark and on the effectiveness of both manual and algorithmic matching procedures. To better address this problem, we describe how the analysis can be enhanced to model the effect of tool angle and allow for angle estimation for a tool mark left at a crime scene. With sufficient development, such methods may lead to more defensible forensic analyses.

  18. Evaluation of Nonparametric Probabilistic Forecasts of Wind Power

    DEFF Research Database (Denmark)

    Pinson, Pierre; Møller, Jan Kloppenborg; Nielsen, Henrik Aalborg

    Predictions of wind power production for horizons up to 48-72 hours ahead are a highly valuable input to the methods for the daily management or trading of wind generation. Today, users of wind power predictions are provided not only with point predictions, which are estimates of the most likely outcome for each look-ahead time, but also with uncertainty estimates given by probabilistic forecasts. In order to avoid assumptions on the shape of predictive distributions, these probabilistic predictions are produced from nonparametric methods, and then take the form of a single or a set...
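
    The abstract above is truncated, so the evaluation framework it proposes is not visible here. As a rough illustration only, the sketch below assumes the nonparametric forecasts are issued as quantiles and scores them in two common ways: empirical coverage (how often the observation falls at or below the forecast quantile) and the pinball loss. The function names and the sample numbers are hypothetical and are not taken from the record.

```python
def pinball_loss(y, q, tau):
    """Pinball (quantile) loss of forecast q for observation y at nominal level tau."""
    diff = y - q
    # Under-forecasts are weighted by tau, over-forecasts by (1 - tau).
    return tau * diff if diff >= 0 else (tau - 1) * diff


def mean_pinball_loss(observations, quantile_forecasts, tau):
    """Average pinball loss over paired observations and quantile forecasts."""
    losses = [pinball_loss(y, q, tau) for y, q in zip(observations, quantile_forecasts)]
    return sum(losses) / len(losses)


def empirical_coverage(observations, quantile_forecasts):
    """Fraction of observations at or below their forecast quantile."""
    hits = sum(1 for y, q in zip(observations, quantile_forecasts) if y <= q)
    return hits / len(observations)


# Hypothetical normalized wind power observations and 90% quantile forecasts.
observed = [0.20, 0.35, 0.10, 0.50, 0.42]
q90 = [0.30, 0.40, 0.25, 0.45, 0.60]

print("empirical coverage:", empirical_coverage(observed, q90))
print("mean pinball loss:", mean_pinball_loss(observed, q90, tau=0.9))
```

    For a well-calibrated 90% quantile, the empirical coverage should approach 0.9 over a long evaluation period; a sample of five points, as here, is far too short to judge calibration.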

  19. Statistical methods used in the public health literature and implications for training of public health professionals.

    Science.gov (United States)

    Hayat, Matthew J; Powell, Amanda; Johnson, Tessa; Cadwell, Betsy L

    2017-01-01

    Statistical literacy and knowledge are needed to read and understand the public health literature. The purpose of this study was to quantify basic and advanced statistical methods used in public health research. We randomly sampled 216 published articles from seven top tier general public health journals. Studies were reviewed by two readers and a standardized data collection form was completed for each article. Data were analyzed with descriptive statistics and frequency distributions. Results were summarized for statistical methods used in the literature, including descriptive and inferential statistics, modeling, advanced statistical techniques, and statistical software used. Approximately 81.9% of articles reported an observational study design and 93.1% of articles were substantively focused. Descriptive statistics in table or graphical form were reported in more than 95% of the articles, and statistical inference was reported in more than 76% of the studies reviewed. These results reveal the types of statistical methods currently used in the public health literature. Although this study did not obtain information on what should be taught, information on statistical methods being used is useful for curriculum development in graduate health sciences education, as well as for making informed decisions about continuing education for public health professionals.

  20. Rank-based permutation approaches for non-parametric factorial designs.

    Science.gov (United States)

    Umlauft, Maria; Konietschke, Frank; Pauly, Markus

    2017-11-01

    Inference methods for null hypotheses formulated in terms of distribution functions in general non-parametric factorial designs are studied. The methods can be applied to continuous, ordinal or even ordered categorical data in a unified way, and are based only on ranks. In this set-up Wald-type statistics and ANOVA-type statistics are the current state of the art. The first method is asymptotically exact but a rather liberal statistical testing procedure for small to moderate sample sizes, while the latter is only an approximation which does not possess the correct asymptotic α level under the null. To bridge these gaps, a novel permutation approach is proposed which can be seen as a flexible generalization of the Kruskal-Wallis test to all kinds of factorial designs with independent observations. It is proven that the permutation principle is asymptotically correct while keeping its finite exactness property when data are exchangeable. The results of extensive simulation studies support these theoretical findings. A real data set exemplifies its applicability. © 2017 The British Psychological Society.
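
    To make the permutation idea concrete, here is a minimal sketch of the simplest special case only: a one-way layout in which the classical Kruskal-Wallis H statistic is computed on pooled midranks and its null distribution is approximated by reshuffling group labels. This is not the authors' studentized statistic for general factorial designs; the function names, the number of permutations and the sample data are assumptions made for illustration.

```python
import random


def midranks(values):
    """Return 1-based ranks of values, averaging ranks over ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_wallis_h(ranks, labels):
    """Kruskal-Wallis H (no tie correction) from pooled ranks and group labels."""
    n = len(ranks)
    sums, counts = {}, {}
    for r, g in zip(ranks, labels):
        sums[g] = sums.get(g, 0.0) + r
        counts[g] = counts.get(g, 0) + 1
    between = sum(sums[g] ** 2 / counts[g] for g in sums)
    return 12.0 / (n * (n + 1)) * between - 3.0 * (n + 1)


def permutation_kw_test(groups, n_perm=2000, seed=0):
    """Monte Carlo permutation p-value for H under exchangeability."""
    values = [x for g in groups for x in g]
    labels = [i for i, g in enumerate(groups) for _ in g]
    ranks = midranks(values)
    observed = kruskal_wallis_h(ranks, labels)
    rng = random.Random(seed)
    shuffled = labels[:]
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # reassign group labels at random
        if kruskal_wallis_h(ranks, shuffled) >= observed - 1e-12:
            exceed += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (exceed + 1) / (n_perm + 1)


# Hypothetical data: three fully separated groups of three observations.
h, p = permutation_kw_test([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print("H =", h, "permutation p-value =", p)
```

    Because only the labels are permuted, the ranks are computed once; for these example data H is about 7.2. The p-value is a Monte Carlo estimate and depends on `n_perm` and `seed`.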